Computational Time-Lapse
Whenever you've seen a time-lapse video, it probably looked fairly 'jumpy' from one frame to the next, with objects and people 'popping' in and out of the field of view. This is because the goal of time-lapse is to take a long video and compress it down to a more digestible form (perfect for our limited attention spans!). To do that, most time-lapse videos sample the original footage at a fixed interval, keeping only one of every 10-15 frames. As shown below, the streets of NYC are filmed over several hours and then compressed into a 2:42 video, which gives the viewer a nice overall summary.
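For reference, this kind of uniform sampling is trivial to do in code. Here's a minimal sketch, assuming OpenCV is available and using a stride of 10; both are my illustrative choices, not anything from the paper:

```python
import cv2

def uniform_sample(video_path, stride=10):
    """Keep one frame out of every `stride`, discarding the rest."""
    capture = cv2.VideoCapture(video_path)
    kept = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % stride == 0:
            kept.append(frame)
        index += 1
    capture.release()
    return kept
```

The stride is blind: it keeps frame 10 and drops frames 1-9 whether anything interesting happened in them or not, which is exactly where the 'popping' comes from.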
However, what if there was a way to compress the video to the same length while preserving as much motion as possible? Put another way, can we remove the choppiness seen in the NYC time-lapse above? Why would you want to do that? Maybe you're creating a time-lapse of someone cooking in the kitchen. The person is in and out of the kitchen, grabbing supplies and ingredients as needed. It would be great if an algorithm could cut out all of the video without motion, i.e. when no one was in the kitchen cooking, and focus on preserving all of the frames where someone was actively moving in the video.
This was the question tackled by Eric Bennett and Leonard McMillan in their paper Computational Time-Lapse Video. The authors explored two approaches to time-lapse video. The first is a non-uniform sampling method that maximizes a user-defined visual objective; in my case, the objective was to preserve as much of the motion between frames as possible. The second, which I did not focus on implementing, is a virtual shutter that extends the effective exposure time of time-lapse frames. This is how you create photos similar to the cover photo for this blog post.
In order to create a motion-preserving video, you need to define an objective function to optimize. To that end, the authors define a min-error metric that computes the cost of jumping from frame $i$ to frame $j$:

$$E(i, j) = \sum_{t=i}^{j} \sum_{x,y} \left| I_t^{(x,y)} - \left( A_{ij}^{(x,y)} + B_{ij}^{(x,y)} \, t \right) \right|$$

where $I_t^{(x,y)}$ is the value of pixel $(x, y)$ in frame $t$, $A_{ij}^{(x,y)}$ is the y-intercept for each pixel, and $B_{ij}^{(x,y)}$ is the slope for each pixel of a line fit across the skipped span. The gist is that we are summing up the amount of 'motion' that we miss between each pair of frames that we jump to. For example, if we jump from frame 1 to frame 5, our error is the sum of the metric above over frames 1 to 5.
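To make the metric concrete, here is a sketch of how the cost for a single $(i, j)$ pair could be computed. It assumes the frames are already loaded as a `(T, H, W)` NumPy array of grayscale values and that residuals are measured with an absolute difference; this is my reading of the metric, not code from the paper:

```python
import numpy as np

def min_error(frames, i, j):
    """Cost of jumping from frame i to frame j (requires j > i).

    Fits a per-pixel line A + B*t across the skipped span via least
    squares, then sums the absolute residuals: the 'motion' that a
    straight cut from i to j fails to represent.
    """
    ts = np.arange(i, j + 1, dtype=float)
    span = frames[i : j + 1].astype(float)        # shape (n, H, W)
    t_mean = ts.mean()
    span_mean = span.mean(axis=0)                 # per-pixel mean, (H, W)
    t_dev = (ts - t_mean)[:, None, None]          # (n, 1, 1)
    # Closed-form least-squares slope and intercept, per pixel.
    B = (t_dev * (span - span_mean)).sum(axis=0) / (t_dev ** 2).sum(axis=0)
    A = span_mean - B * t_mean
    predicted = A[None] + B[None] * ts[:, None, None]
    return float(np.abs(span - predicted).sum())
```

A span where nothing moves fits its per-pixel lines almost perfectly, so it is cheap to skip; a span with lots of motion fits poorly and becomes expensive to jump over.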
In order to solve the problem of which frames to choose, the authors use a dynamic-programming technique, outlined below. This approach, the function D(s, M), produces the optimal sampling set v: the min-error reconstruction of your set of input frames s. To implement it, I created a matrix D with rows equal to the length of s (the number of input frames I had) and columns equal to M (the number of frames I wanted to sample for the time-lapse). Each entry of D holds the minimum cost of a sampling that ends at that frame using that many samples, so filling in the table column by column and then backtracking from the lowest-cost entry in the final column gave me the series of frames that minimized the given error metric (min-error in this case, as described above).
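Here is a sketch of that table-filling and backtracking step, assuming the pairwise costs have been precomputed with the metric above; the variable and function names are mine, not the paper's:

```python
import numpy as np

def select_frames(costs, M):
    """Choose M frame indices that minimize the total min-error cost.

    costs is a (T, T) matrix where costs[i, j] is the error of jumping
    from input frame i to input frame j. D[t, m] holds the minimum cost
    of a sampling that uses m + 1 output frames and ends at frame t.
    """
    T = costs.shape[0]
    D = np.full((T, M), np.inf)
    parent = np.zeros((T, M), dtype=int)
    D[0, 0] = 0.0                          # always keep the first frame
    for m in range(1, M):
        for t in range(m, T):
            # Consider every possible previously sampled frame p.
            for p in range(m - 1, t):
                candidate = D[p, m - 1] + costs[p, t]
                if candidate < D[t, m]:
                    D[t, m] = candidate
                    parent[t, m] = p
    # Backtrack from the cheapest placement of the final sample.
    t = int(np.argmin(D[:, M - 1]))
    path = [t]
    for m in range(M - 1, 0, -1):
        t = int(parent[t, m])
        path.append(t)
    return path[::-1]
```

The triple loop makes this O(T²M); precomputing the full cost matrix is the expensive part in practice, since each entry requires a per-pixel line fit over the skipped span.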
You can see the results of my implementation in the video below.
All of the code for this assignment lives on my GitHub.
Thanks for reading!