Use Amped FIVE to Sync Videos Leveraging Their Audio Tracks!


Dear friends, welcome to this week’s tip! Since the latest Amped FIVE release, users can deal with audio much better than before. Today, we’re demonstrating that in some cases you can effectively use the audio waveform as a guide to sync two video tracks in time. Keep reading to find out more!

Case Study

Let’s start with a case study: we have two video tracks, captured by two different devices, that depict two different scenes (one outdoor and one indoor) at roughly the same moments in time.

Screenshot of Amped FIVE software displaying synchronized video footage from two sources labeled “Outdoor” and “Indoor.” The top viewer shows an exterior view of residential buildings with cars parked and overcast weather, while the bottom viewer displays an interior scene with a computer monitor, desk lamp, and office supplies. The interface includes tools for zoom, filters, timeline navigation, and video engine settings. This forensic video analysis environment is used for comparative evaluation of surveillance footage from multiple camera angles.

You can download them by clicking on the buttons below (10 MB total) or watch them.

Unfortunately, they are not synched, and we don’t know the delay between the two. Our goal is to create a collage showing the two video streams, synched in time. We can assume that both videos have the same, constant frame rate.

Since there is no visual element that is common to the two recordings (e.g., a flashlight, or something/someone passing by in both scenes), there’s no chance we can sync the videos based on what we see. But, don’t forget, videos often come with audio tracks! And, luckily, sound propagates much better than light in some situations! That is to say, there could be some distinguishable sound that reached both cameras’ microphones, allowing us to sync the tracks!

Let’s make sure we’ve loaded both videos with the “FFMS With Audio” engine:

Screenshot of the Video Loader filter settings panel in Amped FIVE software, showing video file path, video engine setting, and color range options. A red arrow points to the "Video Engine" dropdown menu set to "FFMS with Audio." Additional options include converting the DVR and applying settings. This interface assists forensic analysts in configuring video playback and compatibility settings for CCTV or surveillance footage analysis.

Then, as explained in a recent tip, we can hover the mouse over the Player panel, hold SHIFT and scroll the mouse wheel to zoom the audio track vertically. We do this for both videos, and we notice that there’s a spike in both audio tracks!

Screenshot comparison of two synchronized audio waveforms in Amped FIVE software, showing an outdoor and an indoor surveillance video timeline. The top waveform represents the outdoor video with audio peaks around frame 600, while the bottom waveform represents the indoor video with a delayed corresponding peak around frame 500. A red dashed arrow indicates the correlation between the sound events in both videos, aiding forensic video analysts in multi-source synchronization.
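In this tip the spikes are spotted by eye on the waveform. As a rough illustration of the same idea outside the software, the sketch below finds where the loudest event in a mono track begins and converts that position to a video frame number. It is not how Amped FIVE works internally: the function name `spike_onset_frame`, the `threshold_ratio` parameter, the sample rate, the frame rate and the synthetic samples are all assumptions made for the example.

```python
def spike_onset_frame(samples, sample_rate, fps, threshold_ratio=0.5):
    """Return the video frame in which the loudest audio event starts.

    samples         -- mono audio amplitudes (hypothetical input)
    sample_rate     -- audio samples per second
    fps             -- constant video frame rate
    threshold_ratio -- fraction of the peak amplitude that counts as
                       "the spike has started" (an arbitrary choice)
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("silent track: no spike to locate")
    threshold = threshold_ratio * peak
    # Index of the first sample that reaches the threshold.
    onset = next(i for i, s in enumerate(samples) if abs(s) >= threshold)
    # Convert the sample index to a frame index.
    return int(onset * fps / sample_rate)


# Synthetic track: 2 s of faint noise, a short "knock", then 1 s of faint noise.
track = [0.01] * 2000 + [0.9, -0.8, 0.5] + [0.01] * 1000
frame = spike_onset_frame(track, sample_rate=1000, fps=25)
print(frame)
```

A simple threshold like this only makes sense when one event clearly dominates the track, as the knock does here. Listening to the audio remains the way to confirm that the two spikes are the same event.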

It’s definitely time to listen to both audio tracks to get confirmation that the spikes could indeed be related to the same moment in time. Audio inspection reveals that… yes! There’s a strong “knock” that we hear in both videos. After that sound, the Indoor scene slightly moves, suggesting that the camera was somewhat influenced by the event.

Assuming we don’t have to worry about strange/variable frame rates, it makes sense to use this shared sound as a landmark to guide the synching process. Let’s head to the Link filter category and select the Multiview filter (if you’ve never used it, take a look at this past tip).

Since we only have two videos to combine, we configure the Multiview filter to arrange them on 2 rows and 1 column, setting these values in the Inputs panel:

Amped FIVE software interface showing a multiview video layout with synchronized outdoor and indoor surveillance footage. The top frame displays an exterior view of a residential building, while the bottom frame shows an interior room with a desk, monitor, and curtain. The Multiview filter settings panel is visible on the right, configured with two video inputs for forensic comparison. This tool is used for timeline analysis, correlation, and verification of events from multiple camera perspectives.

Sync Videos

Let’s now sync the videos. When you combine videos that have an audio track, the Multiview filter lets you choose which one you want to use as the audio source. Let’s click on the Output panel and select the Outdoor scene as the audio source.

Amped FIVE Multiview filter output settings panel showing configuration options for combining video streams. The dropdown menu under "Audio Source" is expanded, offering choices between "No Audio", "Outdoor - Video Loader - 8" and "Indoor - Video Loader - 9". Additional options include output size, mode, and interpolation type for video export.

Now we go back to the Inputs panel, and we double click on the Outdoor input track. This will allow us to seek that specific track while leaving the other still. You’ll see the selected track will show a “play” symbol close to its name:

Amped FIVE Multiview filter input settings panel with "Outdoor - Video Loader - 8" highlighted for selection. An arrow points to the selected video source, indicating configuration for video stream alignment or synchronization. The panel includes options for setting input columns, adding or removing video sources, and reordering them.

And now, we drag the player cursor to the very beginning of the audio spike. Remember you can zoom horizontally (CTRL + mouse wheel) and vertically (SHIFT + mouse wheel) in the audio panel to make it clearly visible!

Screenshot of Amped FIVE software displaying synchronized indoor and outdoor surveillance video footage in a Multiview layout. The interface shows the timeline with a red arrow pointing to a selected audio waveform peak at frame 581, indicating a key event used for alignment. The right panel includes Multiview filter settings with inputs from two video loaders and an option to adjust input delay. This setup demonstrates forensic audio-video synchronization for multi-camera investigations.

Ok, this track is done. Let’s now work on the second one. We go to the Output panel and select the Indoor scene as the audio source. Then, double click on it in the Inputs panel to seek it. Place the player cursor at the beginning of the waveform spike.

Screenshot of Amped FIVE software showing synchronized playback of outdoor and indoor surveillance video feeds in a Multiview layout. A red arrow highlights a peak in the indoor video's audio waveform at frame 521, indicating a sound event used for synchronization. The right-side panel displays input settings with a 60-frame delay applied to the indoor video. This scene demonstrates precise forensic audio-video alignment for investigative analysis.

And we’re done! If you look at the Multiview filter settings panel, you’ll see that the Input Delay has been set to 0/60. It means that, based on our synching operations, the Indoor track has a delay of 60 frames compared to the Outdoor track.
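The 0/60 value is consistent with the cursor positions shown in the two screenshots: the spike starts at frame 581 in the Outdoor track and at frame 521 in the Indoor track. A minimal sketch of that arithmetic follows. The helper name `input_delays` is hypothetical, and this is only an illustration of the calculation, not Amped FIVE’s implementation.

```python
def input_delays(spike_frames):
    """Compute per-track delays (in frames) that line up a shared event.

    spike_frames maps a track name to the frame where the shared sound
    starts in that track. The track where the event occurs latest gets a
    delay of 0; every other track is pushed back by the difference.
    """
    latest = max(spike_frames.values())
    return {name: latest - frame for name, frame in spike_frames.items()}


# Frame numbers read from the player cursor in the two screenshots.
delays = input_delays({"Outdoor": 581, "Indoor": 521})
print(delays)  # {'Outdoor': 0, 'Indoor': 60}
```

This only holds under the assumption stated earlier, namely that both videos share the same constant frame rate. Otherwise a difference in frame numbers does not translate into a single time offset.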

Input delay in Multiview filter settings set to 0/60

Outdoor Scene

We can now select the Outdoor scene as the audio source. Click on the Seek all button in the Multiview filter’s Inputs panel, and we’ll see both videos playing.

Notice that, just when we hear the “knock” in the Outdoor video track, the camera moves in the Indoor video. That means we synched the videos the right way! Of course, remember this was just a tip. Video timing is a delicate topic that requires careful evaluation. Metadata in the container remains the most valuable source of information (when you have it!).

Conclusion

Updated on April 1, 2020

One of our attentive users noticed that, for synchronization purposes, we should also take into consideration the distance between the two recording devices, since sound travels much slower than light. In our specific example, the two videos were taken a few meters apart, so this shouldn’t matter much. However, if you use this technique in an actual case, you must be aware of this and estimate the possible error. Thanks for reading our posts with a critical attitude!
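To get a feeling for the size of this error, here is a back-of-the-envelope sketch. It assumes a speed of sound of about 343 m/s (dry air at roughly 20 °C). The 5 m distance and the 25 fps frame rate are hypothetical values chosen for illustration, since the post only says “a few meters” and does not state the frame rate. What matters is the difference between the distances from the sound source to each microphone, which can never exceed the distance between the two devices.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 °C


def sound_delay_frames(distance_difference_m, fps):
    """Return the sync error, in frames, caused by sound travel time.

    distance_difference_m -- how much farther the sound travels to reach
                             one microphone than the other (meters)
    fps                   -- constant video frame rate
    """
    seconds = distance_difference_m / SPEED_OF_SOUND_M_PER_S
    return seconds * fps


# Hypothetical worst case: devices 5 m apart, videos at 25 fps.
error = sound_delay_frames(5.0, 25.0)
print(f"{error:.2f} frames")  # about a third of a frame
```

With these assumed values the error stays well below one frame. At larger distances it grows linearly: roughly one frame at 25 fps for every 13.7 m of extra path.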
