Dear friends, welcome to this week’s tip! It’s a rather special one: longer, but also very exciting. We’ll see how the latest Amped Authenticate update enables you to carry out video source identification when only images are available as reference, and vice-versa! We’ll provide an introduction to the problem and then go practical with a step-by-step example available for everyone. Keep reading!
Amped Authenticate Update 16636
Less than a week ago, Amped Authenticate Update 16636 was released. The most exciting news concerned the PRNU Identification algorithm. A new, advanced search is now available that will detect matching images even in the presence of combinations of scaling and cropping (as happens, e.g., when you use digital zoom). This improvement is also the key to unlocking another exciting application: hybrid source identification, that is, investigating the originating device of a digital video using images as reference (and vice-versa).
It is well known that smartphones are quickly eating up the market of image and video production. One key feature of these devices is their ability to produce both images and videos (plus a variety of ‘fancy’ shooting modes).
Duality
Before we continue, it is worth taking a look at how this “duality” is generally handled by devices. But first, let’s get some data to test what we’ll be talking about.
Luckily, the publicly available VISION dataset provides all we need. We’ll deal with a case where we need to attribute one image and one video. We’ve prepared a ZIP file with a selection of some of the files available in VISION. You can download it here.
This is what we have inside:
HybridTest\
    questioned\
        questioned image.jpg
        questioned video.mp4
    reference\
        D01\
            images\ (50 JPEG images inside)
            video\D01_V_flat_move_0002.mp4
        D17\
            images\ (50 JPEG images inside)
            video\D17_V_flat_move_0002.mp4
    stabilized_video\
        D06_V_outdoor_move_0001.mov
We are given two main tasks:
- Test whether questioned image.jpg was captured with D01 or D17, using the reference videos to carry out the test.
- Test whether questioned video.mp4 was captured with D01 or D17, using the reference images to carry out the test.
As useful contextual information, we know that the reference device D01 is a Samsung Galaxy S3 Mini, while D17 is a Microsoft Lumia 640 LTE.
The relationship between full-frame images and video frames
When we take a picture, the device normally uses the full sensor to capture it (some devices may actually leave a small border unused). Smartphone sensors typically have a 4:3 aspect ratio, which is reflected in the resolution of the produced image (e.g., modern Apple smartphones produce 4032 x 3024 images). When we capture a video, there are two main differences:
- The aspect ratio is 16:9 (well, at least the default one)
- The resolution is lower: 3840 x 2160 when the highest “4k” output resolution is selected, 1920 x 1080 when the more portable Full-HD output resolution is selected, or even smaller (1280 x 720 and so on).
How do smartphones compensate for these two differences? As shown in the recent scientific literature, this is the usual strategy: first, only a subpart of the sensor is selected, so that the aspect ratio becomes 16:9. Then, these acquired pixels are resized to match the desired output resolution.
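To make the crop-then-resize strategy concrete, here is a minimal Python sketch of the arithmetic. It assumes the crop keeps the full sensor width (or height) and nothing else is trimmed, which is a simplification: real devices may use a smaller region. The function name is ours, made up for illustration.

```python
from fractions import Fraction


def video_crop_and_scale(sensor_w, sensor_h, video_w, video_h):
    """Return the sensor crop (w, h) and the downscale factor for a video mode.

    Assumption (ours): the crop is the largest region with the video's aspect
    ratio that fits in the sensor. Real devices may crop more than that.
    """
    target_ratio = Fraction(video_w, video_h)  # e.g. 16/9
    if Fraction(sensor_w, sensor_h) > target_ratio:
        # Sensor is wider than the video format: keep full height, trim width.
        crop_h = sensor_h
        crop_w = int(crop_h * target_ratio)
    else:
        # Sensor is taller than the video format: keep full width, trim height.
        crop_w = sensor_w
        crop_h = int(crop_w / target_ratio)
    # Factor by which the cropped area is shrunk to reach the video size.
    scale = crop_w / video_w
    return (crop_w, crop_h), scale


# 4:3 sensor of 4032 x 3024 pixels, Full-HD video output.
crop, scale = video_crop_and_scale(4032, 3024, 1920, 1080)
print(crop, scale)
```

Under these assumptions, the 4032 x 3024 sensor is cropped to 4032 x 2268 and then shrunk by a factor of 2.1 to reach 1920 x 1080. That factor is exactly what a PRNU search has to undo when comparing video frames against full-frame images.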
What we have discussed so far, and what is shown in the picture above, generally holds for most devices. However, when digital stabilization is enabled, there’s an additional issue: the part of the sensor that is used for video acquisition may vary from frame to frame (e.g., based on information from the device accelerometer) so as to facilitate stabilization. Moreover, the frame may also be rotated or warped, not just resized, as part of the stabilization process. This has a serious impact on PRNU analysis because it means that different portions of the sensor are used when capturing different frames. As a consequence, we cannot simply “put together” the PRNU extracted from every single frame: they are misaligned!
How to understand if your video is digitally stabilized
With all of this in mind, let’s go back to the original problem. The first thing we need to understand is whether questioned video.mp4 has been captured using digital stabilization or not. According to a recent paper, an effective test is to create two PRNU reference patterns, using two separate video chunks, and check if they match or not. If they match, it’s likely that the video has not undergone digital stabilization, while if they don’t match, chances are that stabilization took place.
Extracting frames
Extracting frames from a video is very easy if you have Amped FIVE. Drag the video in and use the Range Selector to select the range of frames to be extracted. Then, use the Sequence Writer to write the frames to file. Just remember to set an output format that does not apply lossy compression, such as BMP. If you don’t have Amped FIVE (which is so bad!), you can use FFmpeg by calling this command: ffmpeg.exe -i <video_filename> $frame%05d.bmp. This should extract all frames to the folder from which you called FFmpeg. Then, you can move the frames you want into a separate folder. More refined commands exist that allow extracting only a range of frames.
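As one possible example of a more refined command, the Python sketch below builds an FFmpeg call that stops after a given number of frames (the `-frames:v` output option) and optionally seeks to a start time first (the `-ss` input option). The helper names, the default output pattern, and the idea of wrapping the call in Python are our own assumptions. Note also that `-ss` works on timestamps, so it selects a starting point in seconds rather than an exact frame index.

```python
import subprocess


def build_extract_command(video, out_pattern="frame%05d.bmp",
                          n_frames=500, start_seconds=None):
    """Build an FFmpeg argument list that writes n_frames lossless BMP frames."""
    cmd = ["ffmpeg"]
    if start_seconds is not None:
        # Placed before -i, -ss seeks in the input before extraction starts.
        cmd += ["-ss", str(start_seconds)]
    # -frames:v stops the output after the requested number of video frames.
    cmd += ["-i", video, "-frames:v", str(n_frames), out_pattern]
    return cmd


def run_extraction(cmd):
    """Run the command; requires FFmpeg to be installed and on the PATH."""
    subprocess.run(cmd, check=True)


cmd = build_extract_command("questioned video.mp4")
print(cmd)
```

Passing the arguments as a list avoids quoting problems with the space in the file name. Calling `run_extraction(cmd)` would write the first 500 frames to the current folder.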
Let’s practice: we extract the first 500 frames of questioned video.mp4 and compute a CRP (Camera Reference Pattern) from them. After the frames have been extracted according to the instructions above, run Amped Authenticate, go to the PRNU Identification filter, and click on the Create PRNU Reference Pattern button. We set the folder containing the frames as the “Reference images” folder. We may set the “PRNU CRP Filename” to be: first500_frames.crp.
Then, we have to do the same using another 500 frames (e.g., the last 500 of the video; the only important thing is that the two chunks do not overlap). We’ll thus create another CRP called last500_frames.crp.
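If you extracted all frames and want to be sure the two chunks do not overlap, a tiny Python helper like the one below can pick the frame indices for you. This is only a sketch: the function name is ours, and the total of 1800 frames is a made-up figure, not the actual length of the questioned video.

```python
def first_and_last_chunks(total_frames, chunk_size=500):
    """Return two non-overlapping index ranges: the first and last chunk."""
    if total_frames < 2 * chunk_size:
        raise ValueError("video too short for two non-overlapping chunks")
    first = range(0, chunk_size)
    last = range(total_frames - chunk_size, total_frames)
    return first, last


# Hypothetical video with 1800 frames (zero-based frame indices).
first, last = first_and_last_chunks(1800)
print(first, last)
```

With 1800 frames, the first chunk covers indices 0 to 499 and the last one covers 1300 to 1799.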
Comparing CRPs
Finally, we compare the two CRPs. We can use the Compare two CRPs function provided by Amped Authenticate’s PRNU Identification filter.
Then, we click on the Ok button and this is the result:
We get a (largely) positive match. So, we can reasonably conclude that this video is not affected by digital stabilization, either because such a feature is not supported or enabled on the device, or because the device was perfectly still so stabilization did nothing.
Now, for the sake of curiosity, we may try to redo all of this for the video in the stabilized_video folder, which comes from an Apple iPhone 6 device: HybridTest\stabilized_video\D06_V_outdoor_move_0001.mov.
As expected, we get a negative match, for the reasons explained before. For the time being, Amped Authenticate cannot deal with stabilized videos.
Luckily, our questioned video.mp4 is not affected by digital stabilization. You can verify that the same holds for the two videos in the \reference folder, so we can assume that all the videos we’re dealing with in our task are NOT affected by digital stabilization.
Now, let’s go back to our main task. The questioned folder contains a still image and a video to be attributed. Since this article is about hybrid source identification, we will try to attribute the questioned image against reference videos, and the questioned video against reference images.
Source attribution on mixed media
This situation is called “hybrid” source identification in the scientific literature. Source identification is still possible, but we have to account for the transformation (scale and crop) undergone by the video. Good news: since update 16636, Authenticate supports an advanced PRNU matching algorithm that can iteratively search for the right combination of scale and crop! We just need to configure it properly.
Let us treat separately the two tasks:
- Attributing questioned video.mp4 using images of the reference devices
- Attributing questioned image.jpg using videos of the reference devices
Let’s start with the first case.
Source attribution of a video when reference images are available
We start by creating a CRP using the available reference images, the usual way. In our case, we create a CRP from images in folder reference\D01\images and another one from images in folder reference\D17\images.
Extracting frames
Then, we extract frames from questioned video.mp4. How many of them? Of course, the more the better. If the video has not been re-encoded after acquisition, using at least 500 frames should be enough in the general case. We then generate a CRP using the extracted frames (if you ran the stabilization test earlier, you have already computed a CRP from the first 500 frames of this video, so you can reuse that one). Let us now open the Compare two CRPs tool.
Comparison
We’ll have to test the evidence CRP we created against the two reference CRPs obtained from images. Let’s run the comparison against D17 first. We load the two CRPs. Important: we need to set the CRP generated from video frames as the “First CRP file:”, and the CRP generated from images as the “Second CRP file:” (don’t swap them! We’ll explain why later).
Scaling
Then, we click on the “Edit” button next to the “Advanced settings file” line. We need to enable the advanced search, so we just delete the “false” and replace it with “true”. Configured this way, the PRNU matching algorithm will scale the first CRP (the video one) with all scales ranging from 0.5 to 2, in steps of 0.01. Actually, we know that what should be applied here is an upscaling of the video CRP (because the sensor data was scaled down to reach the Full-HD video resolution, and we need to invert this). So we may restrict the search range by setting the minimum scale to 1 instead of the default 0.5. This will just save some time.
We may want to halve the scale step to conduct a finer search, lowering it to 0.005 (or even to 0.001, it will just take a bit longer). So here is how it reads after the proposed modifications.
{
    "Advanced Search Enabled" : true,
    "Minimum Scale" : 1,
    "Maximum Scale" : 2.0,
    "Scale Step" : 0.005,
    "PCE Cutoff Value" : 200,
    "Rigid Rotations (CW)" : [
        0,
        180
    ],
    "PCE Threshold Override" : 65
}
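To get a feeling for how these settings affect the amount of work, the Python sketch below parses the same settings text and counts the candidate scales. It assumes that both the minimum and the maximum scale are included in the search; how Authenticate actually enumerates the scales internally is not documented here, so treat the counts as an estimate.

```python
import json

# Same content as the settings shown above.
SETTINGS_TEXT = """
{
    "Advanced Search Enabled" : true,
    "Minimum Scale" : 1,
    "Maximum Scale" : 2.0,
    "Scale Step" : 0.005,
    "PCE Cutoff Value" : 200,
    "Rigid Rotations (CW)" : [0, 180],
    "PCE Threshold Override" : 65
}
"""


def candidate_scales(minimum, maximum, step):
    """List the scales from minimum to maximum (both included, by assumption)."""
    # Count steps with round() so floating-point noise cannot drop the last one.
    count = round((maximum - minimum) / step) + 1
    return [round(minimum + i * step, 6) for i in range(count)]


settings = json.loads(SETTINGS_TEXT)
scales = candidate_scales(settings["Minimum Scale"],
                          settings["Maximum Scale"],
                          settings["Scale Step"])

print(len(candidate_scales(0.5, 2.0, 0.01)))   # default range and step
print(len(scales))                             # range 1 to 2, step 0.005
print(len(candidate_scales(1, 2.0, 0.001)))    # range 1 to 2, step 0.001
```

Under that assumption, the default settings test 151 scales, the modified ones 201, and a step of 0.001 would test 1001. That is why the finer search takes a bit longer.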
We hit the “Ok” button and wait a few minutes. We’ll be presented with the output.
Negative! So let’s try against the other reference device.
Hooray! That’s quite a match! Besides the high PCE value, it is worth noting that this value was obtained with Scale = 2.0: according to published data (Table 2 of the linked paper), the Samsung Galaxy S3 Mini uses just that scaling factor when downsizing pixels to match the video resolution. So, this acts as a further confirmation. Based on these tests, we could say that the findings lend strong support to the hypothesis that the questioned video was acquired by device D01.
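For readers curious about what a PCE (Peak-to-Correlation Energy) value measures, here is a toy Python sketch following the definition commonly used in the PRNU literature: the squared correlation peak divided by the average squared correlation away from the peak. It is not Authenticate’s implementation. It uses tiny synthetic 16 x 16 patterns and a brute-force circular correlation, whereas real tools work on full-resolution noise residuals and use FFTs. All names and parameters are ours.

```python
import random


def circular_cross_correlation(a, b):
    """Correlate two equally sized 2D lists for every circular shift."""
    h, w = len(a), len(a[0])
    out = [[0.0] * w for _ in range(h)]
    for dy in range(h):
        for dx in range(w):
            total = 0.0
            for y in range(h):
                row_b = b[(y + dy) % h]
                for x in range(w):
                    total += a[y][x] * row_b[(x + dx) % w]
            out[dy][dx] = total
    return out


def pce(a, b, exclude=2):
    """Squared peak over the mean squared correlation outside the peak area."""
    corr = circular_cross_correlation(a, b)
    h, w = len(corr), len(corr[0])
    positions = [(y, x) for y in range(h) for x in range(w)]
    peak_y, peak_x = max(positions, key=lambda p: corr[p[0]][p[1]])
    energy, count = 0.0, 0
    for y, x in positions:
        # Circular distance from the peak along each axis.
        dy = min((y - peak_y) % h, (peak_y - y) % h)
        dx = min((x - peak_x) % w, (peak_x - x) % w)
        if dy <= exclude and dx <= exclude:
            continue  # skip a small neighbourhood around the peak
        energy += corr[y][x] ** 2
        count += 1
    return corr[peak_y][peak_x] ** 2 / (energy / count)


rng = random.Random(0)
size = 16
# Synthetic "fingerprint", a noisy observation of it, and an unrelated pattern.
reference = [[rng.gauss(0, 1) for _ in range(size)] for _ in range(size)]
same_source = [[v + 0.5 * rng.gauss(0, 1) for v in row] for row in reference]
other_source = [[rng.gauss(0, 1) for _ in range(size)] for _ in range(size)]

pce_match = pce(reference, same_source)
pce_other = pce(reference, other_source)
print(pce_match, pce_other)
```

The pattern that shares the fingerprint produces a sharp correlation peak and therefore a PCE far above that of the unrelated pattern, which is the behaviour the comparison tool relies on.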
Source attribution of an image when a reference video is available
Extracting frames
In this case, we’ll start by extracting frames from the two reference videos and creating a reference CRP from each of them. Then, and this may sound a bit strange at first, we need to create another CRP using the single evidence image. Just put questioned image.jpg in a folder alone. Load it in Amped Authenticate and click on the usual Create PRNU Reference Pattern button. Set that folder as the “Reference images” folder, and choose a proper output filename. After clicking Ok, you’ll be warned that using so few images is not good practice; just click Ok to close the message and wait for the process to finish.
Now that we have two CRPs, we can just copy-paste the procedure explained for the previous situation.
Comparison
Let us first check our image against the CRP obtained from D01’s reference video. As explained before, we have to set the CRP generated with video frames as the First CRP File, and the other CRP as the Second CRP File. This is what we obtain:
The PCE value is just above the threshold. When running the advanced search, billions of possible matches are tested. So, we may reasonably expect some false positives with a limited PCE value. Moreover, contrary to the previous case, the detected scale is not the one expected (we expected 2.00 for the Samsung Galaxy S3 Mini; here it’s 1.040, which is very far off). I would say that the above result lends very weak support to the hypothesis that the questioned image was captured with D01. In a case like this, I would redo the test using more frames to create the video’s CRP. Possibly, I would also use different reference videos, if available, to have further confirmation.
Let’s now test against D17.
Resizing
We see that, in this case, a simple resizing was enough to find a much stronger match. Why was only resizing needed, and no crop? Simply because this smartphone (a Microsoft Lumia 640 LTE) uses a 16:9 format for images as well. So, no crop is needed to reach the 16:9 format for the output video: scaling is enough. And even in this case, the detected scale matches the one known in the literature for this device (see Table 2 of the linked paper).
You may have noticed that, throughout our experiments, we didn’t consider just the PCE value but also the Scale when evaluating results, taking advantage of the fact that the expected scale for the two reference devices was made available by researchers. This is indeed a good way to reduce the chance of false positives. In case you are testing against a reference device for which the expected scaling factor is not known, you may consider obtaining some images and videos from a device of the same model and using those to run some tests. The scale value, indeed, does not depend on the specific exemplar. It rather depends on the model (and, possibly, on the firmware version).
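The reasoning above, which combines the PCE value with the detected scale, can be written down as a small Python helper. This is only an illustration of the logic, not a replacement for the analyst’s judgement. The threshold of 65 is taken from the “PCE Threshold Override” setting shown earlier, while the scale tolerance, the function name, the labels, and the PCE values in the examples are our own hypothetical choices.

```python
def assess_match(pce, detected_scale, expected_scale,
                 pce_threshold=65.0, scale_tolerance=0.02):
    """Combine the PCE value and the detected scale into a rough label."""
    if pce < pce_threshold:
        return "below threshold"
    if abs(detected_scale - expected_scale) <= scale_tolerance:
        return "above threshold, expected scale"
    # High PCE at an unexpected scale: treat as a possible false positive.
    return "above threshold, unexpected scale"


# Hypothetical PCE values; scales as discussed for the Samsung Galaxy S3 Mini.
print(assess_match(70.0, 1.040, 2.0))
print(assess_match(5000.0, 2.0, 2.0))
print(assess_match(10.0, 1.3, 2.0))
```

A result that is barely above the threshold and found at an unexpected scale deserves more testing, as suggested above, rather than a conclusion.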
Conclusion
A final note: you may be wondering why we had to create a CRP from a single image. Couldn’t we just load the CRP generated with videos in the PRNU Identification filter, and then load the questioned image as the evidence image? That would not work, because the PRNU Identification filter always transforms the evidence image to match the currently selected CRP. However, in this case, we need to do the opposite: we need to transform the CRP generated with the video and use the other one as the target. (That’s because the searching algorithm will stop when the upscaled CRP becomes bigger than the reference one.) By creating a CRP from that single image, we can put it in the second row of the Compare two CRPs tool, thus obtaining the desired situation.
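The stopping rule in parentheses explains why the order of the two CRPs matters, and a few lines of Python make it tangible. The sketch assumes the rule is simply “the scaled first CRP must still fit inside the second one”. The sizes are the illustrative ones used earlier in this article (a 1920 x 1080 video and a 4032 x 3024 image), not those of the test devices, and the function names are ours.

```python
def largest_fitting_scale(first_size, second_size):
    """Largest scale at which the first CRP still fits inside the second."""
    first_w, first_h = first_size
    second_w, second_h = second_size
    return min(second_w / first_w, second_h / first_h)


def search_is_possible(first_size, second_size, minimum_scale=1.0):
    """True if at least the minimum scale fits (assumed stopping rule)."""
    return largest_fitting_scale(first_size, second_size) >= minimum_scale


video_crp = (1920, 1080)   # illustrative Full-HD video CRP
image_crp = (4032, 3024)   # illustrative full-frame image CRP

print(largest_fitting_scale(video_crp, image_crp))
print(search_is_possible(video_crp, image_crp))
print(search_is_possible(image_crp, video_crp))
```

With the video CRP first, scales up to 2.1 can be explored. With the two swapped, even a scale of 1 does not fit, so an upscaling search would end before it starts.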
And that’s it! It was a bit longer than usual, but we hope you’ve found it worth the time. Stay tuned and don’t miss the next Tip Tuesdays. You can also follow us on LinkedIn, Twitter, Facebook or YouTube: we’ll post a link to every new Tip Tuesday so you won’t miss any!