Experimental Validation of Amped Authenticate’s Camera Identification Filter


We tested the latest implementation (Build 8782) of PRNU-based Camera Identification and Tampering Localization on a “base dataset” of 10,069 images coming from 29 devices (listed in the table below). We split the dataset into two:

  • Reference set: 1450 images (50 per device) were used for CRP estimation
  • Test set: 8619 images were used for testing. On average, each device was tested against approximately 150 matching images and approximately 150 non-matching images.

It is important to understand that, in most cases, we could not control the image creation process. This means that images may have been captured using digital zoom or at resolutions different from the default one, which makes PRNU analysis ineffective. Using EXIF metadata, we filtered such images out of the Reference set. However, we chose not to filter them out of the Test set: we prefer showing results that are closer to real-world cases, rather than tailoring the dataset to obtain 100% performance.
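For illustration, here is a minimal Python sketch of the kind of EXIF-based filtering described above. It uses Pillow; the `usable_for_crp` helper and its criteria are our own illustrative choices, not a description of Amped Authenticate's internals.

```python
from PIL import Image

EXIF_IFD = 0x8769            # pointer to the Exif sub-IFD
DIGITAL_ZOOM_RATIO = 0xA404  # standard Exif tag "DigitalZoomRatio"

def usable_for_crp(path, native_size):
    """Return True if the image looks usable for CRP estimation:
    native resolution and no digital zoom (illustrative criteria)."""
    img = Image.open(path)
    if img.size != native_size:  # resolution differs from the device default
        return False
    exif = img.getexif().get_ifd(EXIF_IFD)
    zoom = exif.get(DIGITAL_ZOOM_RATIO)
    # The tag may be absent; treat any ratio above 1 as digital zoom.
    if zoom is not None and float(zoom) > 1.0:
        return False
    return True

# Example: keep only candidates shot at the iPhone 4's 5 MP default size
# reference = [p for p in candidates if usable_for_crp(p, (2592, 1936))]
```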

Using the above base dataset, we carried out several experiments:

  • Experiment 1) testing the system on images “as they are”
  • Experiment 2) camera identification in the presence of rotation, resize and JPEG re-compression
  • Experiment 3) camera identification in the presence of cropping, rotation and JPEG re-compression
  • Experiment 4) discriminating devices of the same model
  • Experiment 5) investigating the impact of the number of images used for CRP computation.

Test results are shown using Receiver Operating Characteristic (ROC) curves and their Area Under the Curve (AUC). If you’re not familiar with ROC and AUC, you may want to take a look here.

Table 1: complete list of devices in the dataset.

| Make | Model | Type | Release year | Resolution |
|------|-------|------|--------------|------------|
| Apple | iPhone 4 | Phone | 2011 | 5MP |
| BenQ | DC4330 | Compact | ? | 3MP |
| Canon | Powershot A75 | Compact | 2004 | 3MP |
| Canon | Powershot s2is | Compact | 2005 | 5MP |
| Canon | Powershot SD630 | Compact | 2006 | 6MP |
| Canon | IXUS v2 | Compact | 2002 | 2MP |
| Casio | Exilim ex-z60 | Compact | 2006 | 6MP |
| Casio | Exilim ex-z75 | Compact | 2007 | 7MP |
| Casio | Exilim ex-z70 | Compact | 2006 | 7MP |
| DigCam | Sub-6MP | Compact | ? | |
| Fujifilm | Finepix A350 | Compact | 2005 | 5MP |
| HP | Photosmart 320 | Compact | 2002 | 2MP |
| Nikon | e-coolpix-s570 | Compact | 2009 | 12MP |
| Nikon | D70 | DSLR | 2004 | 6MP |
| Nikon | D80 | DSLR | 2006 | 10MP |
| Nikon | D100 | DSLR | 2002 | 6MP |
| Nikon | D300 | DSLR | 2007 | 12MP |
| Nikon | e2500 (Coolpix 950) | Compact | 1999 | 2MP |
| Nikon | e885 (Coolpix 885) | Compact | 2001 | 3MP |
| Nikon | E7900 (Coolpix 7900) | Compact | 2005 | 7MP |
| Nikon | E880 (Coolpix 880) | Compact | 2000 | 3MP |
| Nokia | n95 | Phone | 2007 | 5MP |
| Olympus | u760 | Compact | 2007 | 7MP |
| Olympus | E300 | DSLR | 2004 | 8MP |
| Panasonic | Lumix dmc-tz3 | Compact | 2007 | 7MP |
| Panasonic | Lumix dmc-fs15 | Compact | 2009 | 12MP |
| Pentax | Optios | Compact | 2003 | 3MP |
| Premier | DC-3320 | Compact | 2001 | 3MP |
| Sony | Cybershot | Compact | ? | 2MP |

Experiment 1: source device identification on the base dataset

Using the Base dataset, we ran experiments and stored the obtained PCE value for each matching case (that is, comparing an image against the CRP of its originating device) and each non-matching case (comparing an image against the CRP of a different device). Figure 1 shows how PCE values are distributed for the two cases: values are higher than 100 in most matching cases, while non-matching values spread between 0 and 25 in a seemingly exponentially decreasing fashion.

Figure 1: histogram of PCE values for matching cases (red bars) and non-matching cases (green bars)
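For readers curious about the statistic itself, below is a simplified numpy sketch of the textbook peak-to-correlation-energy computation: the cross-correlation between the image's noise residual and the camera fingerprint is evaluated over all circular shifts, and the squared peak is divided by the correlation energy away from the peak. This is only meant to convey the idea; Amped Authenticate's actual implementation is not public, and details such as normalization and peak search may differ.

```python
import numpy as np

def pce(residual, fingerprint, ignore_radius=5):
    """Textbook PCE: squared correlation peak over the mean squared
    correlation outside a small neighborhood of the peak."""
    r = residual - residual.mean()
    f = fingerprint - fingerprint.mean()
    # Circular cross-correlation via the FFT
    corr = np.real(np.fft.ifft2(np.fft.fft2(r) * np.conj(np.fft.fft2(f))))
    y, x = np.unravel_index(np.argmax(np.abs(corr)), corr.shape)
    peak = corr[y, x]
    # Exclude a small window around the peak from the energy estimate
    mask = np.ones(corr.shape, dtype=bool)
    mask[max(0, y - ignore_radius):y + ignore_radius + 1,
         max(0, x - ignore_radius):x + ignore_radius + 1] = False
    return peak ** 2 / np.mean(corr[mask] ** 2)
```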

Intuitively, what we need to do is set a threshold on the PCE value. When comparing an evidence image against the CRP of the device that supposedly generated it, we compare the obtained PCE against the threshold: if it is above the threshold, we say the image is compatible with the device; otherwise, it is not. Our experimental data helps us decide where to set the threshold. Setting it too low means declaring compatibility very often: we get many true positives (matching images correctly classified as matching), but also many false positives (non-matching images erroneously classified as matching).

Setting it too high brings the opposite problem: we remove many false positives, but we also lose many true positives. This is where the ROC curve comes in handy. We let the threshold span a large set of possible values and record the true positive rate and false positive rate associated with each value. Plotting false positive rate against true positive rate for every threshold yields the curve in Figure 2. For a threshold value of 26, the classification algorithm yields a 94% true positive rate at a 0.1% false positive rate. Keeping in mind that the Test set contains images that may have been zoomed, this is undoubtedly a positive result.

Figure 2: ROC curve for the base dataset
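To make the procedure concrete, here is a small, self-contained sketch that builds a ROC curve from PCE scores using scikit-learn and picks an operating point. The PCE values are synthetic stand-ins shaped like the distributions in Figure 1, not our experimental data.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)
# Synthetic stand-ins: matching PCEs mostly above 100,
# non-matching PCEs decaying between 0 and 25 (as in Figure 1).
pce_match = rng.normal(300, 120, 4000).clip(min=0)
pce_nonmatch = rng.exponential(5, 4000)

scores = np.concatenate([pce_match, pce_nonmatch])
labels = np.concatenate([np.ones(len(pce_match)), np.zeros(len(pce_nonmatch))])

fpr, tpr, thr = roc_curve(labels, scores)
print("AUC:", auc(fpr, tpr))

# Operating point: best TPR among thresholds with FPR <= 0.1%
ok = fpr <= 0.001
best = np.argmax(tpr[ok])
print("threshold:", thr[ok][best], "TPR:", tpr[ok][best], "FPR:", fpr[ok][best])
```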

Table 2 shows the performance achieved by the system on each device, using 26 as the threshold. The Panasonic Lumix DMC-TZ3, the Nikon Coolpix S570 and the dated Premier DC-3320 prove the most challenging devices for the system (true positive rate well below 90%). Performance on the other devices is very good. Notably, on our Test set the true negative rate never drops below 94%, and it is 100% for 17 devices.

| Make | Model | Source identification performance |
|------|-------|-----------------------------------|
| Apple | iPhone 4 | 99% (TP: 98, TN: 100) |
| BenQ | DC4330 | 99.5% (TP: 100, TN: 99) |
| Canon | Powershot A75 | 98% (TP: 98, TN: 98) |
| Canon | Powershot s2is | 100% (TP: 100, TN: 100) |
| Canon | Powershot SD630 | 99% (TP: 100, TN: 98) |
| Canon | IXUS v2 | 100% (TP: 100, TN: 100) |
| Casio | Exilim ex-z60 | 90.9% (TP: 82, TN: 100) |
| Casio | Exilim ex-z75 | 93% (TP: 91, TN: 95) |
| Casio | Exilim ex-z70 | 95.9% (TP: 92, TN: 100) |
| DigCam | Sub-6MP | 97.0% (TP: 95, TN: 99) |
| Fujifilm | Finepix A350 | 100% (TP: 100, TN: 100) |
| HP | Photosmart 320 | 99.5% (TP: 99, TN: 100) |
| Nikon | e-coolpix-s570 | 84.3% (TP: 69, TN: 100) |
| Nikon | D70 | 99.5% (TP: 100, TN: 99) |
| Nikon | D80 | 99.5% (TP: 99, TN: 100) |
| Nikon | D100 | 99.0% (TP: 100, TN: 98) |
| Nikon | D300 | 100.0% (TP: 100, TN: 100) |
| Nikon | e2500 (Coolpix 950) | 97.0% (TP: 94, TN: 100) |
| Nikon | e885 (Coolpix 885) | 98.0% (TP: 96, TN: 100) |
| Nikon | E7900 (Coolpix 7900) | 98.5% (TP: 100, TN: 97) |
| Nikon | E880 (Coolpix 880) | 98.5% (TP: 100, TN: 97) |
| Nokia | n95 | 100.0% (TP: 100, TN: 100) |
| Olympus | u760 | 98.0% (TP: 98, TN: 98) |
| Olympus | E300 | 97.0% (TP: 100, TN: 94) |
| Panasonic | Lumix dmc-tz3 | 76.8% (TP: 54, TN: 100) |
| Panasonic | Lumix dmc-fs15 | 92.4% (TP: 87, TN: 98) |
| Pentax | Optios | 100% (TP: 100, TN: 100) |
| Premier | DC-3320 | 84.8% (TP: 70, TN: 100) |
| Sony | Cybershot | 100% (TP: 100, TN: 100) |

Table 2: device-specific source identification performance (overall accuracy, with true positive and true negative rates in %)

Experiment 2: camera identification in the presence of rotation, resize and JPEG re-compression

In this experiment, images are processed and re-saved as JPEG before entering the source identification algorithm. Processing consists of a combination of resizing, rotation and lossy JPEG compression; the parameters characterizing each operation are shown in the following table.

| Processing | Parameters |
|------------|------------|
| Resizing | Bilinear algorithm, resize ratio randomly sampled between 50% and 150% |
| Rotation | Randomly chosen among 90, 180, 270 degrees |
| JPEG compression | Quality randomly chosen among 60, 70, 80, 90, 100 (standard JPEG quality scale) |
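This chain can be reproduced in a few lines of Pillow. The sketch below is our reconstruction from the parameters above, not the original test script.

```python
import random
from PIL import Image

def process(src_path, dst_path):
    """Apply the random resize / rotate / re-compress chain of Experiment 2
    (a reconstruction from the table of parameters above)."""
    img = Image.open(src_path)
    # Bilinear resize with ratio sampled uniformly in [50%, 150%]
    ratio = random.uniform(0.5, 1.5)
    w, h = img.size
    img = img.resize((round(w * ratio), round(h * ratio)), Image.BILINEAR)
    # Rigid rotation by a random multiple of 90 degrees
    img = img.rotate(random.choice([90, 180, 270]), expand=True)
    # Final JPEG compression at a random standard quality factor
    img.save(dst_path, "JPEG", quality=random.choice([60, 70, 80, 90, 100]))
```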

The overall performance is shown in Figure 3: while there is a noticeable loss in true positive rate compared to the tests on the Base dataset, the source identification system is still reliable, with an AUC of 0.94 and a true positive rate above 80% when allowing only a 1% false positive rate. Sticking to the same threshold as in the previous experiment (26), the system yields an 81.5% true positive rate at a 0.2% false positive rate.

Figure 3: source identification performance in the presence of resize, rotation and re-compression

To understand the impact of each processing step, we first restrict the analysis to those images that did not undergo any resizing. Results are shown in Figure 4: the AUC increases, and we obtain a 92% true positive rate at a 1% false positive rate. Since these values are very close to those in Experiment 1, we can argue that rotation and re-compression do not significantly hinder the performance of the source identification system.

Figure 4: results in the presence of rotation and re-compression only

Now, let’s compare results when images are resized, rotated and then compressed slightly or strongly (Figure 5). We notice that stronger JPEG compression negatively affects the performance of the system, leading to a loss of approximately 10% in true positive rate.

Figure 5: results in the presence of resize, rotation and slight (top) or strong (bottom) final compression

It is interesting to compare the effect of down-scaling (top plot in Figure 6) to that of up-scaling (bottom plot in Figure 6). Device identification is clearly harder to achieve when the image is downscaled (73% true positives at 1% false positives), while upscaling does not affect performance significantly (89% true positives at 1% false positives).

Figure 6: comparison between down-scaling (top) and up-scaling (bottom)

As to the rotation factor, Figure 7 shows that it has little influence: different rigid rotations do not affect the accuracy of the system.

Figure 7: impact of rotation factor

Experiment 3: camera identification in the presence of cropping, rotation and JPEG re-compression

This experiment is similar to Experiment 2, but resizing is replaced by cropping. Images are cropped, then rotated, and finally JPEG-compressed. Parameters for each processing step are given in the following table.

| Processing | Parameters |
|------------|------------|
| Cropping | Frame cropping, randomly chosen in the interval 0–20% of the image size |
| Rotation | Randomly chosen among 90, 180, 270 degrees |
| JPEG compression | Quality randomly chosen among 60, 70, 80, 90, 100 |
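A short Pillow sketch can likewise reproduce the cropping step. Note that “frame cropping in the interval 0–20%” admits more than one reading; the helper below splits the cropped fraction evenly among the borders, as one possible interpretation.

```python
import random
from PIL import Image

def crop_frame(img, max_fraction=0.20):
    """Remove a random frame from the image borders, up to 20% of each
    dimension (one possible reading of the parameters above)."""
    w, h = img.size
    f = random.uniform(0.0, max_fraction)
    dx, dy = round(w * f / 2), round(h * f / 2)
    return img.crop((dx, dy, w - dx, h - dy))
```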

Results are shown in Figure 8: performance is in line with that of the resize case (see Figure 3).

Figure 8: results in the presence of cropping, rotation and compression

As in the previous case, the final JPEG compression plays an important role: when the last compression is at high quality (top plot of Figure 9), the effect of cropping on performance is very limited. On the other hand, when the compression is moderate (bottom plot of Figure 9), the system loses around 10% in true positive rate.

Figure 9: influence of slight (top) and moderate (bottom) JPEG post-compression in the presence of cropping

Interestingly, the amount of cropping is not as relevant: as Figure 10 shows, whether a small (top plot) or substantial (bottom plot) amount of cropping is applied does not influence performance significantly.

Figure 10: impact of the amount of cropping (top plot: small amounts; bottom plot: substantial amounts)

Experiment 4: discriminating different exemplars of the same model

This experiment investigates whether the source identification system can discriminate between images generated by different exemplars of the same brand and model. We tested 5 different models, with two exemplars each, as detailed in the following table:

| Make | Model | Type | Year | Resolution | Exemplars |
|------|-------|------|------|------------|-----------|
| Apple | iPhone 4s | Phone | 2011 | 8MP | 2 |
| Canon | EOS 10D | DSLR | 2003 | 6MP | 2 |
| Canon | EOS 40D | DSLR | 2007 | 10MP | 2 |
| Casio | Exilim EX-Z70 | Compact | 2006 | 7MP | 2 |
| Olympus | D-560 | Compact | 2003 | 3MP | 2 |

During the experiments, 40 images per device were used to create the reference pattern; all remaining images from the same exemplar were used as the positive set. The negative set is composed of all available images coming from the other exemplar of the same brand and model, plus a random selection of images coming from different devices.

As Figure 11 shows, performance is fully in line with that of Experiment 1. This suggests that the source identification system effectively discriminates between images coming from different devices even when they share the same brand and model.

Figure 11: performance when devices of the same brand and model are considered

Experiment 5: investigating the impact of the number of images used for CRP computation

One key step in source device identification is CRP estimation. When the device is available, the best practice is to take several shots of a flat wall or a clear sky (so-called “flat-field” images), with no zoom and with little or no compression, and then use them for SPN estimation. In many cases, however, the device is not available: a reliable SPN estimate can still be obtained from “natural” images, but a larger number of them is recommended.
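To give an idea of what SPN estimation involves, here is a deliberately crude sketch based on scikit-image's wavelet denoiser: the estimate is the plain average of the noise residuals of the reference images. Production-grade estimators (including, presumably, Amped Authenticate's) add maximum-likelihood weighting and artifact suppression on top of this.

```python
import numpy as np
from skimage.restoration import denoise_wavelet

def estimate_spn(images):
    """Crude SPN/CRP estimate: average of wavelet noise residuals.
    `images` are same-sized 2-D float arrays in [0, 1] (e.g., one channel)."""
    residuals = [img - denoise_wavelet(img, rescale_sigma=True)
                 for img in images]
    return np.mean(residuals, axis=0)
```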

In this experiment, we investigate how the number of images used for SPN estimation impacts system performance. For most devices in our dataset we have natural images only, while flat-field images are available for three of them (Table 3). Note that, for this experiment, we excluded the Lumix DMC-TZ3 device, since it behaved as a negative outlier in Experiment 1 and would complicate the interpretation of results.

| Make | Model | Image type |
|------|-------|------------|
| Apple | iPhone 4 | Flat-field |
| Canon | Powershot A75 | Flat-field |
| Casio | Exilim ex-z70 | Flat-field |
| BenQ | DC4330 | Natural |
| Canon | Powershot s2is | Natural |
| Canon | Powershot SD630 | Natural |
| Canon | IXUS v2 | Natural |
| Casio | Exilim ex-z60 | Natural |
| Casio | Exilim ex-z75 | Natural |
| DigCam | Sub-6MP | Natural |
| Fujifilm | Finepix A350 | Natural |
| HP | Photosmart 320 | Natural |
| Nikon | e-coolpix-s570 | Natural |
| Nikon | D70 | Natural |
| Nikon | D80 | Natural |
| Nikon | D100 | Natural |
| Nikon | D300 | Natural |
| Nikon | e2500 (Coolpix 950) | Natural |
| Nikon | e885 (Coolpix 885) | Natural |
| Nikon | E7900 (Coolpix 7900) | Natural |
| Nikon | E880 (Coolpix 880) | Natural |
| Nokia | n95 | Natural |
| Olympus | u760 | Natural |
| Olympus | E300 | Natural |
| Panasonic | Lumix dmc-fs15 | Natural |
| Pentax | Optios | Natural |
| Premier | DC-3320 | Natural |
| Sony | Cybershot | Natural |

Table 3: list of devices used in Experiment 5, specifying the type of images available for SPN estimation.

We repeated Experiment 1 separately for these two groups, while iteratively increasing the number of images used for estimating the SPN, from 5 up to 50. At each iteration, a new group of images was added to the SPN estimation, retaining those used in the previous step. Results are shown in Figure 12 for the case where natural images are available, and in Figure 13 for the case where flat-field images are available.
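The iterative protocol can be sketched as a running average over the noise residuals, so that earlier images are retained as new groups are added. This is a minimal sketch building on the `estimate_spn` residuals above; `run_experiment_1` is a hypothetical placeholder for the evaluation step.

```python
import numpy as np

def incremental_spn(residuals, group_size=5):
    """Yield (n_images, running SPN estimate) every `group_size` residuals,
    reusing the running sum so previously used images are retained."""
    total = np.zeros_like(residuals[0])
    for n, r in enumerate(residuals, start=1):
        total += r
        if n % group_size == 0:
            yield n, total / n

# for n, spn in incremental_spn(all_residuals):  # n = 5, 10, ..., 50
#     run_experiment_1(spn)                      # hypothetical evaluation step
```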

The plots highlight two relevant facts:

  1. When natural images are used for SPN estimation, performance improves significantly until 30 images are used, then converges.
  2. When flat-field images are used for SPN estimation, convergence is much faster, so that using 20 or 50 images makes no relevant difference.

The above facts are fully in line with the theory of SPN estimation. It is well known, indeed, that a better estimate is obtained when reference images have little variance (as flat-field images do). When natural images are used, on the other hand, it takes more of them to reach the same SPN purity (which leads to better device identification). Intuitively, this is because there is more content-related information to be “averaged away”.

From a practical point of view, this experiment suggests that, if natural images are used for SPN estimation, at least 40 of them should be employed. This number can be lowered to 20 when flat-field images are used. We should consider, however, that for the flat-field case our dataset contained only three devices, which does not allow us to draw strong conclusions.

Figure 12: ROC curves obtained using different amounts of natural images for SPN estimation

Figure 13: ROC curves obtained using different amounts of flat-field images for SPN estimation
