CSI Video Enhancement: Finally, Someone Is Getting It

I am finally starting to see some articles that nicely explain what we have been trying to convey for a long time regarding the video enhancement techniques you see in CSI-type shows.

I just discovered a very nice article on How-To Geek: Stop Believing TV’s Lies: The Real Truth About “Enhancing” Images. I have already tried several times, most recently with our blog post “The Untold Secrets of Forensic Video Enhancement: Myth versus Science”, to explain what is and isn’t feasible with enhancement, but the How-To Geek guys do an even better job.

I just want to cite and comment on a few parts that I really like:

It’s one of the most common tropes in television and movies, but is there any possibility a government agency could really have the technology to find faces where there are only blurry pixels? We’ll make the argument that not only is it impossible with current technology, but it is very unlikely to ever be a technology we’ll ever see. Stick around to see us put this trope under the lenses of science and technology, and prove it wrong once and for all.

This is very close to what I always preach: when video enhancement does not work, it is not a limitation of our technology; it is that what you are looking for simply isn’t there. And don’t wait for better technology, because this is a limitation technology cannot overcome. What is not there is not there. Period.

Then they add a very clear analogy with simple math:

Imagine you’re doing algebra homework on your computer. You plug in a series of numbers into your “Y = X+1” equation. First, X = 1, so 1 + 1 = 2. But what would happen if you pushed the wrong keys, and input the wrong numbers? Would you still get the correct answer? If you meant to say X = 1, but typed X = 11, would the computer still give you the correct answer? The question is, of course, preposterous. This is the concept of “Garbage In, Garbage Out.” In other words, the wrong data will give the wrong answer.

Like our equation, “enhanced” images are a function of the original image. When you start with a blurry or pixelated image (or even a sharp clean one, for that matter) no amount of filters or computer magic can coax information out of a place where the information simply doesn’t exist. Just as “1 + 11” will never result in “2,” a limited image will never result in the so-called “enhanced” version.
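The Garbage In, Garbage Out point can be made concrete with a tiny sketch (the numbers are purely illustrative, not from the article): once detail has been averaged away, two completely different originals produce the same degraded data, so no filter can tell which one was really there.

```python
# Two different originals collapse to the same downsampled data, so no
# "enhance" filter can decide which one was really in front of the camera.
detailed = [10, 200, 10, 200]    # a 1x4 "image" with high-contrast detail
flat     = [105, 105, 105, 105]  # a featureless 1x4 "image"

def downsample(row):
    """Average each pair of neighboring pixels (2x downsampling)."""
    return [(row[i] + row[i + 1]) // 2 for i in range(0, len(row), 2)]

print(downsample(detailed))  # [105, 105]
print(downsample(flat))      # [105, 105] -- identical: the detail is gone
```

Since downsampling is not invertible, any “enhanced” reconstruction is a guess among the many originals that map to the same data.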

But it gets better:

You might ask the question, “Isn’t it possible to create a function that can add detail to a bad image?” Well, we’re not likely to create one anytime soon. Simply because we recognize an arrangement of pixels as a face does not mean that it’s an actual face. The face part is our perception of that data—we are in fact only looking at data!

I have to say that I love these guys. I’ve never seen “non-industry” guys get and explain this so well. A digital image is first and foremost data; we see it with our eyes, and our brain recognizes it as an “image”.

Going on, it gets even more specific:

It may be possible to create some kind of face-like image from garbage image data, but this doesn’t mean that that product will be relevant. It might create a face that doesn’t actually look anything like the person that was actually there. It would more likely just create a mass of pixels that sort of just looks like a “different” version of what’s there.

This is the reason why we must be very careful about the algorithm we choose for enhancement. Some enhancement algorithms try to improve a face using basic knowledge of what an average face looks like. Even though these algorithms may give visually appealing results, they are very dangerous in a forensic setting and should not be used on evidence images, since it is very unlikely that the person we are looking for resembles the “average face”.

Then they proceed to explain “How to Know The Government Secretly Isn’t Doing this Impossible Thing”. With some knowledge of the industry, I could simply reply that if that were true, they wouldn’t buy our products. And luckily, we have been successfully on the market for quite a few years now…

They also add this interesting point of view:

If you can “enhance” an image by zooming in on a face in a crowd, why not go outside, take a snapshot of the sky, and “enhance” it to see the details on the ground of Pluto? If this was possible, an image—any image—could conceivably contain all the image data in the universe.

Pretty extreme, isn’t it?

But luckily, not everything is so bad, and they close the article with “our” point of view, trying to answer the question “Is Actual Useful Image Enhancement Possible?”

The difference is the data is already there—we’re just looking at it a different way. Our eyes can’t see (depending on your monitor) the detail in the face on the left. But the “enhanced” version on the right shows us plenty of detail in the shadow, giving us a better picture of his face.

This is exactly one of the purposes of our software Amped FIVE.

Overall, this article provides another perspective on what I wrote in my previous blog post:

The bottom line is this: we are analyzing the picture to get some information (identifying a face, reading a license plate…); if this information is present in the image but not visible because of some defect, within certain limits and with the proper scientific procedure, we are able to recover it and make it visible. If the information is not there in the original footage, we cannot (and we must not, since we are in a forensic context and not the photo editing or art world) add it or recreate it.

And now let’s come to the sad part: the user comments. I don’t want to start a flame war, and I can’t anyway, since for some reason comments on this article have been closed (otherwise I would have written mine there). I certainly don’t expect anybody who commented to be a forensics or photography expert, but I hate it when people pose as experts while saying completely wrong things, since this misinformation may deceive and confuse readers who are trying to get correct information.

This is very similar to another issue I often face: people think they can judge visual evidence with no special skill; it’s just a photo, after all, isn’t it? When I am contacted about a case, many people offer their opinion: “We can almost see the license plate. Shouldn’t be a problem with your software. This image is much better than the samples you have on your website.” Unfortunately, it is very difficult to give a correct assessment without the proper experience. Regularly, “I can almost see the license plate” means that you cannot go further than that, while some seemingly impossible cases may actually have a solution.

Getting back to the post, most of the commenters added some useful and relevant information, like this one:

I would have thought that it might be possible to analyse a series of images, ie a video clip, by performing some sort of auto-correlation on the data recorded. My memory from 1960 is of being a ‘probe monkey’ for aircraft resonance tests and the recorded jittery vibration traces being digitised and then analysed to reveal the natural vibration frequencies and associated decay rates of a structure. Basically it pulled out statistically using autocorrelation what was common along a trace and eliminated the noise that was also present in the trace. What can the FBI, GCHQ, etc., do with video?

Good comment! In fact, the most powerful techniques (like frame averaging and super resolution) combine information coming from multiple frames to get better quality.

But then comes a comment from another guy that is completely wrong.

Photographs can and are made legible and non fuzzy by a process called Interpellation. It’s been ued for over One hundred years

The term “Interpellation” is pretty funny by itself, but it’s clearly a typo, since “interpolation” is spelled correctly later.

If you have zoomed in until the photo becomes fuzzy you use interpolation to start placing similar pixils in between the real ones, For instance if after enlarging a photo of a face, and it becomes fuzzy you place lip colored or eye colored pixils nest to and in between real lip and eye pixils, can do it for the rest of the face.

Of course, interpolation is not new; it’s widely used in basically any software and device. But it doesn’t add any information you don’t already see in the original picture; it just makes the pixels “nicer” to our eyes. If between the lips and the eyes there was a spot on the face that was not visible in the original image, interpolation wouldn’t make it visible. Very roughly put, interpolation tells us something like this: if one pixel is white and another is black, the one in the middle should be grey. But we don’t know whether there was additional detail in the “missing” pixels.
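The white-to-grey-to-black idea above is literally what linear interpolation computes; a minimal sketch (with made-up pixel values) shows why it can never recover a detail that fell between the samples:

```python
# Linear interpolation just blends its neighbors; it cannot recover a detail
# that was never sampled. All pixel values here are purely illustrative.
def lerp(a, b, t):
    """Linearly interpolate between a and b at fraction t (0..1)."""
    return a + (b - a) * t

white, black = 255, 0
guess = lerp(white, black, 0.5)  # 127.5: grey, no matter what was really there

# Suppose the real scene had a dark spot exactly between the two samples:
true_missing_pixel = 30
print(guess, true_missing_pixel)  # interpolation confidently outputs the wrong value
```

Bilinear and bicubic interpolation do the same thing in two dimensions with fancier weights; none of them can distinguish a smooth gradient from a missed detail.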

You may have do use several different algorithms for different kinds of photos, Interpolation has been around since the Civil War,although not near as sophisticated as now. Your eyes use interpolation all the time and we don’t realize it, I figured the govt could do it during the Cuban Crisis. We almost had a nuclear war when the Russians started putting missiles in Cuba. Were we going to war based on those unintelligible pictures printed in the newspapers?. They were were labeled showing us which were the rockets and which were scaffolds etc, but I couldn’t recognize anything.

Mmm… I don’t think they enhanced the bad newspaper pictures; more likely, they simply made only the low-quality pictures publicly available. That makes much more sense.

For Lords sake, ask around or punch it up before posting something so misleading. Since many people don’t read the comments, I hope you post a correction.. By the way, photographers use it all the time. It is not a military secret..

Interpolation is definitely not a military secret, but it’s kind of crazy that one of the first well-written, complete, and correct articles on the topic is criticized by someone who doesn’t understand the issue at all.

But it gets even worse:

I thought it worth mentioning that a photo taken with a digital camera will have an embedded thumbnail image in the data. This image doesn’t change regardless of how you alter the actual photo, in photoshop or most any other program that will blur or pixel the image – The thumbnail will remain intact of the original image. In most cases this thumbnail can be ‘blown up’ and can be used to uncover what the user has blurred or blacked out, etc. So If the photo has this data, and most do – it’s nothing to use the thumbnail to restore the edited info back to something that can be used.

This is totally untrue and not even relevant to the content of the article, for the following reasons:

  • The embedded thumbnail is a low-resolution version of the original image (a typical resolution is 160×120 pixels), so unless the processed image has been heavily blurred or downsampled, it will still be better than the thumbnail. There are also other embedded images (usually called “previews”) with better resolution (a typical size is 640×480), but they are still much worse than the full image.
  • The thumbnail does not remain intact after editing. What usually happens is one of the following:
    1. The thumbnail and other metadata are stripped, so there is nothing useful.
    2. The thumbnail is replaced with a low-resolution version of the edited picture.
  • Most pictures on the internet don’t have a thumbnail or other useful metadata.
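The first point is simple arithmetic; a quick sketch using the typical sizes above (and an assumed 12-megapixel photo, a plausible but made-up example) shows just how little information a thumbnail holds:

```python
# Why an embedded thumbnail cannot "restore" an edited photo: it holds only a
# tiny fraction of the pixels. The 12 MP photo size is an assumed example.
full_image = 4000 * 3000   # a 12-megapixel photo
thumbnail  = 160 * 120     # typical embedded EXIF thumbnail
preview    = 640 * 480     # typical embedded "preview" image

print(full_image // thumbnail)  # 625: the photo has 625x the thumbnail's pixels
print(full_image // preview)    # 39: still ~39x the preview's pixels
```

In other words, even when a thumbnail survives, it carries well under 1% of the original pixel data, far too little to undo a blur or a black box on the full-size image.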

For completeness, it must be said that old versions of Photoshop and Paint.NET had a bug that didn’t update the thumbnail. In 2006 this bug affected a celebrity who posted a cropped version of a photo showing only her face, but fans who downloaded the picture could see her naked in the embedded thumbnail. But this is not the norm, and the bug has long since been corrected.

As usual, the Internet provides a wealth of information, but it is not always easy to tell correct information from garbage.