Can AI Be Used for Forensics and Investigations?

Introduction

I’ve always been quite skeptical about the use of AI for forensics and investigations, as you may have seen in some of my older posts. In recent years, however, most of the advancements in image and video enhancement, analysis, authentication, and tampering detection have been based on AI techniques. I don’t think we can keep excluding everything that has to do with AI, but it should be handled with extreme care. In this article, I will explain why, in some contexts, AI may be acceptable for forensics if used within tight boundaries and safeguards.

Some terminology

I want to define here a few terms and their meaning in the context of this article, whose aim is not to give a rigorously formal explanation, but mainly to describe in simple terms a rather complicated topic. So, please allow me some approximations with terminology.

I will use more or less interchangeably some of these words and their acronyms: “artificial intelligence” (AI), “machine learning” (ML), “deep learning” (DL), and “neural networks” (NN).

The terms have different meanings, but in the very high-level context of this article, it doesn’t matter much. I will tend to use the term “AI” since it has been broadly accepted by the general public, even though technically it’s very generic and not precise. Probably the most correct one to be used in this context would be “deep learning”.

I will also use the term “traditional algorithm” to refer to an algorithm that is not based on AI. A traditional algorithm is model-based: there is a certain understanding of a topic, and the algorithm turns this knowledge into a clear procedure. AI algorithms are data-based: we give an algorithm a training dataset to learn by example and, generally, there’s not a deep understanding of the process involved.

I will focus on the use of AI for forensic applications on image and video data. However, similar observations could be made for other kinds of forensics, such as audio forensics, computer forensics, mobile forensics, and probably also for some types of classical “physical” evidence. I also want to specify that in our context the word “forensics” refers to the approach and not the scenario. We consider “forensics” as a post-mortem examination of some data, in contrast to “surveillance”, which would be active real-time monitoring. Even though the word “forensics” implies use in a legal and judicial context, we are making a further division between what we call “investigative” and “evidentiary” use.

Investigative vs Evidentiary Use

Image and video forensics software is normally used for two very closely related applications: as a tool to get investigative leads (investigative use) and as a tool to analyze data to be presented as evidence in court (evidentiary use). I already expressed this concept in a previous article.

When you are doing image enhancement or analysis in an investigative context, you are just trying to get some “clue” from footage to advance the investigation. You just care about having an initial lead for getting more information. Investigative work is normally done by police investigators in the field or by intelligence agencies. While they need to get some actionable data, it is very unlikely that they will ever go to court to present it.

When you are working in a judicial context, your work must be ready to eventually become evidence to be used in court. Hence, your process must be more rigorous and formalized.

Many of our customers use our tools mainly for the investigation phase, but we obviously design them for the most stringent and demanding evidentiary use.

The pitfall of the division above is that the lines between the investigative and the evidentiary contexts are often blurred. What was initially used as a quick investigation hint may become the strongest evidence and, if the work was not done properly from the beginning, things can turn into a huge mess. Furthermore, even at a purely investigative level, getting on the wrong track can be worse than having no information at all.

Which kinds of image and video processing algorithms are acceptable for evidentiary use?

Image and video forensics are nowadays considered a specialty of digital forensics, which belongs to the big family of forensic science. Hence any kind of processing that we do must fit into a scientific evidence framework. We are now considering the most stringent case of evidentiary use. What can work as evidence will work for investigations, but the opposite is not necessarily true.

Very broadly speaking, we have two kinds of image processing: image enhancement and image analysis algorithms. We use the word “image” here for simplicity, but it could be replaced by “video”, or other kinds of “data” to be used in a forensic context.

Image enhancement algorithms take an input image and, applying on it a specific algorithm, create another output image, usually highlighting or attenuating some features according to the analyst’s needs. Here we use the word “enhancement” for simplicity, but some processing algorithms belong to what is called “image restoration”, or some other generic kind of processing. Typical examples would be contrast and brightness adjustment, interpolation, rectification, deblurring, denoising, sharpening, and so on. In a forensic context, the most common requests are to enhance a license plate or a face for subsequent identification, or to improve the quality so as to better understand the dynamics of an event.

Image analysis algorithms take an input image and, applying a specific algorithm, produce a different kind of data as output, often a decision on whether the image fits some predefined criteria. Usually, given enough time, this kind of analysis could be done by a human, so the computer simply automates and speeds up the process. Let’s take a few classical examples:

  • Given an image, detect if it contains CSAM (child sexual abuse material), weapons, or drugs
  • Given two images with faces, decide if they represent the same individual or not
  • Given an image with a face, identify the most likely matching individuals in a database
  • Given a long video, highlight moments of interest according to predefined rules, for example (there can be many more):
    • All the instances when a car is passing
    • All instances of a person with clothing of a specific color
    • Identify when someone leaves some baggage unattended
    • Detect when a car is driving in the opposite direction on a one-way street

Image authentication and forgery detection algorithms belong to the category of image analysis, but they have some peculiarities we’ll write about later.

What are the requirements for processes in an evidentiary context? I already spoke about these in this post, but I am adding some more information here.

  1. Accuracy: the processes should be as accurate and free from errors as possible, and if possible the error should be quantifiable. Thus the tools should be as free from bias as possible and should help limit the operator’s human bias.
  2. Repeatability: I should be able to do the analysis again tomorrow or in ten years, and be able to get the same results. If every time I repeat the analysis I get a different result, then this is not scientific evidence.
  3. Reproducibility: If I am the only person in the world able to get a specific result, then something’s wrong. Following my procedure, a properly qualified third party should be able to reproduce the results I obtained from the analysis.

How can the above be respected in practice? Choosing algorithms with the following features:

a) Explainable: the algorithms should be understandable and explainable by a properly competent operator. Given the critical context in which our work is done, it is necessary to understand the inner workings of the algorithms. This helps overcome the limitations of evaluating their accuracy only through “black-box testing”, which may not cover every possible scenario, and better guarantees reproducibility.

b) Validated: if possible the algorithms should have been accepted by the scientific community, for example having been published in a scientific journal after a peer review, or in an academic book. If such a reference is not available, a detailed enough explanation of their functioning, in order to be validated independently, may be acceptable. This allows estimating the accuracy.

c) Deterministic: in order to guarantee repeatability and reproducibility, the algorithms used shouldn’t have random components. Since some algorithms need a random number to start the process, a possible (acceptable) workaround is to fix the random number generator seed, in order to get exactly the same result every time the analysis is run.

d) No external data: the output should be based only on a combination of the input data and the algorithm; no data external to the case should influence the processing. Notice that this is very important both at the algorithmic level and at the human level in the subsequent analysis.

In general, following these rather simple guidelines should allow working in a forensically sound manner. If some of the points are not respected, the admissibility of the processing is definitely in doubt.
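
To make point c) a bit more concrete, here is a minimal Python sketch of the fixed-seed workaround. The function randomized_denoise is a made-up toy, not an algorithm from any real tool; it only stands in for a process with a random component. The idea is that, with a fixed seed, two runs on the same input give the same output, and a hash of the output makes this easy to check and to note in a report.

```python
import hashlib
import random


def randomized_denoise(pixels, seed):
    """Toy stand-in (hypothetical) for an algorithm with a random component.

    Each pixel is averaged with one randomly chosen neighbour. The private
    random.Random(seed) instance makes the choices depend only on the seed.
    """
    rng = random.Random(seed)  # local generator, independent of global state
    out = []
    for i, value in enumerate(pixels):
        # Pick the left or right neighbour, clamped to the valid index range.
        j = min(len(pixels) - 1, max(0, i + rng.choice([-1, 1])))
        out.append((value + pixels[j]) // 2)
    return out


def fingerprint(pixels):
    """SHA-256 digest of the result (pixel values must be in 0..255)."""
    return hashlib.sha256(bytes(pixels)).hexdigest()


evidence = [12, 200, 34, 90, 255, 0, 17, 64]
first = randomized_denoise(evidence, seed=2024)
second = randomized_denoise(evidence, seed=2024)

print(first == second)                              # True
print(fingerprint(first) == fingerprint(second))    # True
```

Note that a fixed seed only helps within the same software environment: library updates can change the sequence a generator produces, so it is prudent to record the software version together with the seed.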

Note that the points above regard only the acceptability of a specific algorithm. Usually, during a forensic analysis, multiple algorithms are used and combined. A proper description and recommendation for the full workflow will be the subject of a future article.

Let’s now dive into the main purpose of this article and see how AI impacts these different aspects.

Image and Video Enhancement with AI

As explained above, image enhancement takes an input image and produces an output image following a specific algorithm. Following the guidelines given above, should AI algorithms be acceptable? Looking at the recent advancements in technology, the visual results are impressive. However, we must be careful about the limits of AI technology.

In very general and informal terms, how is an image enhancement algorithm trained? We give the system a dataset with a lot of images. For each image, we have both a high-quality version and one or more low-quality versions. The network learns by example how to obtain a high-quality image from a “typical” low-quality image, according to the data on which it has been trained.

This causes the following issues:

  1. Influence of data external from the evidence. In practice, because of the training process, we are polluting the original evidence with new data, coming from an external source, and this impacts the admissibility of the evidence according to point d) “No external data”.
  2. Processing bias. This is very closely related to the point above. The kind of dataset used for the training of the model necessarily introduces some bias to the data. Let’s take an extreme example. If I train the system only with faces of people with moles, it will likely add moles to every subject, even if they didn’t have them in reality. Again, not good according to point d) “No external data”.
  3. Lack of explainability. AI processing is very complicated and learns by example. It is almost like a black box. We can’t go to court and explain the processing that has been done, because nobody knows (save maybe for very simple toy cases). While there is a part of research focusing on the explainability of AI, we are still far from a solution for our cases. This impacts the requirement a) “Explainable”: we may actually have all the details of the network, but it’s so complicated that nobody knows how it works in practice.
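
The first two issues can be illustrated with a deliberately tiny Python sketch. It is not a neural network: the “model” is just a lookup table learned from examples, and the degrade function and both datasets are invented for illustration. Still, it shows the core problem: the same degraded evidence gives different “enhanced” results depending on the training data.

```python
from collections import defaultdict


def degrade(value):
    """Hypothetical degradation: keep only 4 brightness levels (0..3)."""
    return value // 64


def train(high_quality_values):
    """Learn, for each degraded level, the average original value seen."""
    seen = defaultdict(list)
    for hq in high_quality_values:
        seen[degrade(hq)].append(hq)
    return {level: sum(values) / len(values) for level, values in seen.items()}


def enhance(model, degraded_pixels):
    """Replace each degraded level with what the training data suggests."""
    return [model[level] for level in degraded_pixels]


dataset_a = [0, 10, 20, 70, 80]       # invented darker training examples
dataset_b = [50, 60, 61, 120, 126]    # invented brighter training examples
evidence = [0, 1, 0]                  # the same degraded evidence in both runs

result_a = enhance(train(dataset_a), evidence)
result_b = enhance(train(dataset_b), evidence)

print(result_a)  # [10.0, 75.0, 10.0]
print(result_b)  # [57.0, 123.0, 57.0]
```

The detail that was lost in the degradation is not recovered from the evidence; it is filled in from whatever the training set happened to contain.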

The points b) “Validated” and c) “Deterministic” must also be considered, but they are no more critical than for traditional algorithms, save for the fact that the lack of explainability may also make the validation more difficult. A black-box approach can allow us to estimate whether an algorithm works correctly or not, but doesn’t help us understand why.

According to the above:

  • Image enhancement with AI shouldn’t be acceptable in general for evidentiary use.
  • Image enhancement with AI may be acceptable for investigative use, provided the following safeguards are in place:
    • Disclosure. The operator is extremely careful in making sure the results of the processing won’t be used as evidence in court later on, for example informing the stakeholders and clearly labeling the image with some wording such as “not for evidentiary purposes” or similar.
    • Education. The operator has been trained on the reliability and the pitfalls of using AI processed imagery, is aware of the potential biases introduced by it, both at the technological and human levels, and there are processes in place to mitigate them.

Note that this is my current view in today’s context and with today’s technologies. As the legal settings may change with time (slowly) and the technology evolves (fast) together with the understanding of AI (who knows if and when), the acceptability can change too.

Furthermore, there are a couple of additional considerations to make. They don’t change the situation but, in my opinion, they are important to close the circle.

  • Not all traditional algorithms are good either. Before the advent of AI, image processing algorithms were based on well-understood processes. Even in this situation, not all algorithms were suited for forensics. In forensic analyst circles, we sometimes get into heated debates on very specific topics, such as the right aspect ratio of a video or the appropriate choice of interpolation algorithm among nearest neighbor, bilinear, and bicubic. There are even people who go as far as saying not to use bicubic interpolation or other very basic techniques. That would be excessive in my opinion, since the images will in any case be processed and interpolated multiple times during acquisition, compression, decoding, and display. That said, an algorithm not based on AI is not automatically acceptable; it must still fit the constraints described above.
  • AI is not always under our own control. Modern devices, such as smartphones, automatically apply a lot of AI techniques to develop and improve the images even before saving them on the device. This is not under the control of the user. Does this mean that we cannot use that imagery as evidence? No, it means that our “original” files have been created in a way that includes some automatic processing that may have modified the appearance of reality, and an analyst must be aware of the potential limitations of the technology, whether they are caused by AI or by traditional image processing algorithms.
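
As a side note on the interpolation debate mentioned above, this is a minimal Python sketch of nearest neighbor and linear interpolation on a single row of pixels (the one-dimensional case of bilinear). The function names and the sample row are my own illustration, not taken from any specific tool. What matters is that every output value can be traced back to the input pixels with a formula anyone can check by hand.

```python
import math


def upscale_nearest(row, factor):
    """Nearest neighbor: each output sample copies one input pixel."""
    return [row[i // factor] for i in range(len(row) * factor)]


def upscale_linear(row, factor):
    """Linear interpolation: each output sample blends two input pixels."""
    last = len(row) - 1
    out = []
    for i in range(len(row) * factor):
        x = i / factor                 # position in input coordinates
        x0 = min(last, math.floor(x))  # left neighbour
        x1 = min(last, x0 + 1)         # right neighbour (clamped at the edge)
        t = x - x0                     # blending weight in [0, 1)
        out.append((1 - t) * row[x0] + t * row[x1])
    return out


row = [0, 100]
print(upscale_nearest(row, 2))  # [0, 0, 100, 100]
print(upscale_linear(row, 2))   # [0.0, 50.0, 100.0, 100.0]
```

Neither function uses anything other than the input row and a fixed rule, which is why this kind of processing fits the “Explainable”, “Deterministic”, and “No external data” points.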

Image and Video Analysis with AI

Let’s now talk about image analysis. As we have seen earlier, in this case, we have an input image (or video) and produce a different kind of information, often a decision or a classification. A few classical examples are face recognition, CSAM detection, and many different kinds of video analytics based on motion or other heuristics.

Since image analysis doesn’t impact the actual data used for evidence but merely aids its interpretation or speeds up work that could have been done by a human, the acceptability of these techniques for investigation and forensics is less critical than for image enhancement, but only if subject to the right safeguards.

  1. Decision support system. The system should help the analyst, not replace him. It should help the user focus his attention on the most probable targets, but the user should always have the last word. For example, a facial identification system may help the investigator identify the most likely matching faces for a suspect, but the actual analysis and decision should be made by a human with a proper explainable analysis. A face identification match given by an AI algorithm should never be used as forensic evidence in itself.
  2. Known reliability. The system should give some indication about the reliability of the result in general cases and/or the specific situation. This should help the analysts understand how much they can rely on the result given. What’s the usual reliability of the system? Does it work correctly 60% or 99.99% of the time? What’s the confidence level in a specific analysis?
  3. Limit human bias. Analysts should be aware of the possible bias induced by the system on the human user and take the proper steps to mitigate it. If the analyst receives a strong identification match from an AI system, it is very likely that he will be unconsciously biased towards a positive match. Similarly, the correct matching face may have been mistakenly discarded by the AI system, conditioning the user to ignore it. In this regard, it’s important to educate the user on the limits of the technology and on how it can condition the opinion of the operator doing the analysis, and to adopt bias mitigation techniques, such as linear sequential unmasking.
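
The first two safeguards can be sketched in a few lines of Python. Everything here is hypothetical: the two-dimensional “embeddings”, the subject names, the validation scores, and the threshold are invented, and real systems are far more complex. The sketch only shows the shape of the idea: the system returns a ranked shortlist for a human examiner rather than an identification, and its error rates are measured on a labelled validation set.

```python
import math


def cosine_similarity(a, b):
    """Similarity between two feature vectors, in the range -1..1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def shortlist(probe, gallery, top_k=3):
    """Return the top_k most similar gallery entries, best first.

    The output is a ranked list for human review, not an identification.
    """
    scored = [(name, cosine_similarity(probe, vec)) for name, vec in gallery.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def error_rates(scores, same_person, threshold):
    """False match and false non-match rates on a labelled validation set."""
    pairs = list(zip(scores, same_person))
    false_match = sum(1 for s, same in pairs if s >= threshold and not same)
    false_non_match = sum(1 for s, same in pairs if s < threshold and same)
    return (false_match / same_person.count(False),
            false_non_match / same_person.count(True))


# Invented data, for illustration only.
gallery = {"subject_1": [1.0, 0.0], "subject_2": [0.0, 1.0], "subject_3": [1.0, 1.0]}
candidates = shortlist([1.0, 0.1], gallery, top_k=2)
print([name for name, _ in candidates])  # ['subject_1', 'subject_3']

fmr, fnmr = error_rates([0.9, 0.8, 0.4, 0.7, 0.2],
                        [True, True, True, False, False],
                        threshold=0.5)
print(fmr, round(fnmr, 3))  # 0.5 0.333
```

Reporting the error rates next to the shortlist gives the analyst an idea of how much weight the ranking deserves before the actual comparison is done by a human.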

The considerations above are valid for both investigative and evidentiary use but, of course, they are more critical for the latter. It must be said that similar concerns could also be present for analysis done with traditional techniques, but the use of AI makes it more difficult to understand what’s happening.

Image and Video Authentication with AI

Let’s now turn to authentication and tampering detection of images and videos. While this is, to an extent, a special case of image analysis, there are some aspects that also make it similar to image enhancement. In fact, the result of the processing is often another image that the user should be able to interpret.

Traditional image authentication algorithms exploit the fact that we have a pretty good understanding of how an image is generated inside a device and how manipulation is performed. In the past, manipulation mostly relied on traditional techniques such as smudging, changing colors, cloning, and splicing images. We can exploit the traces left by these processes on the image to determine its processing history.

All the newest kinds of processing, often informally called “deep fakes”, use some sort of AI for the manipulation. Since we don’t understand very much about how these algorithms work, the most used way of detecting them is to train another AI system to do it for us. A battle between two AIs sounds pretty epic, but it’s not as cool to observe as it sounds. 

For now, it’s basically an arms race between new deep fake tools popping up here and there, and newly published methods in literature to detect them.

The considerations are the same as for the other image analysis algorithms. We can use AI algorithms for tampering detection as a clue for further inspection and maybe as a hint that something is wrong. But having something like “98% probability of deep fake detected” alone, without the possibility of analyzing what’s behind this decision, can only provide a limited degree of support as evidence of tampering. With traditional techniques, on the other hand, traces of tampering, when present, were rather reliable evidence of manipulation.
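
A small worked example in Python may help show why a single number offers limited support. The figures are assumptions made up for illustration: I read “98%” as a detector that flags 98% of fakes and wrongly flags 2% of genuine images, which is not necessarily what a given tool means by its score. Under these assumptions, Bayes’ theorem gives the probability that a flagged image is actually fake, and it depends heavily on how common fakes are among the images being examined.

```python
def posterior_fake(prior, true_positive_rate, false_positive_rate):
    """Probability that a flagged image is fake, by Bayes' theorem."""
    flagged_and_fake = true_positive_rate * prior
    flagged_and_real = false_positive_rate * (1.0 - prior)
    return flagged_and_fake / (flagged_and_fake + flagged_and_real)


# Assumed detector: flags 98% of fakes, wrongly flags 2% of genuine images.
for prior in (0.5, 0.1, 0.01):
    print(prior, round(posterior_fake(prior, 0.98, 0.02), 3))
# 0.5 0.98
# 0.1 0.845
# 0.01 0.331
```

With these invented numbers, if only one image in a hundred is fake, a flagged image is more likely to be genuine than fake. Without knowing the error rates and the context, the bare score says little.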

In a nutshell

I know this is a pretty long article, which went very quickly over wide and complex topics, but I think my overall point of view can be summarized in a rather simple way.

  • Image and video forensics can be for: 
    • investigative use
    • evidentiary use
  • Image and video processing algorithms can be: 
    • model-based (traditional)
    • data-based (AI)
  • Image and video processing algorithms for forensics can be divided into:
    • Image enhancement: algorithms that process an input image into an output image
    • Image analysis: algorithms that process an input image into something else, often a decision or a classification
  • Image and video processing algorithms for evidentiary use should have these characteristics:
    • Explainable
    • Validated
    • Deterministic
    • No external data
  • Image and video enhancement with AI
    • Shouldn’t be acceptable in general for evidentiary use for the following reasons:
      • Influence of data external from the evidence
      • Processing bias
      • Lack of explainability
    • May be acceptable for investigative use, with these safeguards:
      • Disclosure
      • Education
  • Image analysis with AI:
    • May be acceptable for both investigative and evidentiary use
    • Must have the following safeguards in place:
      • Decision support system
      • Known reliability
      • Limit human bias
  • Notes:
    • Not all traditional algorithms are good either
    • AI is not always under our own control
    • Image authentication is a particular kind of image analysis

I consider this topic a very important one; many governments are working on policies about the use, regulation, and acceptability of AI for forensics. I wanted to share my point of view, based on what I believe to be good scientific and forensic principles and my understanding of the topic. I’d love to hear from you if you have any comments, if you agree with me, or, even more, if you don’t.