Understanding our R&D capabilities: Data Supply

30 September, 2022

By Freddie Lichtenstein and Alan Gould


“As with so many aspects of modern life, data is at the heart of everything we do. The generation, analysis and provision of data is a key part of our contribution to the fight against online crime.”

Our core mission is to provide investigators with the insights needed to safeguard victims of online exploitation, identify those responsible, and eliminate this threat for good.

We pursue this mission in several ways. Our efforts range from increasing industry awareness and education to collaborating on advanced R&D projects. We also supply actionable data to law enforcement agencies, both to support ongoing investigations and to strengthen their existing tools.

We apply our expertise in crawling, scraping and processing data at scale using techniques that range from our own proprietary image processing and analysis methodologies to best-in-class academic and commercial innovations.

How we supply data at CameraForensics

Supplying data to law enforcement agencies usually takes one of two forms. Data can be supplied by ‘pushing’ more intelligence directly to the user as we uncover it, or by being ‘pulled’ into other systems on demand to increase their effectiveness.

Pushing data to bolster the capabilities of Project Arachnid

Project Arachnid is a good example of how we supply data to better flag illegal material and identify new targets for investigative attention.

As our crawlers scour the open web, the hash of every image they encounter is compared to the hash list of known material that Project Arachnid has created. This tells us whether the image is already known to be harmful or whether it needs further assessment as potentially undetected abuse material.

If our crawled image is known to be harmful and is already on Project Arachnid’s list, we’ll send the online location of the image straight to Project Arachnid so that they can better understand the trajectory of imagery online.

If it isn’t on Project Arachnid’s list, there is a high possibility that it isn’t abuse material. In this case, we use a separate classifier to assess the nature of the image. If the classifier indicates that the image contains abuse material, it may well be new, previously undetected material. We then alert Project Arachnid and label the image as a high priority for appropriate action.
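The decision flow described above can be sketched in a few lines of Python. This is an illustration only, not our production pipeline: the names `triage`, `Action` and `Decision`, the `classifier` callable and the 0.9 threshold are all hypothetical, and SHA-256 simply stands in for whichever hashing scheme is actually used (the hash type is not specified here).

```python
import hashlib
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Set


class Action(Enum):
    REPORT_KNOWN_LOCATION = "report_known_location"  # hash is on the known list
    FLAG_HIGH_PRIORITY = "flag_high_priority"        # not listed, classifier flags it
    NO_ACTION = "no_action"                          # not listed, classifier does not flag it


@dataclass
class Decision:
    url: str
    action: Action


def triage(
    url: str,
    image_bytes: bytes,
    known_hashes: Set[str],
    classifier: Callable[[bytes], float],
    threshold: float = 0.9,  # assumed cut-off, for illustration only
) -> Decision:
    """Decide what to do with one crawled image."""
    # Assumption: SHA-256 is a stand-in for the real hashing scheme.
    digest = hashlib.sha256(image_bytes).hexdigest()

    # Known material: report where it was found.
    if digest in known_hashes:
        return Decision(url, Action.REPORT_KNOWN_LOCATION)

    # Unknown hash: ask a second, independent classifier (hypothetical callable
    # returning a score between 0 and 1).
    if classifier(image_bytes) >= threshold:
        return Decision(url, Action.FLAG_HIGH_PRIORITY)

    return Decision(url, Action.NO_ACTION)
```

In this sketch, only images whose hash is unknown ever reach the classifier, which mirrors the order of checks described above.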

Pulling data to support targeted intelligence

For users who want to deploy our crawling capabilities to investigate specific topics, or to generate intelligence on an investigative target, we can supply supporting data to aid their research by directing our crawling and scraping tools to targeted areas of the internet.

Automating this process, rather than searching manually, saves users a significant amount of time and dramatically increases the chances of locating key information. Providing relevant data at scale can also significantly improve the validation and assessment of novel approaches in data science and analysis. This leads to more robust findings and more efficient deployment of available resources in the fight against illegal activity online.
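As a rough illustration of what directing a crawl to targeted areas can look like, the sketch below filters a list of candidate URLs down to those within an agreed scope. It is a simplified example under stated assumptions: the function names `in_scope` and `filter_frontier`, the idea of scoping by domain and keyword, and the example URLs are all hypothetical and do not describe our actual tooling.

```python
from typing import Iterable, List, Sequence, Set
from urllib.parse import urlparse


def in_scope(url: str, allowed_domains: Set[str], keywords: Sequence[str] = ()) -> bool:
    """Return True if the URL falls inside the requested crawl scope."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Accept the domain itself or any of its subdomains, but not lookalikes
    # such as "notexample.org" for "example.org".
    domain_ok = any(host == d or host.endswith("." + d) for d in allowed_domains)

    # If keywords are given, at least one must appear in the path or query.
    text = (parsed.path + "?" + parsed.query).lower()
    keyword_ok = not keywords or any(k.lower() in text for k in keywords)

    return domain_ok and keyword_ok


def filter_frontier(
    urls: Iterable[str], allowed_domains: Set[str], keywords: Sequence[str] = ()
) -> List[str]:
    """Keep only the candidate URLs that are in scope, preserving order."""
    return [u for u in urls if in_scope(u, allowed_domains, keywords)]


candidates = [
    "https://forum.example.org/thread/cameras",
    "https://example.org/about",
    "https://other.test/cameras",
    "https://notexample.org/cameras",
]
targeted = filter_frontier(candidates, {"example.org"}, ["cameras"])
```

A scope filter like this is only one small part of a targeted crawl, but it shows how a request can be turned into a rule that runs automatically rather than being checked by hand.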

Responsibly using data

All the data that we supply to other organisations must meet strict criteria. We assess the need for all datasets we collect, and law enforcement users must also perform additional assessments before any data can be used.

Want to learn more? Read our blog on the importance of responsible data storage.

Operating with a victim-first mindset

Our core priority is always safeguarding victims of online exploitation. Our work supplying data to users and helping to enhance the performance of classifiers and crawlers is no different.

By reducing the time to insight, mitigating the risk of false positives, and supplying intelligence on trends as they circulate online, we help law enforcement agencies to drive positive change both nationally and worldwide.
