Hello!
As always, it’s been an exceptionally busy period for me and Perspective Intelligence. Towards the end of 2024, we finalised what the ThreatLens attack surface intelligence solution looks like, and it is well underway in going from concept to real-life thing, particularly when it comes to data collection. In addition, Perspective Intelligence acquired a 25% stake in OSINT training company Kase Scenarios. Our new partnership will see me take on the Head of Training role. I’m working closely with Espen and the team to develop workshops and capabilities for in-person and live online training, hopefully later in 2025! Likewise, hosting webinars and interviews for the UK OSINT Community has continued apace, with more planned for the foreseeable future. If you haven’t yet signed up as a community member, why not?! It’s entirely free to join, and members get access to all of our events and receive email updates from the community, in addition to access to Slack or Discord, depending on location.
However, on to the topic at hand. Following on from the reception I got from my last couple of posts about approaches to doing OSINT, I wanted to find an example of putting some of the pieces together for a genuine use case. Towards the end of last year, there was a fascinating blog by Arden22 on profiling users of child sexual abuse material (henceforth referred to as CSAM). It’s vital, challenging work. However, it gave me an idea: By taking some of the information within the post, could you potentially leverage information from publicly available stealer logs to identify consumers of CSAM and subsequently identify them online by leveraging tools like Maltego?
As it turns out, yes, you can.
Now, while it’s almost against my better judgement, for the sake of those not yet convicted in a court of law, the contents here will predominantly be censored, both because I don’t want someone getting their head caved in if I’m wrong (always a chance) and because I wouldn’t want to jeopardise any potential law enforcement investigation or action. This post isn’t about a witch hunt; it’s about proving a methodology, and I’ve already provided information to law enforcement to encourage action being taken.
This post is also a follow-on from something I shared on all the socials at the time – Subsequently, I’ve gained access to some different datasets thanks to some new additions to the data hub in Maltego, with whom I’ve recently signed on as a partner. So, using Maltego to help catch bad guys is top of my to-do list! We’ve also added significantly more data to the ThreatLens repository for data breaches and stealer logs, so it’s always worth revisiting something to see if there may be something else to shake from the tree.
So before we dive in, I need to state the obvious. There is nothing illegal in this post; there is no exposure to awful or abhorrent material or inferences, and this research can be done entirely without ever considering accessing CSAM sites. If you are stupid enough to do that, then the law will not protect you if found out. You’ve been warned. The approach to this research was as follows:
- From the original blog, can we use some information from already obfuscated URLs to identify stealer log records where they may appear?
- Where logs exist within our data, can we subsequently find other identifiers, such as an email address?
- From there, can we take an email address and leverage data connections within Maltego to identify a real-world individual?
So, let’s get into it, shall we?
What is an Information Stealer, and Why Would We Use Them?
So, for those unaware, I think it’s incredibly important that we first identify what information stealer malware is, what the logs contain and why we would look to leverage them for OSINT investigations and broader cybersecurity. Information Stealers are, as the name would suggest, capable of stealing information from an infected device. The information varies from stealer to stealer, but on the whole, the logs will contain:
- Login Details stored in web browsers
- Cookies from web browsers
- Installed applications and operating system details for the device
- Interesting files that appear in certain directories on the device
- IP address and country of the device
- Cryptocurrency wallet information
- In some instances, screenshots of the desktop or other activity
My main use case for this data is actually cybersecurity. If I can identify client information within a stealer log, then I know there is a current and very real threat from an infected device, and that remediation steps are needed to prevent the initial infection from leading to something worse, which can include ransomware. What’s most important here, though, is that in most cases it’s unlikely to be a corporate device that has been infected. Stealers are usually delivered via phishing emails as part of malicious documents, or within pirated content, including software, movies, cheats for games and, of course, pirated adult content.
Predominantly, an individual’s corporate credentials get captured by an infostealer because the victim has stored their passwords within a browser profile and has logged in to that profile on both their work and personal devices. This can make an exposure very difficult to identify if you don’t have the capability to detect it (insert plug for ThreatLens by Perspective Intelligence here), and even then, there’s not much you can do by way of containment. Once the data’s out there, it’s gone, and you need to be quick to implement mitigations.
Simply, the best form of defence against infostealer malware is to use a dedicated password manager such as 1Password, Bitwarden or KeePass, in addition to having multi-factor authentication in place. This will be very useful for stopping most cyber attacks, not just infostealers.
Now, the second use case for capturing this data as an OSINT/Threat Intelligence person is the incredible value we can derive when conducting person of interest (POI) investigations. The use case is very similar to why you would query something like LeakPeek or Dehashed for traditional data breach information: to enrich knowledge around account associations, pivot on password re-use and identify further email addresses or usernames of interest. Simply put, it’s an extra repository that is vital to doing a thorough job as an OSINT analyst. However, it is a time-consuming and laborious job to do this manually. The best approach, according to Michael Bazzell, and I agree with him, is to use a dedicated, well-equipped device supported by A LOT of SSD storage. If you can’t do that, then you will find either that data processing is slow or that you are constantly deleting things and reducing the chance of success when a new investigation requires some older data.
But where do we get this data from? Well, you have two options really. Telegram is still a hotbed for this kind of information being shared freely, and you also have the “naughty forums”, as I like to call them, i.e. cybercrime forums. Hopefully, this starts to illustrate why we may want to use a dedicated device for collecting this information.
But what does this look like in reality? Well, normally you get an archive file (usually .zip, .rar or .7z), which can contain anything from a single log through to millions of them. In some instances, we have seen downloads in the tens of gigabytes in size, which is a lot of data. The logs themselves also normally present a country code and an IP address:
Once you have an archive, you can see the data presented as such:
From the log itself, for most OSINT use cases, you will not care about running processes, software or system information, or the other folders. It’s mostly the exposed login credentials that are relevant. If that’s you, then there’s some good news from a collection standpoint. There are dedicated groups and channels that share files called ULP, which stands for URL, Login, and Password, and which effectively summarise the logs themselves. They tend to be smaller than log collections as a result, as they contain only the credential information. Of course, this lacks some wider context if you are more focused on the security angle, but for most OSINT investigations, these files may well be enough.
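To make the ULP idea a little more concrete, here is a minimal Python sketch of splitting one line into its three parts. It assumes a colon-separated `URL:login:password` layout, which is an assumption on my part: real files vary in separator and field order, and a password containing a colon would be split incorrectly. The names `UlpRecord` and `parse_ulp_line` are purely illustrative.

```python
from typing import NamedTuple, Optional


class UlpRecord(NamedTuple):
    """One credential entry: where it was used, and the login/password pair."""
    url: str
    login: str
    password: str


def parse_ulp_line(line: str) -> Optional[UlpRecord]:
    """Parse a single 'URL:login:password' line (assumed layout).

    Splitting from the right keeps colons inside the URL (scheme, port)
    intact. Returns None for blank or malformed lines.
    """
    line = line.strip()
    if not line:
        return None
    parts = line.rsplit(":", 2)
    if len(parts) != 3:
        return None
    url, login, password = parts
    if not url or not login:
        return None
    return UlpRecord(url, login, password)
```

Splitting from the right is a deliberate choice here, since URLs almost always contain colons of their own, whereas the trade-off is that unusual passwords can be mangled. Treat anything this returns as a lead to verify, not a fact.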
Using Stealer Logs to Identify Abusers
So now we understand what infostealers are, what the data includes and how to find it, how do we then leverage that for an investigation? Calling back to the original blog that inspired me, the idea now is to search through stealer logs to find references to what we know are CSAM domains (in this case, on the dark web). So, what’s the process for this? Well, depending on how you set up your data collection, you can use several tools. Grep, RipGrep and QGrep all sound very similar but are different. Grep is a traditional Linux command for searching through files and folders; RipGrep does the same job, but on steroids. And finally, QGrep is different again: it creates a database of the data you feed it and, in turn, allows you to query that data at incredible speed. For example, over 2TB of stealer logs could be searched through in a minute or two. Pretty rapid.
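For a feel of what these tools are doing, the sketch below is a deliberately naive Python stand-in for a recursive, case-insensitive grep over a folder of extracted logs. It is nowhere near as fast as ripgrep or qgrep and is not how I run searches at scale; it only illustrates the idea. The function name `search_logs` and the folder layout are hypothetical.

```python
import os
from typing import Iterator, Tuple


def search_logs(root: str, needle: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line number, line) for every line containing `needle`.

    Matching is a plain case-insensitive substring test. Files are read
    line by line so large logs never need to fit in memory.
    """
    needle_lower = needle.lower()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps going on odd encodings, which are
                # common in this kind of data.
                with open(path, "r", encoding="utf-8", errors="replace") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        if needle_lower in line.lower():
                            yield path, lineno, line.rstrip("\n")
            except OSError:
                # Unreadable file (permissions, broken link): skip it.
                continue
```

Calling `search_logs("/path/to/logs", "example.com")` yields one tuple per matching line, and because the file path comes back with each hit, you can go back to the surrounding log for context such as the IP address and country code.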
From the blog, I identified a partial URL for a CSAM site and, out of curiosity, wanted to see if we could identify a Tor site from that. Of course we could. We had collected the information between October and December 2024. As per my original post on LinkedIn, we could identify some IP addresses of interest (thanks to how stealer logs present their information). As it so happens, the day I read Bruno/Arden22’s blog was the day after I’d spoken with the incredible Kevin Metcalf, founder of the National Child Protection Task Force, for the UK OSINT Community, and in this instance, I was able to see some IP addresses belonging to US-based individuals.
So this is where we bring in Maltego and data transforms! And this is where I think it starts to develop into something really exciting…
Data Collaboration = Better Results!
So, let’s take a single IP from the stealer log repository that matched our known-bad CSAM site. (We could confirm the known-bad thanks to things like HTTP headers that sometimes get included within the stealer logs; to save everyone, just trust me when I say the site is beyond abhorrent.) When opening the Maltego Graph, we can leverage a whole range of options for research. In this instance, I elected to leverage DarkSide by District4Labs to see if we could find any prior data breaches that included this IP.
So, did we catch a predator? Not yet, dear reader… However, we did identify several email addresses from previous data breaches that could provide some insight here. Now, law enforcement could potentially correlate the data we have from the stealer log, which should give us an approximate time of infection, with data from the ISP. I can’t do that; I’m just a pillock with a hope and a dream. But what I can do is enrich this data further to provide some extra context for my friends in law enforcement should we get a true positive here. So which data sources would I first think to combine at this stage, I hear you ask? Well, I have two for you right off the bat!
- OSINT Industries
- Pipl
Why these two? Firstly, I think OSINT Industries is the best commercially available enrichment for email addresses currently. There are alternative options, but of those I’ve used that are available to the public, this one wins on both cost and breadth of data. Not only can we enrich the email address with account associations, but it’ll also pull back information on the user where possible. This can include profile images, usernames, phone numbers (usually partial results only) and more. This can be a superpower for finding an individual online quickly.
And Pipl is a very powerful tool for searching for people. It will often be able to tie an email address to a named individual, possibly with an image and other selectors, including other email addresses, full phone numbers and some social media profiles. Running these two transforms on an email address can save you literal hours, if not days, of manual investigative effort.
But what does that look like as a whole?
As you can see, this has quickly become quite a mass of data points to look at. In this case, we identified six named individuals. Any of them could be a predator, and equally likely, none of them could be. This is where I hand the data over, provide my methodology and the source information and offer support, if needed, to the law enforcement officers who could hopefully draw more concrete links. But in this instance, I think this was a compelling demonstration of OSINT For Good. But what about where we have other data to hand?
Catching a Predator
So back in the Summer of 2024, OSINTGuardian shared a series of blogs investigating CSAM abuse on the clear web and, through part of their investigation, obtained a breached database from a CSAM site. This breach was shared publicly, and thus, we’re able to search against it.
So, in this instance, I thought I’d look specifically for users based within the UK, based on email domains. Sure enough, there are more than a few within the data, and when using ripgrep we can also see the user ID, username, secret key and decrypted key. From an OSINT perspective, though, we’re looking at the email address and username.
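As a rough illustration of filtering on email domains, the Python sketch below flags addresses whose domain ends in `.uk`. That suffix test is my assumption about what "UK email domains" means in practice; it will miss UK-based users on global providers and says nothing certain about where someone actually lives. The helper names `email_domain` and `is_uk_email` are hypothetical.

```python
def email_domain(email: str) -> str:
    """Return the lower-cased domain part of an address, or '' if there is no '@'."""
    _, sep, domain = email.strip().rpartition("@")
    return domain.lower() if sep else ""


def is_uk_email(email: str) -> bool:
    """True when the domain ends in '.uk' (e.g. .co.uk, .org.uk, .ac.uk).

    This is a coarse heuristic for narrowing a dataset, not proof of location.
    """
    return email_domain(email).endswith(".uk")


def filter_uk(emails: list[str]) -> list[str]:
    """Keep only addresses that pass the '.uk' suffix test, preserving order."""
    return [e for e in emails if is_uk_email(e)]
```

A filter like this is only a way of narrowing a large dataset down to something reviewable; everything that follows still needs corroborating before it goes anywhere near law enforcement.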
So now we have a vaguely similar process to before. We’ll fire up Maltego, pick an email address from the breach at random and run a combination of OSINT Industries, DarkSide and Pipl against it:
If we zoom in on a particular result, we can see a whole bunch of information:
Not only do we have a profile image of the suspect, but we also have an approximate location, a prior review on Yelp and a partial phone number. Furthermore, if we leverage the Maltego data pass a little further, we can now run some facial recognition searches against the profile image, which in turn bring up some, shall we say, suggestive links.
We can also use data from OSINT Industries and DarkSide again to provide us with some further account associations and prior breaches, some of which are also of interest:
Now, we also have some social media links identified that we could leverage. When I initially looked at this individual, I was lucky to have access to CrimeWall by SocialLinks for a short period, and I was able to identify this individual’s LinkedIn profile. Subsequently, all information was shared with law enforcement again, along with methodology and guidance on how to replicate the steps.
Putting It All Together
So what would we do next? Well, fuller social media profiling is definitely plausible, and has likely been done or is being done by the relevant authorities. For us, we could do some more OSINT for social media profiles and, again leveraging Maltego, particularly some of the new data access like GeoSpy, look to identify locations from images in addition to our other enrichments. On the stealer log side, this is a use case that I think showcases exactly why anyone involved in #OSINTForGood should be aware of information like this, or actively collecting it, to support their investigations.
OSINT For Good
From my perspective, as a small business owner and firm supporter of the notion of OSINT For Good, my promise is that wherever possible, Perspective Intelligence will always support such investigations, and will provide as much support to law enforcement as we can reasonably afford. Whether it’s advice on tradecraft or providing data, we’re firm believers in doing the right thing and supporting the effort to stop child abuse and human trafficking.
I hope you’ve enjoyed this more targeted deep dive into a topic that combines a bit of everything: manual collection, automated research and genuine villains.
Until the next one.