Getting Started in OSINT
Background
I began working in Open Source Intelligence (OSINT) around 10 years ago. A lot has changed since, some of the major things being:
While the privacy features are good for users of social networks, they make our job as investigators slightly more difficult. Thankfully, the people we investigate often spread their data across several social media platforms, leaving multiple breadcrumbs that allow us to build a full profile of them.
Who conducts OSINT investigations?
When I started, most of the people I trained worked in an intelligence capacity and used their skills to gather information on people acting dishonestly.
Over the years, as intel teams have realised the wealth of information available at their fingertips, the range of people relying on OSINT as a vital tool in their day-to-day business has expanded.
It now includes people dealing with cyber threats, revenge porn, child exploitation material, serious fraud, counter terrorism and white supremacists; the list goes on. All it takes is one operational security misstep from the person of interest and the intel analyst can start to build a stronger case against them.
Nicky Hager recently gave an inspirational talk at Dataharvest, the European Investigative Journalism Conference, and highlighted the big issues needing urgent and lasting media attention. If you want to use your OSINT skills for investigative journalism, I recommend you read his keynote.
OSINT tools and techniques should always be used for good. Please don’t use them to stalk your ex, current or future love interest.
What fundamental knowledge do I need?
While not essential, a good grounding in internet fundamentals is important. Understanding what an IP address is, how DNS & SMTP work and how to register a domain name will stand you in good stead. The better your computer skills, the faster you can obtain intelligence and the more actionable it will be.
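As a small illustration of the IP address fundamentals mentioned above, the following sketch uses Python's standard `ipaddress` module to report whether an address is private or publicly routable, along with the name used for a reverse DNS lookup. The function name `describe_ip` is my own choice for the example, not part of any OSINT tool.

```python
import ipaddress


def describe_ip(value: str) -> dict:
    """Summarise basic facts about an IPv4 or IPv6 address string."""
    # ip_address() raises ValueError if the string is not a valid address.
    ip = ipaddress.ip_address(value)
    return {
        "version": ip.version,            # 4 or 6
        "is_private": ip.is_private,      # e.g. 10.0.0.0/8, 192.168.0.0/16
        "is_global": ip.is_global,        # publicly routable
        "reverse_dns_name": ip.reverse_pointer,  # name queried for a PTR record
    }


if __name__ == "__main__":
    for candidate in ("192.168.1.10", "8.8.8.8"):
        print(candidate, describe_ip(candidate))
```

A private address found in a log or email header tells you about an internal network rather than an internet-facing host, so it is worth checking before spending time on lookups.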
Cross-domain knowledge
What are the major OSINT domains?
There is an ever-increasing number of knowledge domains that you should be aware of when collecting OSINT. Each one is a deep rabbit hole that you can venture down, and some may not be applicable to your role, so choose wisely.
Maybe start with two or three topics and work your way up from there. I recommend OPSEC, gathering data from SOCMINT sources and using search engines like a pro.
Operational Security (OPSEC)
This is a massive topic in itself. The Grugq has an excellent online course which can teach you everything you need to know.
Procedures for evidence collection
The Association of Chief Police Officers (ACPO) provides the “Good Practice Guide for Digital Evidence”. While it's over 10 years old (and probably due an update), it still provides some sound principles:
A strategy should be developed and plugged into your OSINT process.
Choosing the right social media platform for the country you are targeting is important. Alexa can help with this by showing you which social network is the most popular in each country.
Select a name & nationality that matches the network / group you will be interacting with and won’t raise suspicion. Ensure that the name doesn’t match an employee in your company, or the target's (yes, this does happen).
There is no excuse these days for stealing a photo from, for example, a modelling or corporate website and re-using it for your sock puppet. There are tons of AI-generated photo websites that let you choose race, sex, age and eye colour, among other traits.
Ensure you backstop your profile appropriately. The depth and breadth of your cover will depend on how deeply it will be investigated. Read up on detecting fake profiles to ensure you are not making common mistakes.
Gathering Data from SOCMINT Sources (Social Media)
Using web-based tooling lowers the barrier to entry, as no computer science / programming skills are required. However, the data you enter into a website is visible to the site's owner, which may raise privacy concerns.
There are literally thousands of OSINT data gathering websites. One of the techniques I use to find the best ones is to select a social network, look at all the sites you can use to gather data about it, click on all the links and perform a quick assessment of each tool to determine:
This allows you to
The following websites have a huge list of resources that you could spend months trawling through - good luck!
Bellingcat’s Online Investigation Toolkit is also a great place to start.
As with the websites, there are thousands of different tools available. Unfortunately, the barrier to entry for downloading a tool, installing the necessary dependencies and getting it running is often a lot higher than just browsing to a website. This is where your computer science domain knowledge comes in.
One of my favourite tools, however, which doesn’t require a large degree of technical skill, is Maltego. If you pair it with the Social Links transform (requires a license + Maltego Classic), you can really start to dig into the relationships between individuals across a variety of social networks.
Using search engines like a pro
Being able to use a search engine efficiently is really important for narrowing down search results, particularly when you are searching for fairly common names. Techniques I use are:
Run the same search query across multiple engines. The results will often vary wildly, for example try the following:
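One way to run an identical query across several engines is to generate the search URLs in one go and open each in a browser. The sketch below is illustrative only: the choice of engines and their query-string formats are my assumptions based on each engine's public search page, and the function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL and query parameter name for each engine's search page.
ENGINES = {
    "Google": ("https://www.google.com/search", "q"),
    "Bing": ("https://www.bing.com/search", "q"),
    "DuckDuckGo": ("https://duckduckgo.com/", "q"),
    "Yandex": ("https://yandex.com/search/", "text"),
}


def build_search_urls(query: str) -> dict:
    """Return one ready-to-open search URL per engine for the same query."""
    return {
        name: f"{base}?{urlencode({param: query})}"
        for name, (base, param) in ENGINES.items()
    }


if __name__ == "__main__":
    # Quoting the name asks the engine for an exact phrase match.
    for engine, url in build_search_urls('"John Smith" Wellington').items():
        print(f"{engine}: {url}")
```

Keeping the query text identical makes it easier to compare what each engine returns for a common name.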
Cryptocurrencies
Crypto is the cybercriminal's currency of choice, so understanding the fundamentals is important before you start an investigation. Make sure you understand:
Encryption and encrypted messages
PGP is often necessary for communicating with people on the Darknet. Understanding public/private key pairs and how to encrypt messages is crucial.
Sometimes a target's surface-web email address or alias will be present within key metadata. Understanding how to extract this information can be helpful in an investigation.
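To show where that metadata lives, here is a minimal sketch that pulls User ID packets (the "Name <email>" strings) out of an ASCII-armored public key using only the standard library. It follows the OpenPGP packet layout (User ID is packet tag 13) but deliberately skips partial and indeterminate lengths, and the function names are my own. In practice a maintained OpenPGP tool or library is the safer option.

```python
import base64


def dearmor(armored: str) -> bytes:
    """Decode an ASCII-armored OpenPGP block into raw packet bytes."""
    lines = [line.strip() for line in armored.strip().splitlines()]
    start = lines.index("") + 1  # the blank line ends the armor headers
    body = []
    for line in lines[start:]:
        if line.startswith("=") or line.startswith("-----END"):
            break  # checksum line or end marker
        body.append(line)
    return base64.b64decode("".join(body))


def iter_packets(data: bytes):
    """Yield (tag, body) for each packet that has a definite length."""
    pos = 0
    while pos < len(data):
        header = data[pos]
        pos += 1
        if not header & 0x80:
            raise ValueError("not an OpenPGP packet header")
        if header & 0x40:  # new-format header
            tag = header & 0x3F
            first = data[pos]
            pos += 1
            if first < 192:
                length = first
            elif first < 224:
                length = ((first - 192) << 8) + data[pos] + 192
                pos += 1
            elif first == 255:
                length = int.from_bytes(data[pos:pos + 4], "big")
                pos += 4
            else:
                raise ValueError("partial body lengths not handled in this sketch")
        else:  # old-format header
            tag = (header >> 2) & 0x0F
            length_type = header & 0x03
            if length_type == 3:
                raise ValueError("indeterminate lengths not handled in this sketch")
            size = 1 << length_type  # 1, 2 or 4 length octets
            length = int.from_bytes(data[pos:pos + size], "big")
            pos += size
        yield tag, data[pos:pos + length]
        pos += length


def user_ids(data: bytes) -> list:
    """Return the text of every User ID packet (tag 13) in the key data."""
    return [
        body.decode("utf-8", errors="replace")
        for tag, body in iter_packets(data)
        if tag == 13
    ]
```

Calling `user_ids(dearmor(key_text))` on a key copied from a vendor profile returns whatever names and email addresses the key owner embedded when generating it.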
The image analysis space has advanced massively in recent years, mainly due to the integration of AI tooling.
Reverse image search is your first port of call and super simple to use:
Image metadata analysis (EXIF Data) can help identify:
Enhancing images like they do in the movies using Remini
Identifying images that have been modified using:
Identifying where a photo was taken, commonly known as GEOINT. This is based on the investigator's knowledge of:
Even determining the time of day a photo was taken by analyzing cast shadows using tools like suncalc.
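The shadow technique rests on simple trigonometry: the ratio of an object's height to the length of its shadow gives the sun's elevation angle. The sketch below assumes flat ground, a vertical object, and that both lengths can be estimated from the image in the same units. The function name is illustrative.

```python
import math


def sun_elevation_degrees(object_height: float, shadow_length: float) -> float:
    """Estimate the sun's elevation angle from an object and its shadow.

    Assumes the object is vertical and the shadow falls on level ground.
    """
    if object_height <= 0 or shadow_length <= 0:
        raise ValueError("height and shadow length must be positive")
    # tan(elevation) = height / shadow length
    return math.degrees(math.atan2(object_height, shadow_length))


if __name__ == "__main__":
    # A 2 m signpost casting a 3.5 m shadow.
    print(round(sun_elevation_degrees(2.0, 3.5), 1))
```

The resulting angle can then be compared with the sun positions a tool like suncalc shows for a candidate location and date, to narrow down the times at which the photo could have been taken.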
Mobile emulation
Mobile emulation lets you access social networks that are only accessible via a mobile phone, such as Telegram. Emulation options include:
With an emulator you can:
There are considerations, however, around receiving SMS messages, particularly for account sign-up. For this you can either use Twilio or subscribe to a Genymotion business account.
Network Discovery
Sometimes an individual might own a business that you are investigating, which will likely have a domain name associated with it.
Identification of domains & sub-domains associated with the business can be done through websites (quiet), brute-force (noisy), certificate transparency logs and the web archive.
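For the certificate transparency route, a log search service will typically let you export matching certificates as JSON. The sketch below assumes the export is a JSON array in which each entry has a `name_value` field holding newline-separated hostnames, which is the shape I understand crt.sh to return. Treat that field name and the function name as assumptions and adjust them to the service you use.

```python
import json


def subdomains_from_ct(json_text: str, domain: str) -> list:
    """Extract unique hostnames under `domain` from a CT log JSON export."""
    domain = domain.lower().rstrip(".")
    found = set()
    for entry in json.loads(json_text):
        # One certificate can list several names, one per line.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            if name.startswith("*."):
                name = name[2:]  # treat a wildcard as its parent name
            if name == domain or name.endswith("." + domain):
                found.add(name)
    return sorted(found)


if __name__ == "__main__":
    sample = json.dumps([
        {"name_value": "www.example.com\n*.dev.example.com"},
        {"name_value": "mail.example.com"},
    ])
    print(subdomains_from_ct(sample, "example.com"))
```

Because this only reads published certificate data, it stays on the quiet side of the quiet versus noisy distinction: nothing is sent to the target's own infrastructure.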
Historic DNS records allow you to identify origin servers and systems that have been decommissioned from DNS, but that might still serve content if accessed directly.
There are a variety of sites that provide precomputed port scans of servers. You provide an IP address or domain name and they return the results for you:
Whois History allows you to identify data that has since been masked (for privacy reasons) including:
Natural Language Processing
NLP is useful for automatic classification of text collected from OSINT sources. The main areas to read up on are:
The folks at OpenAI have also got some pretty impressive processing models for text (and generating ludicrous images) available.
Darknet monitoring
Depending on who you are investigating, they may have a presence on the darknet. Familiarisation with the Tor Browser Bundle and with which of the various drug markets, vendor shops, card shops, fake ID marketplaces or hacking forums are currently operational is useful. The darknet is a volatile place, with websites regularly either exit-scamming or being shut down by the feds.
One of the key things to look for is profile contamination. That's where the target of your investigation has used the same alias on the surface web and the dark web, potentially allowing you to glean important information about them.
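A simple way to spot candidate contamination is to normalise the aliases you have collected from each side and look for overlap. The sketch below is a starting point under my own assumptions: the normalisation rule (lowercase, strip everything except letters and digits) and the function names are illustrative, and a match is only a lead to verify, not proof that two accounts belong to the same person.

```python
import re


def normalise_alias(alias: str) -> str:
    """Lowercase an alias and drop separators such as '_', '.' and '-'."""
    return re.sub(r"[^a-z0-9]", "", alias.lower())


def shared_aliases(surface: list, darknet: list) -> list:
    """Return normalised aliases that appear in both collections."""
    surface_set = {normalise_alias(a) for a in surface}
    darknet_set = {normalise_alias(a) for a in darknet}
    overlap = surface_set & darknet_set
    overlap.discard("")  # ignore aliases that normalise to nothing
    return sorted(overlap)


if __name__ == "__main__":
    surface_web = ["Dark_Wolf99", "j.smith"]
    dark_web = ["darkwolf99", "vendor-x"]
    print(shared_aliases(surface_web, dark_web))
```

Aggressive normalisation increases false positives for short or common handles, so it is worth keeping the original spellings alongside any match for manual review.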
In order to process large volumes of information, scraping (automated harvesting) may become necessary. There are a few levels to this, starting with the most simplistic and moving into quite advanced topics:
You should also take into consideration proxy rotation to avoid hitting IP limits, and utilising VPNs to avoid any country blocks.
There are several scraping web services (scrape.do, ParseHub, Scraper API) available to automate the proxy rotation and geolocation services for you.
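If you manage proxy rotation yourself, the core idea is a round-robin over a pool of proxies so that consecutive requests leave from different addresses. The sketch below uses only the standard library. The proxy addresses are placeholders, the function name is hypothetical, and a real scraper would also need error handling and rate limiting.

```python
from itertools import cycle
import urllib.request

# Placeholder proxy endpoints; replace with proxies you are authorised to use.
PROXIES = [
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
    "http://10.0.0.3:8080",
]


def proxy_openers(proxies):
    """Yield an endless round-robin of (proxy, opener) pairs."""
    for proxy in cycle(proxies):
        handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
        # Each opener sends its requests through exactly one proxy.
        yield proxy, urllib.request.build_opener(handler)


if __name__ == "__main__":
    rotation = proxy_openers(PROXIES)
    for _ in range(4):
        proxy, opener = next(rotation)
        # opener.open(url, timeout=10) would fetch a page via this proxy.
        print("next request would use", proxy)
```

Building the opener does not contact the proxy. A connection is only made when `opener.open()` is called.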
When scraping content, it's likely you will come up against a CAPTCHA (a type of Turing test). There are two approaches to solving CAPTCHAs:
Image classification
Using Google Vision AI to automatically classify an image can help extract important information when processing large amounts of data.
A quick note on bias & manipulation when using AI to classify images:
Is there a process I should follow?
It’s worth getting together with your team and developing a strategic OSINT plan to determine:
Once your data has been collected and your report produced you can move onto the next target, making sure you take note of:
Want to learn more?
Social Media
One of the best ways to keep on top of new techniques is Twitter. There are several accounts that tweet out concise, awesome tips on a daily basis. I've created a list of the top OSINT accounts to follow.
Podcasts
The main podcasts I listen to are The World of Intelligence by Janes and the OSINT Curious Project; however, you will find a whole bunch of new and interesting OSINT podcasts out there.
YouTube
Because the content changes so rapidly, looking at up-to-date material is super important. Ensure any videos you are watching have been uploaded within the last year.
I found the following video recently which provides a great introduction: Open-Source Intelligence (OSINT) in 5 Hours - Full Course - Learn OSINT!
CTFs
There are a bunch of awesome OSINT / GEOINT CTFs out there. If you want to see what's possible in the GEOINT space, check out how quickly Rainbolt can identify a location from a still image.
Hacker conferences will often have a section of their CTF dedicated to OSINT & GEOINT challenges. Team up with a group that is competing and help them complete those challenges. If you want to practice beforehand, give the following a go:
Training
ZX Security provides training tailored to your organisation and your particular objectives. Whether you work in the blue team and want to monitor the darkweb for signs of data breaches or user credentials, or you need help honing your investigative skills - we can help.
Worthy Causes
If you want to practice your OSINT skills helping out a super worthy cause, check out the following initiatives.
Finally, a recruiting agency called OSINT Jobs recently appeared on the internet, dedicated to placing people in OSINT-related work. If your country isn’t listed, try searching for “OSINT” or “Intelligence Analyst” on your local job website.