Getting Started in OSINT
Image generated by DALL-E using the term "Open Source Intelligence"

Background

I began working in Open Source Intelligence (OSINT) around 10 years ago. A lot has changed since then. Some of the major changes are:

  • Better tooling - from custom Python scripts through to full-blown applications
  • Increasing volumes of information explaining how to gather OSINT - there are so many blogs, videos, tweets and training courses available.
  • Social networks, in particular Facebook (Meta) and Twitter, restricting what you can obtain via their APIs - these networks used to be almost completely open, and the amount of information you could obtain was astounding. Due to privacy concerns and abuse, access is now much more restrictive.

While the privacy features are good for users of the social networks, they make an investigator's job slightly more difficult. Thankfully, the people we are investigating often spread their data across several social media platforms, leaving multiple breadcrumbs that allow us to build a full profile on them.

Who conducts OSINT investigations?

When I started, most of the people I trained worked in an intelligence capacity and used their skills to gather information on people acting dishonestly.

Over the years, as intel teams have realised the wealth of information available at their fingertips, the range of people who look to OSINT as a vital tool in their day-to-day business has expanded.

That now includes people dealing with cyber threats, revenge porn, child exploitation material, serious fraud, counter terrorism and white supremacists; the list goes on. All it takes is one operational security misstep from the person of interest and the intel analyst can start to build a stronger case against them.

Nicky Hager recently gave an inspirational talk at Dataharvest, the European Investigative Journalism Conference, and highlighted the big issues needing urgent and lasting media attention. If you want to use your OSINT skills for investigative journalism, I recommend you read his keynote.

OSINT tools and techniques should always be used for good. Please don’t use them to stalk your ex, current or future love interest.

What fundamental knowledge do I need?

While not essential, a good grounding in internet fundamentals is important. Understanding what an IP address is, how DNS & SMTP work and how to register a domain name will stand you in good stead. The better your computer skills, the faster you can obtain intelligence and the more actionable it will be.

Cross-domain knowledge can also be really powerful. Expertise in a field, as an investigator for example, can be paired with some technical and statistics skills to allow you to operate at your full potential. The following Venn diagram is an adaptation where “computer science” has been replaced by the more apt “hacking skills”.

[Image: Venn diagram]

What are the major OSINT domains?

There is an ever-increasing number of domains of knowledge that you should be aware of when collecting OSINT. Each one of these is a deep rabbit hole that you can venture down. Some may not be applicable to your role, so choose wisely.

Maybe start with two or three topics and work your way up from there. I recommend OPSEC, gathering data from SOCMINT sources and using search engines like a pro as a good starting point.

Operational Security (OPSEC)

This is a massive topic in itself. The Grugq has an excellent online course which can teach you everything you need to know.

Procedures for evidence collection

The Association of Chief Police Officers (ACPO) provide the “Good Practice Guide for Digital Evidence”. While it's over 10 years old (and probably due an update), it still provides some sound principles:

  • Minimise evidence contamination
  • Know what you are doing
  • Document everything
  • Be responsible

A strategy should be developed and plugged into your OSINT process.
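
One way to put "document everything" and "minimise evidence contamination" into practice is to hash each captured file at collection time and log where and when it came from. The following is a minimal sketch and is not taken from the ACPO guide; the function name `record_evidence` and the log field names are illustrative assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone


def record_evidence(data: bytes, source_url: str, collected_by: str) -> dict:
    """Build a log entry for one captured artefact (field names are hypothetical)."""
    return {
        # The hash lets you show later that the capture has not been altered.
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "source_url": source_url,
        "collected_by": collected_by,
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    capture = b"<html>example capture</html>"
    entry = record_evidence(capture, "https://example.com/profile", "analyst-01")
    print(json.dumps(entry, indent=2))
```

Appending entries like this to a write-once log gives you a simple audit trail that can be plugged into whatever process your team settles on.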

Maintaining sock puppet accounts

Choosing the right social media platform for the country you are targeting is important. Alexa can help with this by showing you which social network is the most popular in each country.

Select a name & nationality that matches the network / group you will be interacting with and won’t raise suspicion. Ensure that the name doesn’t match an employee in your company, or the target's (yes this does happen).

There is no excuse these days for stealing a photo from, for example, a modelling or corporate website and re-using it for your sock puppet. There are tons of AI generated photo websites that let you choose race, sex, age and eye colour among other traits.

Ensure you backstop your profile appropriately. The depth and breadth of your cover will depend on how deeply it will be investigated. Read up on detecting fake profiles to ensure you are not making common mistakes.

Gathering Data from SOCMINT Sources (Social Media)

Using web-based tooling lowers the barrier to entry, as no computer science / programming skills are required. However, the data you enter into the website is visible to the site's owner, which may raise privacy concerns.

There are literally thousands of OSINT data gathering websites. One of the techniques I use to find the best ones is to select a social network, look at all the sites you can use to gather data about it, click on all the links and perform a quick assessment of each tool to determine:

  • Whether it’s still live or not
  • What information it collects
  • How fast it is
  • How stable it is

This allows you to:

  • Stay fresh to new tools and techniques
  • Whittle down a list of useful resources for the various social networks

The following websites have a huge list of resources that you could spend months trawling through - good luck!

Bellingcat’s Online Investigation Toolkit is also a great place to start.

As with the websites, there are thousands of different tools available. Unfortunately, the barrier to entry for downloading a tool, installing the necessary dependencies and getting it running is often a lot higher than just browsing to a website. This is where your computer science domain knowledge comes in.

One of my favourite tools, however, which doesn’t require a large degree of technical skill, is Maltego. If you tie this with the Social Links transform (requires a license + Maltego Classic) you can really start to dig into the relationships between individuals across a variety of social networks.

Using search engines like a pro

Being able to use a search engine efficiently is really important for narrowing down search results, particularly when you are searching for fairly common names. Techniques I use are:

  • Google Dorking to find sensitive information that probably shouldn’t have been indexed by a search engine
  • Search refinement using operators such as site:, inurl: and intitle:, quoted phrases, and Boolean and proximity operators like OR, AND, AROUND, + and -
  • Cached results for content that has been deleted

Run the same search query across multiple engines, as the results will often vary wildly. For example, try the following:

  • Google - site:https://tiktok.com intext:drone
  • DuckDuckGo - site:https://tiktok.com intext:drone
  • Bing - site:https://tiktok.com inbody:drone
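
If you run the same search often, generating the per-engine queries from one place saves retyping. The sketch below simply mirrors the three examples above; the search URL prefixes and the `build_queries` helper are assumptions for illustration, and operator support can change between engines over time.

```python
from urllib.parse import quote_plus

# Body-text operator per engine, mirroring the examples in the list above.
# The base URLs are assumptions about each engine's public search endpoint.
ENGINES = {
    "Google": ("https://www.google.com/search?q=", "intext"),
    "DuckDuckGo": ("https://duckduckgo.com/?q=", "intext"),
    "Bing": ("https://www.bing.com/search?q=", "inbody"),
}


def build_queries(site: str, keyword: str) -> dict:
    """Return {engine: (query, url)} for the same search on each engine."""
    results = {}
    for engine, (base_url, body_operator) in ENGINES.items():
        query = f"site:{site} {body_operator}:{keyword}"
        results[engine] = (query, base_url + quote_plus(query))
    return results


if __name__ == "__main__":
    for engine, (query, url) in build_queries("https://tiktok.com", "drone").items():
        print(f"{engine}: {query}\n  {url}")
```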

Cryptocurrencies

Crypto is the cybercriminal's currency of choice, so understanding the fundamentals is important before you start an investigation. Make sure you understand:

  • Wallets
  • Private Keys
  • The Blockchain
  • Mixing / Tumbling
  • NFTs

Encryption and encrypted messages

PGP is often necessary for communicating with people on the Darknet. Understanding public/private key pairs and how to encrypt messages is crucial.

Sometimes a target's surface-web email address or alias will be present within key metadata. Understanding how to extract this information can be helpful in an investigation.

Image analysis

The image analysis space has advanced massively in recent years, mainly due to the integration of AI tooling.

Reverse image search is your first port of call and super simple to use:

Image metadata analysis (EXIF Data) can help identify:

  • Time and date the image was taken
  • Camera make & model
  • Image latitude and longitude (GPS)
  • Technical photo-nerd stuff (Exposure, ISO, shutter speed)
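
EXIF stores GPS positions as degrees, minutes and seconds plus a hemisphere reference (N, S, E or W), whereas most mapping tools expect signed decimal degrees. The conversion is a small piece of arithmetic, sketched below. The function name is illustrative, and reading the raw values out of the image is left to whichever EXIF tool you use.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert an EXIF-style degrees/minutes/seconds value to signed decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern latitudes and western longitudes are negative in decimal notation.
    if ref.upper() in ("S", "W"):
        decimal = -decimal
    return decimal


if __name__ == "__main__":
    lat = dms_to_decimal(41, 17, 20.4, "S")
    lon = dms_to_decimal(174, 46, 37.2, "E")
    print(f"{lat:.6f}, {lon:.6f}")
```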

Enhancing images like they do in the movies using Remini

enhanced image using AI

Identifying images that have been modified using:

  • Clone detection
  • Error level analysis
  • Noise analysis
  • Luminance gradient
  • Shadows
  • Thumbnail image
  • Edit history

Identifying where a photo was taken, commonly known as GEOINT. This is based on the investigator's knowledge of:

  • Street signs
  • Road markers
  • Sidewalks
  • Utility poles
  • Buildings & architecture
  • Trees
  • Soil
  • License plates

You can even determine the time of day a photo was taken by analysing cast shadows with tools like suncalc.
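
The geometry behind shadow analysis is straightforward: the ratio of an object's height to the length of its shadow gives the sun's elevation angle, which you can then compare against the sun position shown for the suspected place and date. A minimal sketch, assuming flat ground, a vertical object and measurements in the same unit (the function name is illustrative):

```python
import math


def solar_elevation_degrees(object_height: float, shadow_length: float) -> float:
    """Estimate the sun's elevation angle from an object's height and shadow length."""
    if object_height <= 0 or shadow_length <= 0:
        raise ValueError("height and shadow length must be positive")
    # tan(elevation) = height / shadow length
    return math.degrees(math.atan2(object_height, shadow_length))


if __name__ == "__main__":
    # A 2 m pole casting a 3.5 m shadow.
    print(round(solar_elevation_degrees(2.0, 3.5), 1))
```

In practice you rarely know the true height, but the ratio can often be estimated from objects of known size in the frame, such as doors or street signs.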

Mobile emulation

Mobile emulation lets you access social networks that are only accessible via a mobile phone, such as Telegram. Emulation options include

With an emulator you can:

  • Spoof GPS coordinates, allowing you to appear anywhere in the world
  • Inject camera & audio
  • Perform runtime modification of applications
  • Easily access application files, databases & logs
  • Reap the user interface benefits - you can use a keyboard to type!

There are considerations, however, around receiving SMS messages, particularly for account sign-up. For this you can either use Twilio or subscribe to a Genymotion business account.

Network Discovery

Sometimes an individual might own a business that you are investigating, which will likely have a domain name associated with it.

Identification of domains & sub-domains associated with the business can be done through websites (quiet), brute-force (noisy), certificate transparency logs and the web archive.
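
Certificate transparency results are usually the quickest of these to process. The sketch below assumes you have already downloaded JSON records in the style returned by crt.sh, where each record has a `name_value` field holding one or more newline-separated hostnames; that field name and the `extract_subdomains` helper are assumptions, so adjust them to whatever your source actually returns.

```python
def extract_subdomains(ct_records: list, root_domain: str) -> list:
    """Collect unique hostnames under root_domain from certificate transparency records."""
    root = root_domain.lower().strip(".")
    found = set()
    for record in ct_records:
        # One certificate can list several names, separated by newlines.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # treat a wildcard entry as its parent domain
            if name == root or name.endswith("." + root):
                found.add(name)
    return sorted(found)


if __name__ == "__main__":
    sample = [
        {"name_value": "www.example.com\n*.dev.example.com"},
        {"name_value": "mail.example.com"},
    ]
    print(extract_subdomains(sample, "example.com"))
```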

Historic DNS records allow you to identify origin servers and systems that have been decommissioned from DNS, but that might still serve content if accessed directly.

There are a variety of sites that provide precomputed port scans of servers. You provide an IP address or domain name and they return the results for you:

Whois History allows you to identify data that has since been masked (for privacy reasons) including:

  • Contact people
  • Email addresses

Natural Language Processing

NLP is useful for automatic classification of text collected from OSINT sources. The main areas to read up on are:

  • Information extraction
  • Summarisation
  • Relationship extraction
  • Sentiment analysis
  • Author Identification
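
To make the first item concrete, the simplest form of information extraction is rule-based: pulling selectors such as email addresses and @handles out of collected text with regular expressions. This is only a sketch under that assumption. The patterns are deliberately loose and the `extract_selectors` helper is illustrative; real NLP pipelines typically use trained models for names, places and relationships.

```python
import re

# Loose patterns for illustration; they will not catch every valid address or handle.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
HANDLE_RE = re.compile(r"(?<![\w@.])@([A-Za-z0-9_]{2,30})\b")


def extract_selectors(text: str) -> dict:
    """Return the unique, lower-cased email addresses and @handles found in text."""
    emails = sorted({match.lower() for match in EMAIL_RE.findall(text)})
    handles = sorted({match.lower() for match in HANDLE_RE.findall(text)})
    return {"emails": emails, "handles": handles}


if __name__ == "__main__":
    post = "Contact jane.doe@example.com or DM @Jane_Doe99 on the forum."
    print(extract_selectors(post))
```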

The folks at OpenAI have also got some pretty impressive processing models for text (and generating ludicrous images) available.

Darknet monitoring

Depending on who you are investigating, they may have a presence on the darknet. Familiarisation with the TOR Browser Bundle and which of the various drug markets, vendor shops, card shops, fake ID marketplaces or hacking forums are currently operational is useful. The darknet is a volatile place, with websites either exit-scamming or being shut down by the feds regularly.

One of the key things to look for is profile contamination. That's where the target of your investigation has used the same alias on the surface web and the dark web, potentially allowing you to glean important information about them.
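
Aliases are rarely re-used character for character, so it helps to normalise them before comparing. The following is a rough sketch using Python's standard `difflib`. The helper names and the 0.85 threshold are assumptions to tune against your own data, and a match is only a lead to verify, not proof that two accounts belong to the same person.

```python
import difflib
import re


def normalise_alias(alias: str) -> str:
    """Lower-case an alias and strip everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", alias.lower())


def alias_similarity(first: str, second: str) -> float:
    """Return a 0..1 similarity score between two normalised aliases."""
    a, b = normalise_alias(first), normalise_alias(second)
    if not a or not b:
        return 0.0
    return difflib.SequenceMatcher(None, a, b).ratio()


def find_alias_overlap(surface_aliases, darknet_aliases, threshold=0.85):
    """List (surface, darknet, score) pairs whose similarity meets the threshold."""
    matches = []
    for surface in surface_aliases:
        for darknet in darknet_aliases:
            score = alias_similarity(surface, darknet)
            if score >= threshold:
                matches.append((surface, darknet, round(score, 2)))
    return matches


if __name__ == "__main__":
    print(find_alias_overlap(["Dark_Wolf88", "sunny.days"], ["darkwolf88", "vendor123"]))
```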

Automated harvesting & processing of content

In order to process large volumes of information, scraping (automated harvesting) may become necessary. There are a few levels to this, starting with the most simplistic and moving into quite advanced topics:

  • Level 1 - Command Line (e.g. wget + grep)
  • Level 2 - Scripted (e.g. Python BeautifulSoup, scrapy)
  • Level 3 - Browser driven (e.g. Selenium)
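
To give a feel for the scripted level, here is a small sketch that pulls every link out of a page you have already downloaded. It uses Python's standard library `html.parser` rather than BeautifulSoup or scrapy so that it runs without installing anything; the class and function names are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html: str, base_url: str) -> list:
    """Return all links found in the HTML, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = '<p><a href="/about">About</a> <a href="https://example.org/x">X</a></p>'
    print(extract_links(page, "https://example.com/"))
```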

You should also take into consideration proxy rotation to avoid hitting IP limits, and utilising VPNs to avoid any country blocks.
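
Proxy rotation can be as simple as handing out proxies from a pool in round-robin order, so that consecutive requests leave from different addresses. A minimal sketch of that idea follows; the `ProxyRotator` class is hypothetical, the proxy addresses are placeholders from a documentation range, and wiring the chosen proxy into your HTTP client is left out.

```python
from itertools import cycle


class ProxyRotator:
    """Hand out proxies from a fixed pool in round-robin order."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._pool = cycle(proxies)

    def next_proxy(self) -> str:
        return next(self._pool)


def assign_proxies(urls, proxies):
    """Pair each URL with the proxy it should be fetched through."""
    rotator = ProxyRotator(proxies)
    return [(url, rotator.next_proxy()) for url in urls]


if __name__ == "__main__":
    pool = ["http://203.0.113.10:8080", "http://203.0.113.11:8080"]  # placeholders
    pages = [f"https://example.com/page/{n}" for n in range(1, 4)]
    for url, proxy in assign_proxies(pages, pool):
        print(url, "via", proxy)
```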

There are several scraping web services (scrape.do, ParseHub, Scraper API) available to automate the proxy rotation and geolocation services for you.

When scraping content it's likely you will come up against a CAPTCHA (Turing test). There are two approaches to solving CAPTCHA

Image classification

Using Google Vision AI to automatically classify an image can help extract important information when processing large amounts of data.

[Image: example of Google Vision AI image classification]

A quick note on bias & manipulation when using AI to classify images:

  • If there is a bias in the training data it will make its way through the model
  • The people selecting the training sets will impart their biases and the model will include that

Is there a process I should follow?

It’s worth getting together with your team and developing a strategic OSINT plan to determine:

  • OPSEC - How you are going to stay safe when conducting an investigation
  • Target Selection - How you are going to select your targets
  • Deployment - What information you will collect and where you will collect it from. Analyse networks of interest and determine the data points available. The following diagram from the OSINT Dojo shows, for example, the information you can gather from LinkedIn.

[Image: OSINT Dojo diagram of the information available from LinkedIn]

  • Execution - Where will you store the collected information? How will you present your findings? Remember the goal of OSINT is to collect actionable, timely intelligence for dissemination to support decision making.

Once your data has been collected and your report produced you can move onto the next target, making sure you take note of:

  • Lessons learnt
  • Tools used
  • Relevant data sources
  • Consider how your profile was used and whether it has been burned. Start the process of creating new profiles if required

Want to learn more?

Social Media

One of the best ways to keep on top of new techniques is Twitter. There are several accounts that tweet out concise, awesome tips on a daily basis. I've created a list of the top OSINT accounts to follow.

Podcasts

The main podcasts I listen to are The World of Intelligence by Janes and the OSINT Curious Project, but you will find a whole bunch of new and interesting OSINT podcasts out there.

YouTube

Because the content changes so rapidly, looking at up-to-date material is super important. Ensure any videos you are watching have been uploaded within the last year.

I found the following video recently which provides a great introduction: Open-Source Intelligence (OSINT) in 5 Hours - Full Course - Learn OSINT!

CTFs

There are a bunch of awesome OSINT / GEOINT CTFs out there - if you want to see what's possible in the GEOINT space, check out how quickly Rainbolt can identify a location from a still image.

Hacker conferences will often have a section of their CTF dedicated to OSINT & GEOINT challenges. Team up with a group that is competing and help them complete those challenges. If you want to practice beforehand, give the following a go:

Training

ZX Security provides training tailored to your organisation and your particular objectives. Whether you work in the blue team and want to monitor the darkweb for signs of data breaches or user credentials, or you need help honing your investigative skills - we can help.

Worthy Causes

If you want to practice your OSINT skills while helping out a super worthy cause, check out the following initiatives.

  • FACT Aotearoa collect resources to help you and your loved ones Fight Against Conspiracy Theories.
  • SMAT App analyse hate and disinformation online. New social networks are popping up regularly, particularly in the extremist area, and SMAT is a great place to keep an eye on what's new. Their tooling is open source and you can support them via Open Collective.
  • Bellingcat has open sourced a bunch of their tools and often asks for collaboration on projects & tools. They also have a list of open questions for technical contributors.
  • Trace Labs work to identify the location of missing persons using crowd-sourced OSINT challenges.
  • Europol have an online platform which displays objects taken from the background of child sexual abuse images. They ask for help from the public to identify the objects, which might help crack a case.

Finally, a recruiting agency called OSINT Jobs recently appeared on the internet, dedicated to placing people in OSINT-related work. If your country isn’t listed, try searching for “OSINT” or “Intelligence Analyst” on your local job website.
