Honeypots, Hackers and SIEMs, Oh My! - Part 2


If I recall correctly, we left off right about here:

latitude:49.84441,longitude:24.02543,destinationhost:honeypot-vm2,username:admin,sourcehost:xxx.xxx.xxx.xxx,state:null, country:Ukraine,label:Ukraine - xxx.xxx.xxx.xxx,timestamp:2022-07-28 15:37:
latitude:49.84441,longitude:24.02543,destinationhost:honeypot-vm2,username:admin,sourcehost:xxx.xxx.xxx.xxx,state:null, country:Ukraine,label:Ukraine - xxx.xxx.xxx.xxx,timestamp:2022-07-28 15:37:50
latitude:49.84441,longitude:24.02543,destinationhost:honeypot-vm2,username:morefailure,sourcehost:xxx.xxx.xxx.xxx,state:null, country:Ukraine,label:Ukraine - xxx.xxx.xxx.xxx,timestamp:2022-07-28 16:02:525
        

For those who did not read the inspired genius that was my first post, I'll summarize what this wall of text means.

  • I created an Azure virtual machine running Windows 10 and reconfigured its firewall rules, making it as easy as possible for people on the internet to find.
  • I used a custom PowerShell script on the virtual machine to send the IP addresses of any third-party login attempts through a geolocation API. (The categories in the above snippet are what I'll be working with in my Azure custom logs.)
  • I verified that my script was communicating with my Azure Log Analytics Workspace by generating several fake login attempts (spoiler alert - it was).

Now all that I had to do was configure my custom logs, pass them to Sentinel and plot the results on a map.

The second half of the project was more streamlined in several ways: it required more attention to detail, but far less of the "welp, the original tutorial isn't relevant anymore, figure it out" improvisation of the first half.

After I had launched my VM and script, I took a brief break from the tutorial to wait and see where my first attempted breach would come from in real time. A certain childish excitement set in and several minutes went by as my eyes bored into my screen, eagerly anticipating the moment when the first l33t hax0r would fall into my trap.

To my bemusement, the internet's most elite hackers weren't as interested in my honeypot as I had hoped, so I got up to make myself some tea. Of course, by the time I returned, the unwatched pot had boiled.

[Screenshot: the first failed login attempt appearing in the logs]

For those of you who didn't bring your magnifying glasses, the first country to try to get into my VM was apparently Finland (or someone working through a VPN routed through Finland).

Now that I was 100% sure that my VM was out on the open internet and attracting attention, it was time to proceed with the tutorial and set up my custom logs.

Part 3 - Log Configuration

Some of you diligent readers may be asking a simple question - "What had to be configured, exactly? All of the information you need is right in the text file!" ...and you'd be partially right.

Take a look at the example logs above. When you, the human, look at a chunk of text such as "latitude:49.84441,", you actually see several things at once without realizing it:

  • a string of text that spells the word "latitude", which you understand to refer to the angular distance used to designate a location north or south of the Equator;
  • a colon, which you understand separates the word "latitude" from the numerical value that follows;
  • a numerical value containing a decimal point, which you know is part of the number and not another separator;
  • a comma that separates the latitude-related information from any subsequent information (e.g. longitude-related information). In IT lingo, this comma is serving as a delimiter.

Computers make no inherent assumptions about the values of text unless they are taught how to, and that was the next task at hand - teaching the Log Analytics Workspace (LAW) to turn each "category:value," pair into a correctly parsed field, then replicating this for every data type I wanted to pass into my fancy-schmancy map. Some values, like the timestamp, even contain colons of their own, so the parser can't simply split on every colon it sees.
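To make this concrete, here's the "split on the comma delimiter, then on the first colon only" idea sketched in Python. This is just a rough illustration of the parsing logic, not what the LAW actually does under the hood:

```python
# A raw line like the ones produced by the honeypot's logging script.
raw = ("latitude:49.84441,longitude:24.02543,destinationhost:honeypot-vm2,"
       "username:admin,sourcehost:xxx.xxx.xxx.xxx,state:null,"
       "country:Ukraine,label:Ukraine - xxx.xxx.xxx.xxx,"
       "timestamp:2022-07-28 15:37:50")

def parse_log_line(line: str) -> dict:
    """Split on commas, then on the FIRST colon only, so that colons
    inside values (like the timestamp) survive intact."""
    fields = {}
    for chunk in line.split(","):
        key, _, value = chunk.strip().partition(":")
        fields[key] = value
    return fields

record = parse_log_line(raw)
print(record["latitude"])   # 49.84441
print(record["timestamp"])  # 2022-07-28 15:37:50
```

Naively splitting on every colon would have mangled the timestamp into three separate pieces; the custom log wizard has to make the same kind of distinction.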

Let's stick with latitude as an example. I wanted a numerical value to use as "latitude_CF" for my map, with "CF" being shorthand for "custom field". In order to do that, I needed the LAW to identify and extract that piece of information from the raw data supplied by my PowerShell script to Sentinel, without my assistance.

The first step in teaching the algorithm how to extract the raw data I wanted was opening the raw data.

latitude:60.17116,longitude:24.93265,destinationhost:honeypot-vm2,username:AZUREUSER,sourcehost:95.217.117.163,state:South Finland,label:Finland - 95.217.117.163,timestamp:2022-07-29 12:55:13        

The next step was highlighting the numerical value for latitude that I wanted to extract (60.17116) and designating its data type. In other words, I told the algorithm that I expected to receive a number as output.

[Screenshot: highlighting the latitude value in the custom log extraction wizard]

At this point, the LAW algorithm would take my example value, parse a bunch of other log values on its own and return what it thought were the correct values.

[Screenshot: the values the algorithm extracted from other log entries]

As the above example shows, the algorithm did pretty well with numerical values that a) came right after a colon and b) ended at a comma delimiter. I scrolled through the search results and confirmed that it had, in fact, identified the exact data points I wanted to extract.

This process then had to be repeated for every single data field I wanted to export to Sentinel, and the algorithm predictably needed a bit of help with certain values.

[Screenshot: extraction results that needed manual correction]

(The black boxes were my computer's IP address, which I've masked for obvious reasons).

For example, the algorithm inexplicably thought I wanted longitude when I was trying to extract the destination host.

Countries and labels with spaces in their names (such as "United States") also required a bit of extra editing - no bother, really. A little attention to detail and voila - a short time later, I had my custom fields defined!

I alt-tabbed back into my VM to see how things were going, and sure enough, the login attempts were now coming in by the hundreds! It was time to put my newly minted custom logs to the test.

First, I ran a query in the "Logs" section of my LAW. What I was hoping to see was data that had been extracted correctly (no empty values), categorized correctly (no longitudes where there should be latitudes, etc.) and parsed in real time.

It took a few tries and some tweaking with the query, but ultimately I got what I was looking for:

[Screenshot: query results showing the populated custom fields]

There were lots of attempts from a user called "的用户帐户" as well - Google Translate tells me that this means "user account" in Chinese.

Now that the LAW was identifying and extracting the information I needed it to through custom logs, I was ready to finally move on to creating my map!

Part 4 - Mapping the Data

In many ways, this was the most straightforward section of the entire project. In exactly one way, it was the least straightforward section of the entire project.

Mystery Alert: there is one issue that I wasn't able to solve (but was able to troubleshoot), so maybe you computer-savvy folks out there can let me know what exactly happened in the comments.

I leaned on Josh Madakor a bit for this part, as he had written a logs query that allowed the custom logs data to be plotted on a map in Sentinel.

FAILED_RDP_WITH_GEO_CL
| summarize event_count=count() by sourcehost_CF, latitude_CF, longitude_CF, country_CF, label_CF, destinationhost_CF
| where destinationhost_CF != "samplehost"
| where sourcehost_CF != ""

I get strong SQL vibes from these queries, by the way. Anyhoo, the mystery of this query is that it absolutely refused to generate a map and the resulting error message was so vague as to be un-Googleable.

I tried all the standard troubleshooting approaches - created a new map, verified that the other aspects of my data pipeline were working, etc. Nothing.

After a bit of messing around with the query itself, I found the source of the trouble: the last line, where sourcehost_CF != ""

The purpose of the above line is to exclude data where the sourcehost (the IP address of the attacker) has been left blank or hasn't been provided. For whatever reason, this line prevented the map from opening as it should have, and I haven't the slightest idea why.
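If anyone wants to try reproducing this, one workaround worth testing is KQL's built-in isnotempty() function in place of the comparison against an empty string (filtering before the summarize while you're at it, which is generally cheaper). I can't confirm this is what was tripping Sentinel up, but it expresses the same intent:

```kql
FAILED_RDP_WITH_GEO_CL
| where destinationhost_CF != "samplehost"
| where isnotempty(sourcehost_CF)
| summarize event_count=count() by sourcehost_CF, latitude_CF, longitude_CF, country_CF, label_CF, destinationhost_CF
```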

In any case, I just removed the line and went forward with my mapping. And wouldn't you know it, at long last Sentinel did exactly what I wanted it to.

[Screenshot: the attack map in Sentinel]

At this point, the majority of my decisions were now just aesthetic in nature, but I settled on the following:

  • the size of a circle on the map would correlate to the event count of failed login attempts (bigger circle = more attempts);
  • the "legend" of the map would display attacker IP addresses and the number of times they attempted to log in;
  • green is a nice color;
  • the map should refresh every five minutes.

And that was that! The only thing keeping me from tracking all of the internet's attempts to break into my honeypot at this point was the geolocation API, which only allows 2,000 free calls per day.

If 2,000 calls sounds like a lot to you, you have clearly never seen a China-based server try to brute-force its way into something that you own.

[Screenshot: a flood of failed login attempts from a single server]

I also decided to use latitude/longitude rather than the more general dot placement that Sentinel was using for countries. That way I could see the difference in multiple servers based out of the same country reflected on my map.

Conclusion

I loved this project for several reasons:

1) I got to actually work with several Azure services and learn how to navigate, configure and customize them - all that good stuff.

2) I got to troubleshoot and solve problems on my own, which is exactly what I would be doing if I were paid to track these security events in an Azure-based SIEM. Going off-road a bit meant I had to really understand the content of the tutorial and Azure itself in order to get everything to work, and that's good practice.

3) It gave me my first real look into PowerShell scripting.

4) It was totally free! (No, really, you can replicate this yourself and not pay a dime.)

Briefly coming back to Point 2, I also compiled the changes to the original tutorial and posted them in a comment on the YouTube video itself. It was excellent practice for me and hey, why not help a brother/sister out who's also trying to learn something new?

Within 24 hours, I was more than pleasantly surprised to receive a healthy helping of sweet, sweet validation.

[Screenshot: replies to my YouTube comment]

I'm taking that to mean I did something right.

Thanks for reading! As always, this post will be up on my blog as well. Stop over if you like reading the same things twice (but in a different font!) or need another Twitter account to follow.

- B

More articles by Ben Stewart

  • Things People Ask Me: Part 1
  • Building an Active Directory Home Lab with VirtualBox
  • Honeypots, Hackers and SIEMs, Oh My! - Part 1
  • Cybersecurity Plugins For WordPress: A Comparative Review
  • Web Scraping with Python & Beautiful Soup ...for Translation Work?
  • CAT Tools - What Are They Good For?
  • Why Native Speakers are (99.99%) Irreplaceable
  • Pipeline Blues, Part 1
  • What (Good) Translators Actually Do
  • Rising from the Ashes
