Putting Out Cyber Fires: A Conversation With Mjolnir Security
Incident response (IR) involves the procedures and technologies that an organization uses to identify and react to cyber threats, security breaches, or cyberattacks. Having a structured incident response plan allows cybersecurity teams to mitigate or avoid potential damage.
Incident response is quite a large topic to cover, so we brought in an expert to help simplify things. Milind Bhargava, the founder and CEO of Toronto-based Mjolnir Security, sat down with iPSS to talk about his cyber journey, the many services his company offers, and some of the challenges he’s faced when responding to cyber incidents.
*This interview has been edited for length and clarity.
Milind, thank you very much for taking the time to talk to us here at iPSS. I would like to start off by asking a little bit about you. What’s your background and how did you get into offering the cyber services Mjolnir Security offers today?
I now have about 15 years of experience in cybersecurity, and I've done pretty much everything across the board. My first job was pen testing.
Over time I tried to learn more about cybersecurity in general. I've always wanted to be a pen tester, but over time I realized I don't have the patience to be a pen tester. I have the patience to do other things that are a little more difficult and complicated in general, so that's where somebody introduced me to incident response.
At its core, incident response means that every time we go to a new client, it's a new problem. It's a new set of tools, a new technology stack, new everything. So, figuring out how somebody got hacked, with little to no information in most cases, and reconstructing the story from how it started to how it ended: that has always kept my interest.
Now that we know a little bit about your background, can you tell us about Mjolnir Security?
When I began Mjolnir, I used to work for another company. Part of my role in that company was to help onboard new clients for SOCs, SIEMs, those kinds of things.
One of the problems that I noticed in the industry was, for example, if you're trying to send logs from a client as part of a managed service, you have to buy hardware, then you have to work on shipping those logs, and so on. There's a lot of complicated infrastructure to set up, and if you’re transferring petabytes of data, you need that much storage on both ends to make the transfer.
I figured there has to be an easier way of doing this.
I figured out a way, using some existing tools and some tools that I built over time, to deploy a SOC in under 30 minutes, almost like ordering a pizza.
Now it's in the cloud.
Yes, a lot of other people also have a SOC in the cloud, but what actually made a difference, and how Mjolnir came to be this popular, is that we use it for incident response. Every time we get a call for an IR, we'll start deploying our tools.
Essentially, we're deploying a SOC for every client in those 30 minutes.
We’re able to take insane amounts of data, parse it, and have an answer ready for a client within a few hours, saying this is what has happened, this is what was going on, and so on. Any other company can take potentially weeks or months.
Can you walk us through the services you offer in relation to incident response?
Let’s start with the proactive side.
We do things like incident response playbooks or tabletop simulations. Both of these are essentially asking organizations, in case a “cyber fire” breaks out, what is your plan for today?
Typically, every organization medium-sized or larger will have a laminated placard on their desk saying, in case of fire or in case of chemical emergency or something else, dial this number, go to this floor, and so on.
We build the same for the cyber side of things.
Let's say you have ransomware. What is the first step? What do you do as a second step? Who do you reach out to? In what order do you reach out?
In case something happens, you just open the playbook and you’re like, “Oh, step one, step two, 2-A, 2-B”, and so on. You will know exactly what to do.
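For readers who want something concrete, here is a minimal Python sketch of how a playbook with numbered steps and sub-steps (1, 2, 2-A, 2-B) could be modelled and printed as a checklist. It is an illustration only, not Mjolnir Security's playbook: the `Step` class, the helper names, and every action and contact below are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    label: str                 # e.g. "1", "2", "2-A"
    action: str                # what to do at this step
    contact: str = ""          # who to reach out to (placeholder role)
    substeps: List["Step"] = field(default_factory=list)


def flatten(steps):
    """Return steps in reading order: each step, then its sub-steps."""
    ordered = []
    for step in steps:
        ordered.append(step)
        ordered.extend(flatten(step.substeps))
    return ordered


def checklist(steps):
    """Render the playbook as one line per step."""
    return [
        f"Step {s.label}: {s.action} (contact: {s.contact})"
        for s in flatten(steps)
    ]


# Hypothetical ransomware playbook; all content is placeholder text.
RANSOMWARE_PLAYBOOK = [
    Step("1", "Isolate the affected machine from the network", "IT on-call"),
    Step("2", "Notify the incident response lead", "IR lead", [
        Step("2-A", "Call the external IR provider", "IR provider hotline"),
        Step("2-B", "Inform legal and communications", "General counsel"),
    ]),
    Step("3", "Preserve logs and do not wipe the device", "IT on-call"),
]

if __name__ == "__main__":
    print("\n".join(checklist(RANSOMWARE_PLAYBOOK)))
```

Keeping the order and the contacts in one structure is the point: during an incident nobody has to decide what comes next, only to read the next line.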
The tabletop simulation takes the incident response playbook, puts everyone in the same room, and says, “Okay, C-suite just got informed that they clicked on ransomware. Now, what are you going to do?” Each team, based on the playbook, will react.
As part of this, we also have a service called Cyber War Gaming. For this, we involve the same people who take part in the tabletop and the playbook, and we have them connect to a virtual environment. We provide custom virtual machines on a VPN to connect to our lab, and we have actual people from the C-suite running actual malware.
The C-suite at that point becomes something like the red team. The rest of the staff that was invited to the tabletop also have their own virtual machines to connect to the same environment, and that's what they're using to then go and react to all the threats they're seeing.
How will they contain the threat? Who will they reach out to? Will they call the C-suite again?
Clients are happy they’re able to test this out in a simulated environment rather than their own production environment. They find out exactly what they need to fix in a simulation, rather than during an actual incident when they’re on fire.
That’s how you figure out how to get better.
The last offering on the proactive side will be the SOC as a service.
What about the reactive side?
Things go wrong, you’re on fire, you call us, and we'll jump right in. Our job is to identify where the threat actor is. If they're still in the network, we kick them out. Essentially, we find out what they have taken, how long they were there, and what specifically they accessed.
And why is this important?
If there's very limited client data that was potentially leaked or stolen, then they only have to notify those individuals versus notifying their entire client list, which also impacts their own branding and everything else as they’re trying to reach out.
Plus, when they offer ID monitoring services, it's going to cost them more money. Fewer people impacted means fewer notifications and lower costs.
Can you describe a few of the tools and technologies that Mjolnir Security would typically use when responding to an incident?
When we go in, the first task that we're trying to do is send logs out of the client environment and into our environment.
There's a tool called Sumo Logic, which is also the backbone of our entire infrastructure. We use that to deploy agents that will then collect all the logs that we need, specifically based on our requirements.
Once the logs are in our cloud, we'll pass them through our SOC, SIEM, and so on. We also use SOAR to increase the automation and to actually get our answers faster. Then, we will try to connect all the client infrastructure to our cloud infrastructure with Sumo Logic directly, so we can pull the logs from there as well.
Any agent we deploy will collect logs from that point onward, but not the historical ones. One of the tools that we use here is called Magnet Forensics. It's a forensics tool that allows us to pull up to 90 days’ worth of logs from all the tools without much effort.
If you did that manually, this would take you three to four days per user. But if you do it through Magnet, it might take three to four hours per user.
On one end, we’ll have log analysis going on with Sumo Logic and all the other tools we have.
Then for the endpoint, we also use an EDR, MDR, or XDR based on what purpose we're using it for. Our preferred one is SentinelOne.
Typically, when we go into an organization, we'll tell them we want to deploy our own EDR tool and they will say they already have one. I always like to tell them at that point: if yours was working, you would not have had the need to call me.
We’ll go in and deploy SentinelOne, at least to have a second tool in there. Then as we start going through all the devices and logs, the moment we find the malware, we'll take it and send it to our malware analysis laboratory. We're completely running in the cloud for that—we’re actually running it on a Windows VM.
After the Windows VM runs it, we'll collect the memory dumps and all the other artifacts that have been generated, run them through another tool called Volatility, and then use string analysis to generate what is called a Yara rule. The Yara rule is like an antivirus signature.
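To make the signature idea concrete, the following is a minimal Python sketch, not Mjolnir Security's pipeline. It pulls printable strings out of a byte blob (standing in for a memory dump; Volatility is not invoked here) and wraps them in the text of a Yara rule. The function names, the sample bytes, the rule name, and the "2 of them" threshold are all assumptions made for illustration. In practice an analyst would keep only the strings that are distinctive to the malware.

```python
import re


def extract_strings(data: bytes, min_len: int = 6):
    """Return printable ASCII runs of at least min_len characters,
    similar in spirit to the Unix `strings` utility."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def escape(text: str) -> str:
    """Escape backslashes and double quotes for a Yara text string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_yara_rule(name: str, strings, threshold: int = 2) -> str:
    """Build rule text that matches when `threshold` of the strings appear."""
    if not strings:
        raise ValueError("a rule needs at least one string")
    threshold = min(threshold, len(strings))
    lines = [f"rule {name}", "{", "    strings:"]
    for i, s in enumerate(strings):
        lines.append(f'        $s{i} = "{escape(s)}" ascii wide')
    lines += ["    condition:", f"        {threshold} of them", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up bytes standing in for a memory dump of a detonated sample.
    sample = b"\x00\x01evil-c2.example.invalid\x00\x02MutexName_demo_123\x00ab"
    print(build_yara_rule("Demo_Sample", extract_strings(sample)))
```

The generated text can then be loaded by any scanner that accepts Yara rules, which is what makes it comparable to an antivirus signature.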
We'll take this Yara rule and the hashes and other artifacts generated, put them in SentinelOne on one end, and then we have another tool called Thor. In Thor, we'll take the Yara rule and all the hashes and so on, and we'll deploy those in the client environment to again scan all the devices.
EDR tools will take their time because they're doing a thorough and comprehensive analysis of the hard disk and every file out there. With Thor, we’re specifically hunting for the malware file.
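The difference between a broad scan and a targeted hunt can be illustrated with a simple hash comparison. This is a sketch under stated assumptions, not a description of how Thor or SentinelOne work internally: it walks a directory tree and reports only the files whose SHA-256 digest appears in a list of known-bad hashes. The function names and the choice of SHA-256 are assumptions for the example.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hunt(root, bad_hashes):
    """Return paths under root whose SHA-256 is in bad_hashes."""
    bad = {h.lower() for h in bad_hashes}
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                if sha256_of(path) in bad:
                    hits.append(path)
            except OSError:
                continue  # unreadable file: skip it and keep hunting
    return hits
```

A call such as `hunt("/some/folder", known_bad_hashes)` answers one narrow question, "is this specific file here?", which is why a targeted hunt can finish sooner than a scan that inspects every file in depth.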
That sounds like it would take a lot of time. How long does the entire process typically take?
We've got a lot of tools with integrations and automations built in. At this point we have done more than 500 IRs and we have seen every technology stack that exists and how it can fail in every spectacular fashion.
So, after going through each IR, we built a procedure, or what we call a parser, for every tool, every technology, everything that can exist, and now we have an automation for all of them.
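One common way to organize per-tool parsers is a registry that maps each log source to a function producing a shared record shape. The sketch below is an assumption about how such a setup could look, not Mjolnir Security's code. The source names `tool_a_json` and `tool_b_kv`, their log formats, and the output fields are all hypothetical.

```python
import json

PARSERS = {}  # maps a source name to its parser function


def parser(source):
    """Decorator that registers a parser for one log source."""
    def register(func):
        PARSERS[source] = func
        return func
    return register


@parser("tool_a_json")
def parse_tool_a(line):
    # Hypothetical tool that emits one JSON object per line.
    rec = json.loads(line)
    return {"time": rec["ts"], "user": rec["user"], "action": rec["event"]}


@parser("tool_b_kv")
def parse_tool_b(line):
    # Hypothetical tool that emits space-separated key=value pairs.
    fields = dict(pair.split("=", 1) for pair in line.split())
    return {"time": fields["time"], "user": fields["usr"], "action": fields["act"]}


def normalize(source, line):
    """Turn a raw log line into the shared record shape."""
    if source not in PARSERS:
        raise ValueError(f"no parser registered for {source!r}")
    return PARSERS[source](line)
```

Once every source lands in the same shape, the downstream analysis can be written once and reused across clients, and supporting a new technology means adding one more parser function.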
Now when we do IRs, it takes us almost no time. Our fastest IR, from getting in to identifying how the compromise happened, took three and a half hours, and they had more than 5,000 computers.
Is almost everything automated for you guys, or are there any manual processes left?
Report writing. That’s completely manual.
Any IR report we make for a client will start with the technical people. Then, it will go to the liaison, who will confirm that this is exactly what has happened. Eventually, it will come to my table, I’ll review it, and I’ll ask questions or underline parts and say “this isn’t right” or “explain this better”.
The quality of the report determines how good we are to the client. It’s essentially my name because, as the owner of the company, if we are sending our report, it is something that I’ve certified to be true.
What was one of the most challenging incidents Mjolnir Security has responded to?
One was over Covid, which actually helped us improve a lot of our remote capabilities. The client was spread over 26 countries. Imagine Covid, where there is a lockdown happening, and we have to support 26 countries’ worth of locations.
We had to figure out how to get forensic clones of devices that have dial-up Internet. We’re looking at remote sites with barely any working satellite Internet. We had to talk to people who are essentially mechanics, technicians, and other non-computer folks to help us get the hard drives that we need, ship them new hard drives, and configure things.
Half our team was only deployed on IT support, just helping the client have better IT. They had close to 100 locations and they had four people in IT, so there’s no way they could have done it on their own. The rest of us were trying to figure out how to do remote data acquisition and all the other things.
The problem with that client was that some of the computers were last patched for the Y2K bug.
Then, some people at a particular site location got concerned after watching Terminator that the computers may become self-aware. They actually wrapped chains around the monitor. The monitor and the CPU were separate and the computer was connected to a lathe machine and it was running Windows 98.
They had a network that was so flat that, if you’re a contractor and you were connected to what is called the DMZ, you could connect to any other device on the network. By default, every user was a local admin, including the contractors, and that privilege also allowed them to access any servers on the network.
How this company survived, we don’t know. And how they did not get impacted earlier, we also don’t know.
To conclude things, what are some of the best practices or recommendations that you would give to an organization to avoid these situations?
At the bare minimum, I recommend having MFA on everything. Some people have it on email, and some people just don’t want to use MFA. We have seen that on multiple IRs: people, especially at the C-suite level, will not set up MFA because it’s inconvenient. They’ll skip out on things and it becomes a bit of a problem.
The second one I’ll recommend is having some kind of logging mechanism—centralized logging, for that matter. Even if it’s IT-related issues, if you have centralized logging, you can go and check on who did what. If somebody got logged out, if somebody clicked on something, you’ll have some way of identifying all those things.
The third is to have a layered approach for security. What that means is you have an EDR, antivirus, anything on endpoints; you have something on the network; and then you have some kind of firewall that inspects traffic.
Thank you very much, Milind, for taking the time to speak to iPSS about Mjolnir Security, the proactive and reactive sides of incident response, and everything you offer in relation to this critical service.
Be sure to visit their website or follow them on LinkedIn to stay updated with all the latest happenings at Mjolnir Security.