The challenge of machine name enrichment

This is the third of six articles doing deep dives on the enrichments initially performed for incident triage. The first article was about enriching IP addresses. The second article (which was supposed to be the third in the series) was about enriching domain names and URLs. Today’s article addresses the topic I skipped: enriching machine names (instances).

To catch up anyone who missed the first article, and as a reminder to those who read it, analyzing a potential security incident requires analysts to perform three overarching investigative steps: 1) know what questions to ask; 2) know how to get the answers to those questions; and 3) know what to do with the answers returned from their discovery. We also discussed “artifacts”: those elements that most likely need additional information to determine if an event is malicious or benign. Finally, we defined “features”: the answers to questions, which are used later to make triage and remediation decisions.

In the first article, we referred to the ubiquity of IP addresses, but the challenge with addresses is that humans remember names better than numbers. So nearly every system will have a meaningful or memorable name to help humans recognize it. After all, the meaningful name “svr_accounting” or the memorable name “Asgard” is easier to remember than 24.53.226.227, or even worse, the IPv6 address 2001:db8:a0b:12f0::1.

Not every logging or monitoring system will capture machine names in the event, but having that name significantly helps humans process the relationships between the devices involved. As part of the enrichment of IP addresses, one of the questions we tried to answer was “What is the name associated with this address?” For those times you *do* have machine names, whether logged directly in the event or determined from IP enrichment, we are ready to perform enrichment on the machine name/instance.

And to help people avoid the same mistake I made, I feel obligated to provide a qualifier.

I confused myself in my previous article when I started by describing enrichment of machine names and ended by describing enrichment of domain names. To help you avoid that same confusion, the distinction I make is a rather personal and somewhat arbitrary one: the protocol used to interact with the device. With domains/URLs, the names of the computers involved are resolved by internet-accessible DNS servers. Most of that enrichment activity is performed on domain names we have no control over (e.g. - we didn’t register the domain name). By contrast, when referring to machine names, I am largely considering machines internal to our network: those whose names are resolved by internal DNS servers, WINS servers, or network broadcasts. I make this distinction because the enrichment questions are quite different for each.

Recap and qualifiers made, let’s get started on enriching machine names…

ISSUE 1) What features do I need to extract? (What questions do I ask?)

When looking at the machine names associated with an alert, there is often significant overlap with the enrichments performed on IP addresses. Regardless, the analyst’s journey begins with asking “what do I need to know about this machine name?”

  1. Is the associated computer under our authority or is it outside our control?
  2. If the machine is under our authority, is it managed?
     a) Does this machine follow the expected naming convention used by my company?
     b) Is the computer name registered in my company's CMDB?
     c) Is the computer name assigned in the cloud infrastructure?
     d) By name, what is the patch history?
     e) By name, does the computer have unresolved vulnerabilities?
     f) What is the value of the associated asset? Is it a sensitive device? Does it contain access to confidential information?
     g) Is the name dynamic? (e.g. - a temporary instance from a “golden image”)
  3. If the computer name is not under your authority, what can be known about it?
     a) Is the device network connected inside your perimeter?
     b) Is it connected to your guest wireless or corporate wireless?
     c) Is it connected to the company LAN wiring?
     d) Can anything be known about the type of device?
     e) What network services (if any) does the device host? (e.g. - what network ports are open?)
  4. Is the machine name (asset) trusted?
     a) Have I observed previous trusted activity from this same machine?
     b) Is the asset assigned to a trusted party (e.g. - my red team)?
  5. Have I seen previous notable activity from this asset?
     a) Authentications?
     b) System config changes?
     c) Malicious activity/alerts?
     d) Is the computer name in previous cases?
     e) What is the volume and frequency and schedule of previous activity?
     f) Any communications (successful or unsuccessful) with external systems? Internal systems?
  6. Any other associated activity?
     a) Users observed interacting with this asset?
     b) What are the IP addresses returned when resolving this computer name?
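
Some of these questions are easy to turn into repeatable checks. As a minimal sketch of question 2a, the snippet below tests a machine name against a naming convention. The convention itself (a three-letter site code, a role, and a three-digit sequence number, such as `dal-acct-001`) is purely an assumption for illustration, as are the names `NAMING_PATTERN` and `check_naming_convention`; substitute whatever pattern your company actually uses.

```python
import re

# Hypothetical convention: <3-letter site>-<role>-<3-digit sequence>,
# e.g. "dal-acct-001". This pattern is an assumption, not a standard.
NAMING_PATTERN = re.compile(
    r"^(?P<site>[a-z]{3})-(?P<role>[a-z]{2,8})-(?P<seq>\d{3})$"
)


def check_naming_convention(machine_name: str) -> dict:
    """Return a small feature record saying whether the name fits the convention."""
    # Drop any DNS suffix and normalise case before matching.
    short_name = machine_name.split(".")[0].lower()
    match = NAMING_PATTERN.match(short_name)
    if match is None:
        return {"name": short_name, "follows_convention": False}
    # Include the parsed parts so later steps can use the site and role.
    return {"name": short_name, "follows_convention": True, **match.groupdict()}


if __name__ == "__main__":
    print(check_naming_convention("DAL-ACCT-001.corp.example.com"))
    print(check_naming_convention("Asgard"))
```

A name that fails the check is not automatically malicious; it is simply one more feature for the analyst to weigh alongside the CMDB and cloud inventory answers.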


ISSUE 2) Where do I look to answer these questions? How do I find the answers?

In the first article, we discussed the distance between knowing WHAT you need and knowing WHERE and HOW to get it. We used the first question as our sample to illustrate this principle. Since we looked at question 1 already, and question 2 is rather long for this article, let’s explore question 3 and demonstrate how an analyst can know WHERE and HOW to get the information.

Where is the answer? What can we learn about computer names not under our authority?

Remember the distinction I am making between domain names and machine names- domain names are resolved to IP addresses by internet accessible (external) DNS servers; computer names are resolved internally.

The fact that we are looking at computer names (which are therefore resolvable by internal systems) means the device has access to your internal network. For this to happen, it is either on the wireless or the wired network. If it is on the wireless, it could be on the guest wireless network or the corporate wireless network.

If you have NAC (Network Access Control) technology deployed, this is the first and preferred place to check how the machine connected to the network. Without NAC, the problem is more difficult but still solvable by working with the right team: in this case, your internal network/IT staff or network architects. Someone in the company should understand the IP address ranges used and how they are segmented and allocated. Most often, and ideally, there will be separate subnets allocated for each of the segments. By working with the team that manages this architecture, we can identify which subnets allow which type of connectivity. The ideal time to do that is before you need the knowledge, not in the middle of a security incident.
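
Once the network team has told you which subnets serve which segment, that knowledge can be captured in a small lookup. The sketch below uses Python's standard `ipaddress` module; the segment names and address ranges in `SEGMENTS` are made-up placeholders, and `classify_segment` is a hypothetical helper name.

```python
import ipaddress

# Hypothetical segment map. The real ranges come from your network team.
SEGMENTS = {
    "corporate_wired": ipaddress.ip_network("10.10.0.0/16"),
    "corporate_wireless": ipaddress.ip_network("10.20.0.0/16"),
    "guest_wireless": ipaddress.ip_network("192.168.50.0/24"),
}


def classify_segment(ip: str) -> str:
    """Return the name of the segment containing the address, or 'unknown'."""
    address = ipaddress.ip_address(ip)
    for name, network in SEGMENTS.items():
        if address in network:
            return name
    return "unknown"


if __name__ == "__main__":
    for ip in ("10.20.4.7", "192.168.50.23", "172.16.0.1"):
        print(ip, "->", classify_segment(ip))
```

An "unknown" result is itself worth noting in the case: it means the address falls outside every range the network team documented.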

Finally, learning about the type of device can be a craft unto itself. One thing I have done in the past is to use DHCP logs to obtain the MAC address assigned. From there, I can use lookup tables to determine the manufacturer. Also, since the computer name is internal to my network, I have fewer reservations about performing a rapid scan of common ports to identify open network ports.
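
The MAC-to-manufacturer step can be sketched as follows. The first three bytes of a MAC address (the OUI) identify the vendor, so the lookup is a dictionary keyed on those six hex digits. The two entries in `OUI_TABLE` are illustrative placeholders, not real vendor assignments; in practice you would load a copy of the IEEE registry. The function names are hypothetical. The sketch also flags locally administered addresses (the second-lowest bit of the first byte is set), since these are not vendor-assigned and will never resolve to a manufacturer.

```python
import re

# Illustrative placeholder entries only; load the IEEE OUI registry in practice.
OUI_TABLE = {
    "001122": "Example Camera Inc.",
    "F0F1F2": "Example Printer Co.",
}


def normalize_mac(mac: str) -> str:
    """Strip separators and upper-case a MAC address; raise ValueError if malformed."""
    hex_digits = re.sub(r"[^0-9A-Fa-f]", "", mac).upper()
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return hex_digits


def lookup_manufacturer(mac: str) -> str:
    """Map a MAC address to a manufacturer name using its OUI prefix."""
    digits = normalize_mac(mac)
    # Bit 0x02 of the first byte marks a locally administered address,
    # which has no vendor assignment to look up.
    if int(digits[:2], 16) & 0x02:
        return "locally administered"
    return OUI_TABLE.get(digits[:6], "unknown")


if __name__ == "__main__":
    for mac in ("00-11-22-01-02-03", "f0f1.f2aa.bbcc", "02:00:00:00:00:01"):
        print(mac, "->", lookup_manufacturer(mac))
```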

Interacting with security architects is often a bridge too far for junior security analysts. You’ll likely see more success asking a senior security analyst to get the information and document it in a playbook for level 1 (L1) analysts to use. Similarly, I would not ask a junior analyst to scan an internal device until they have been trained and can follow documented processes in a playbook. These playbooks, if followed, ensure consistent evaluation of each case. The most mature organizations will have defined automations to answer these questions, saving investigation time and ensuring consistent adherence to the playbook. After the features are collected, the case is ready for analysis by a level 2 (L2) analyst.


ISSUE 3) What to do with the answers from discovery?

With enrichments performed from the first two steps, it is time to aggregate and correlate the features (the answers to the questions investigated) to determine if the activity is malicious and warrants remediation, if it is benign and can safely be ignored, or if it is a false positive requiring tuning of the detection or alerting content.

To demonstrate, we’ll continue to use enrichment block #3 as an example. Let’s use the answers to these questions (features) to make meaningful decisions.

For this article, I’ll start by presuming the device is inside the perimeter, since the name is being resolved by internal systems. In real-world application, assumptions can be dangerous and have costly consequences for incident response.

If the device is attached to the corporate network (wired or wireless), follow-up questions for the level 2 analyst might include:

  • Is the observed device a misconfigured device that is not running the expected corporate software?
  • Is it a new device that just hasn’t had setup completed yet?
  • Did a trusted user manage to connect a rogue device to the corporate network (e.g. - their iPhone)?
  • Is the observed device managed by a third party or vendor (e.g. - network printers, security cameras, etc.)?
  • Is the observed device an IoT device (e.g. - thermostat, lightbulb, etc.)?
  • Does the observed device give additional ingress access (e.g. - did someone connect a wireless access point to your physical LAN)?
  • Does it allow a path for exfiltration? (Is it a network attached storage device? Personal printer?)

If the device is on the guest wireless, the L2 analyst should look for any indicators of communications to or from devices on the corporate network. The only communication I expect and want to see on the guest wireless network is outbound to the internet.
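
The guest wireless check can be sketched as a filter over flow records. This assumes you can export flows as (source IP, destination IP) pairs from whatever firewall or NetFlow tooling you have; the `GUEST_NET` and `CORPORATE_NETS` ranges and the function name are hypothetical.

```python
import ipaddress

# Hypothetical ranges. Replace with the subnets your network team documents.
GUEST_NET = ipaddress.ip_network("192.168.50.0/24")
CORPORATE_NETS = [ipaddress.ip_network("10.0.0.0/8")]


def is_corporate(address) -> bool:
    """True if the address falls inside any corporate range."""
    return any(address in network for network in CORPORATE_NETS)


def guest_to_corporate_flows(flows):
    """Return the (src, dst) pairs that cross between guest and corporate networks."""
    suspicious = []
    for src, dst in flows:
        src_addr = ipaddress.ip_address(src)
        dst_addr = ipaddress.ip_address(dst)
        guest_to_corp = src_addr in GUEST_NET and is_corporate(dst_addr)
        corp_to_guest = is_corporate(src_addr) and dst_addr in GUEST_NET
        if guest_to_corp or corp_to_guest:
            suspicious.append((src, dst))
    return suspicious


if __name__ == "__main__":
    sample = [
        ("192.168.50.10", "8.8.8.8"),      # guest to internet: expected
        ("192.168.50.10", "10.1.2.3"),     # guest to corporate: flag
        ("10.9.9.9", "192.168.50.77"),     # corporate to guest: flag
    ]
    print(guest_to_corporate_flows(sample))
```

Anything this returns is a lead for the L2 analyst rather than a verdict; a flagged flow may still turn out to be a blocked attempt or a documented exception.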

Answering these questions generally requires more investigation than an L1 analyst is ready to take on. I realize this breaks the pattern of asking L1 analysts to collect features and L2 analysts to interpret and make decisions on those features, so I wouldn’t argue strongly if someone pushed these down for L1 analysts to answer.

Correlating all of these features to make an escalation/remediation decision comes largely from experience. To be succinct, the ideal outcome of this step is the ability to explain WHY. Why did this activity occur? Why is it okay to ignore this activity? Why is this activity potentially malicious, and why does it therefore need to interrupt the business processes of others? This is not to say that we can presume motive, but that we should have justification for the conclusions we reach.


LET’S TALK SOLUTIONS NOW

So we’ve looked at three issues:

1) Knowing what to ask.

2) Knowing how to get the answers.

3) Knowing what to do with the answers you found.

The first two issues can be significantly improved through documented playbooks. Once you have this documentation, you will need to train security analysts and develop processes to ensure the playbooks are being followed. If time and resources permit, you might even use automation to ensure a consistent investigation is performed and delivered for making security decisions. Finally, you will need to allocate resources (or build the work into your ongoing processes) to maintain this documentation so it does not become outdated.

The third issue is directly correlated to experience. Experience here is built on a stool of three legs: 1) near real time feedback on the decisions made (Case QA: was a correct decision made?); 2) recalling that feedback in the future when sufficiently similar scenarios are encountered; and 3) identifying and correlating the attributes/features that would make the outcome this time close to the previous decisions. To address this, develop a way for L2 analysts to readily identify historically similar activity and summarize that knowledge in a distilled presentation, so new L2 analysts can benefit from the previous experiences of other analysts.
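
One simple way to surface historically similar activity is to compare the set of features on the current case with the feature sets of closed cases. The sketch below uses Jaccard similarity (shared features divided by total distinct features), which is my choice for illustration and only one of many possible measures. The case IDs, feature labels, and function names are all hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar_cases(current: set, history: dict, top_n: int = 3) -> list:
    """Rank past cases by feature overlap with the current case.

    history maps a case ID to the set of feature labels recorded for it.
    Cases with no overlap at all are dropped from the result.
    """
    scored = [(jaccard(current, features), case_id)
              for case_id, features in history.items()]
    # Highest score first; ties broken by case ID for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(case_id, round(score, 2))
            for score, case_id in scored[:top_n] if score > 0]


if __name__ == "__main__":
    history = {
        "CASE-1": {"guest_wireless", "unknown_vendor", "port_445_open"},
        "CASE-2": {"corporate_wired", "in_cmdb"},
        "CASE-3": {"guest_wireless", "port_445_open"},
    }
    current = {"guest_wireless", "port_445_open", "talks_to_corporate"}
    print(most_similar_cases(current, history))
```

Pairing each returned case ID with its final disposition and the QA feedback it received gives a new L2 analyst the distilled summary described above.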

All of these are complicated by employee churn. Your best case scenario is where L1 analysts advance to L2 and can leverage their previous experiences, but once they move to positions beyond L2, that knowledge is effectively lost. This “leakage” of knowledge becomes even more significant when L1 analysts leave before advancing to L2. There is also a constraint: the more knowledge is documented, the more incoming analysts need to read and learn to align with documented processes.


OR… DO ALL OF THIS WITH A VIRTUAL SOC ANALYST

We are building a virtual SOC analyst that can do all of this and more. We are seeking design partners that have observed these challenges in their own security team and want to see how our virtual analyst performs, and who can offer feedback about capabilities we need to create for their unique situation. To learn more, contact [email protected] or contact me on LinkedIn at https://www.dhirubhai.net/in/anthony-morris-securitypro/ or find me on Twitter @txhackertracker