A Short Intro to Logging in the Cloud

Logging is the systematic recording of events in an IT environment. It is the foundation for proactively identifying issues and for analyzing what went wrong, both operationally and from a security standpoint. Systematic logging is therefore essential for IT security: if security teams are blind to certain components or events, they can neither detect ongoing attacks nor fully understand the harm and manipulations that malware has caused within their IT environments.

Unlike metrics, which provide aggregated values (e.g., server utilization) for quick decision-making, logs offer detailed records of individual events. For instance, a metric might count the number of failed logins, whereas the logs document each event individually: the five failed logins coming from suspicious foreign countries as well as the thousands of routine daily authentication events.
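The difference can be made concrete with a minimal sketch. The event records and field names below are hypothetical and chosen only for illustration; real authentication logs carry many more fields.

```python
from collections import Counter

# Hypothetical authentication events; the field names are illustrative only.
events = [
    {"user": "alice", "country": "CH", "success": True},
    {"user": "bob", "country": "CH", "success": False},
    {"user": "bob", "country": "XX", "success": False},
    {"user": "carol", "country": "XX", "success": False},
]

# Metric: one aggregated value, good for a quick overview.
failed_login_count = sum(1 for e in events if not e["success"])

# Log view: the individual events stay available, so details such as the
# country of origin of each failed login can still be analyzed.
failed_by_country = Counter(e["country"] for e in events if not e["success"])
```

The metric answers "how many?", while only the underlying log records can answer "who, from where, and when?".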

Different teams need logs for various purposes:

  • Application teams want to detect performance issues before they escalate and need logs for troubleshooting, e.g., to analyze application crashes or malfunctions.
  • Infrastructure and middleware teams rely on logs to monitor system performance and, in particular, to assist application teams with troubleshooting unusual middleware issues.
  • Audit and compliance functions need logs as evidence. Many enterprises must track system changes, code modifications, data access, and financial transactions (e.g., determining who approved a payment). Nearly all teams must therefore record such data in their logs to avoid non-compliance findings by internal or external auditors.

From a security perspective, logs serve two purposes:

  1. Threat Detection – Identifying ongoing attacks, such as port scans or spikes in failed login attempts.
  2. Incident Response & Forensics – Investigating successful attacks to contain them, clean up the damage comprehensively, or support criminal investigations.

These two purposes have different implications for log logistics. Threat detection should take place in (near) real time. The relevant logs are the raw data that must be sent to Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) tools, which analyze them and might respond automatically or alert the security operations center (SOC).
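As an illustration of near-real-time threat detection, the following is a minimal sketch of a sliding-window check for spikes in failed login attempts. The class name, threshold, and window length are assumptions made for this example; it is not how any particular SIEM implements detection.

```python
from collections import deque


class FailedLoginSpikeDetector:
    """Flags a spike when more than `threshold` failures fall within the window."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()  # timestamps of recent failed logins

    def observe(self, timestamp: float) -> bool:
        """Record one failed login; return True if the window exceeds the threshold."""
        self._timestamps.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold
```

Timestamps are assumed to arrive in ascending order, as they would when events are streamed as they occur. A production rule would typically also group failures by account or source address.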

SIEM costs often depend directly on the volume of data stored and processed. Pumping all log data into a SIEM is therefore not an option, even though the more data a SIEM has, the more effectively security analysts can work on threats. So, while security organizations must ensure that all logs needed for threat detection and a first analysis are in the SIEM, they must also ensure that all potentially relevant log data needed for incident investigations is kept in a secure location, be it locally with the services and components producing the logs or in a consolidated log storage. Given the massive amounts of data and the cost implications, it is crucial to prevent duplication of log data. Another standard practice for reducing the amount of data in a SIEM is to set shorter retention periods.
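One way to picture this split is a routing rule that assigns each log record to exactly one destination. The sketch below is only an assumption about how such a rule could look; the log type names and the two destinations are hypothetical, and which types actually belong in the SIEM is a policy decision for each organization.

```python
# Assumption: which log types feed threat detection is a policy decision;
# the names in this set are illustrative and not taken from any product.
SIEM_LOG_TYPES = {"authentication", "network_perimeter", "control_plane"}


def route_log(log_type: str) -> str:
    """Return the single destination for a log record of the given type."""
    if log_type in SIEM_LOG_TYPES:
        return "siem"
    # Everything else stays in cheaper storage for later investigations.
    return "archive"
```

Because every record has exactly one destination, the same data is not stored and paid for twice.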

From a security perspective, the following log types are especially relevant:

  • Authentication Logs track failed and successful login attempts to detect brute-force attacks and compromised accounts. The logs should contain activities related to regular employee accounts, privileged admin accounts, and external users (e.g., customers and partners). In addition, logging the creation of highly privileged accounts can help detect attackers attempting to establish persistent access and backdoors.
  • Network Traffic Logs monitor connections and traffic between systems and network components and track cross-region traffic in the cloud. Logs from components at the network perimeter, such as firewalls and proxies, are particularly important.
  • System and Resource Logs record events on virtual machines, Kubernetes clusters, and cloud services and document access to them.
  • Platform Logs cover the management and organization of the cloud platforms.


Figure 1: Log Landscape for Security-related Events

For (resource) logs, understanding the differences between control plane and data plane logs is vital:

  • Control Plane Logs capture events related to the creation or modification of resources (e.g., provisioning a new database). These events occur through the cloud provider's management APIs.
  • Data Plane Logs capture access and interaction with the customer's actual cloud resource (e.g., a query against a database).

It is essential to know which data must go to the SIEM for threat detection and in-depth analysis and which data is irrelevant. Deciding what to leave out matters especially for data plane logs: when every click on a button in a banking app writes 1 KB of log data, the costs for log storage alone explode.
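A quick back-of-the-envelope calculation shows how fast such volumes grow. The 1 KB per click comes from the example above; the number of clicks per day is a purely hypothetical assumption, and 1 KB is taken as 1,000 bytes.

```python
def yearly_log_volume_gb(
    events_per_day: int,
    bytes_per_event: int = 1_000,  # 1 KB per click, as in the example above
    days: int = 365,
) -> float:
    """Estimate the raw log volume in gigabytes accumulated over `days` days."""
    return events_per_day * bytes_per_event * days / 1e9


# Hypothetical assumption: 50 million button clicks per day.
volume_gb = yearly_log_volume_gb(50_000_000)
print(f"{volume_gb:,.0f} GB per year")  # prints "18,250 GB per year"
```

Under these assumptions, a single class of data plane events already produces roughly 18 TB of raw logs per year, before any replication or indexing overhead.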

In short, logging is the cornerstone of cybersecurity, enabling organizations to detect threats and respond effectively to security incidents. While cloud platforms offer comprehensive logging capabilities, organizations must decide which logs to activate, where to store the log data, and which log data to forward to the company’s SIEM. Balancing security visibility against cost is central to any effective cloud logging strategy.

