Structured Logging - A Guide

Structured Logging - A Guide

What is Structured Logging?

Structured logging is the application of a consistent format to logs, allowing them to be treating as data sets which can be easily searched and analysed, giving insights into application behaviour.

Structured vs. Unstructured Logs

Unstructured logs have messages in plain text containing all log data in a single string.

Structured logs record log events in a standardised, machine-readable format with key value pairs, usually in JSON format.

Structured Logging Example

Unstructured Logs in Rails

In the code:

job = ... 
Rails.logger.info("Performed #{job.class} in #{job.duration}")
        

In the logs:

[2022-04-01 14:35.0154] [Sidekiq] Performed Mailer::PasswordReset in 1436.451
        

Most traditional logs look like this. It stores unstructured data. Any log data in plain text can go in the message. It's not easily parsed by machines - we need regular expressions.

?? Unstructured logs are difficult to work with.

Structured Logs in Rails

In the code:

job = ...
Rails.logger.info(
  message: "Performed #{job.class} in #{job.duration}",
  event: {
    name: "app.job.performed",
  },
  code: {
    namespace: job.class,
  },
  duration: job.duration,
  messaging: {
    active_job: {
      adapter: {
        name: job.adapter
      }
    }
  }
)
        

This outputs the following JSON in the logs:

{
  "level": "info",
  "message": "Performed Mailer::PasswordReset in 1436.451",
  "event": {
    "name": "app.job.performed"
  },
  "code": {
    "namespace": "Mailer::PasswordReset",
  },
  "duration": 1436.451,
  "messaging": {
    "active_job": {
      "adapter": {
        "name": "sidekiq"
      }
    }
  }
}
        

The code for structured logging looks more unfamiliar. Note how the log message is separated. We get contextual information. We can detect patterns in our logs. The log data has a consistent format in key value pairs.

?? Structured logs are easy to parse, search and aggregate.

What can I do with Structured Logs vs Unstructured Logs?

Structured logs allow engineers to search and aggregate more easily.

What use cases do structured logs support?

Use Case: Search for exact match of single field

Example: All Mailer::DeliverPasswordReset jobs

Structured Logs Difficulty: Easy

Unstructured Logs Difficulty: Easy

Use Case: Search for substring match of single field

Example: All Mailer::* jobs

Structured Logs Difficulty: Easy

Unstructured Logs Difficulty: Medium

Use Case: Search for exact match of two fields

Example: All jobs performed for Mailer::DeliverPasswordReset jobs

Structured Logs Difficulty: Easy

Unstructured Logs Difficulty: Medium

Use Case: Search for equality conditions

Example: Jobs performed with a duration of over 1 second

Structured Logs Difficulty: Easy

Unstructured Logs Difficulty: Impossible

Use Case: Search for aggregations - average, minimum, maximum, P99, P95, P90

Example: Jobs performed with a P95 over 3 seconds

Structured Logs Difficulty: Medium

Unstructured Logs Difficulty: Impossible

Use Case: Group by a field

Example: Graph duration of jobs grouped by job class

Structured Logs Difficulty: Medium

Unstructured Logs Difficulty: Impossible

Benefits of Structured Logs

Structured logging has allowed me to:

  • Analyse and understand my application behaviour
  • Resolve production incidents quickly
  • Supply critical business intelligence to other departments
  • Aggregate performance data - easily graph percentiles, averages, minimum and maximums
  • Create visuals like graphs inside my observability tool
  • Diagnose problems 10-100 times faster than with plain text log data

Structured Logging Best Practices

Here's what's worked for me over the years:

  • Select a library that supports structured logging and integrates with your web framework
  • Establish a consistent format for logs across all services - OpenTelemetry is valuable for this
  • Use unique identifiers to trace requests across multiple services
  • Convert unstructured logs outside the application into structured logs (apache, nginx, cloud provider)

Structured Logging Libraries

Depending on the tool and framework, here are some logging frameworks that implement structured logging in your software system:

Javascript - winston, pino, bunyan

C# - serilog, nlog

Java - log4j, tinylog

Python - contextual_logger

Ruby - semantic_logger

Implementing Structured Logging

  1. Set a specific goal - example - "why are background jobs failing?"
  2. Initial goal - focus on migrating existing unstructured logs to a structured format
  3. Deal with existing dependencies - do engineers use unstructured logs or are there other systems that depend on this format?
  4. Select a library that supports structured logging
  5. Patch framework to use the structured logging library chosen above
  6. Test logs locally - keep existing format so as not to confuse other engineers
  7. Integrate and test with observability tool in staging environment - are structured logs coming through? Are they being parsed correctly?
  8. Deploy changes to the log data
  9. Use structured logs and list defects - focus on the specific goal - what’s missing or not in the right format?
  10. Improve structured logs and redeploy

Use Cases for Structured Logging

Structured logging is great for debugging, troubleshooting, monitoring application performance, auditing security and compliance requirements, and more.

It is particularly useful in complex distributed systems where the log entry needs to be standardized and correlated.

Structured logs can be used to feed log data into a log management tool that allows for querying, correlating, and visualizing log data across all dimensions.

Working with Structured Log Data

Use a log management tool that allows JSON data to be sent.

For unstructured text data from other systems use log parsing tools to transform into structured data with a standard format.

Establish a unified schema for attribute names and values across all log sources to facilitate seamless correlation and cross-analysis of log data.

Overcoming Common Challenges

One of the main challenges of structured logging is ensuring all application logs are standardized and use the same structured format. For this I'd suggest OpenTelemetry's semantic conventions.

Another challenge is the potential performance overhead due to the additional formatting, but modern logging libraries are highly optimized.

Existing unstructured logs from systems with no control over the logging framework can be difficult to work with. We can reformat into a consistent log format using an ingestion pipeline or custom scripts.

Real-World Example of Structured Logging

We received a Slack alert - "log volume is projected to breach quota".

This was a monitor we'd set up in our observability tool to stop cost overages. A lot of structured data was being sent into our logs. That's strange.

Question - when are the number of structured logs spiking?

Looking into log analysis tool, we could see a huge spike in log levels at 3:17pm.

Question - what's generating those logs?

Grouping the structured logs by event name, we see the majority of these events are job.performed.

Question - what jobs are being performed?

Filtering by event name of job.performed, then grouping by job class we see the culprit - GeoDecodingJob

Question - why are there a surge of GeoDecodingJobs?

Graphing the number of retries of GeoDecodingJobs, we can see that these jobs were repeatedly being retried and all the retries were mounting up

Question - why were the jobs retrying?

The API was down for an hour and so all our jobs were retrying.

Action - tweak exponential backoff algorithm of the background job to be more lazy so the same number of jobs wouldn't mount up in the same way again.

Fixed!

With unstructured log data we wouldn't have been able to answer the first question, let alone the others.

Unlike unstructured logs, any data we've recorded in our instrumentation is available to search, sort, filter and group on.

Conclusion

Structured logging is part of essential best practices for any team that wants to improve their log management and analysis capabilities.

By following best practices and using the right tools, companies can overcome common challenges and reap the benefits of structured logging.

With structured logs, I've gained valuable insights into the performance and health of my application, and I've made data-driven decisions to help the software development process.

要查看或添加评论,请登录

John Gallagher的更多文章

社区洞察

其他会员也浏览了