Containerized Cloud Logging
Vinod Gupta
Sr Technologist | Azure, MicroServices, Event Sourcing & CQRS | .NET Core, C#, Java
1. Overview
As with any service, logging is a core component of Docker. Analyzing logs provides insight into the performance, stability, and reliability of containers and the Docker service itself. However, because of the flexible and dynamic nature of Docker, there’s no single approach to gathering and storing log events. Instead, we have a variety of solutions at our disposal, each with its own benefits and drawbacks.
2. Questions
- How can we log and monitor Docker effectively? This includes logging the Docker runtime infrastructure, the container itself and what goes on inside of it, and ensuring we collect log data from ephemeral containers.
- How can we use feedback from containers to manage and improve the quality of our services?
- Can we build from decades of experience logging monolithic applications, or do we have to start from scratch?
- If we must start from scratch, how can we build a solution that helps us make better decisions?
3. Traditional vs Centralized Logging
With traditional logging methods, we would choose from a variety of logging frameworks and then define a logging strategy in which each container (or service) logs independently of the other containers.
Alternatively, we can configure our containers to forward their logs to a central logging service. Each container still needs a way to generate logs, but a logging service becomes responsible for processing, storing, and forwarding them to a centralized destination such as Loggly.
4. Key Considerations When Logging in Docker
Although there are some similarities, container-based logging is still very different from traditional application-based logging. Below are a few things to keep in mind.
4.1 Containers Are Transient
Containers come and go. They start, they stop, they’re destroyed, and they’re rebuilt on a regular basis. Storing persistent application data inside of a container is an anti-pattern with Docker, since the data will be lost once the container is removed. While containers can store persistent data using volumes, the recommended solution is to export data (logs or otherwise) to a service that can store it long-term, whether it’s a folder on the local hard drive, Azure File Storage, Azure Blob Storage, or an Amazon S3 bucket. This way, you can stop and start your containers without compromising your data.
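As a minimal sketch of the "export to the host" idea, a host directory can be bind-mounted into a container so that log files outlive the container itself (the host path and image name here are illustrative assumptions):

```shell
# Bind-mount a host directory into the container for log output.
# /var/log/myapp on the host and the myapp:latest image are assumptions.
docker run -d \
  --name myapp \
  -v /var/log/myapp:/app/logs \
  myapp:latest

# Log files written to /app/logs inside the container now persist
# in /var/log/myapp on the host, even after the container is removed.
```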
4.2 Containers Are Multi-Tiered
Docker logging isn’t as simple as configuring a framework and running the container. Even the simplest Docker installation has at least three distinct levels of logging: the Docker container, the Docker service, and the host operating system (OS). As the infrastructure becomes more complex and more containers are deployed, we need a way to associate log events with specific processes rather than just their host containers. We can define custom tags for each log event as it passes through the container and later correlate those events to look up specific logs.
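One way to attach such tags is the Docker logging driver's `tag` option, which templates container metadata (name, ID, image) into every log line; the syslog address and image name below are assumptions:

```shell
# Tag each log event with the container name and ID so events can
# later be correlated back to the process that produced them.
docker run -d \
  --log-driver=syslog \
  --log-opt syslog-address=udp://127.0.0.1:514 \
  --log-opt tag="{{.Name}}/{{.ID}}" \
  myapp:latest
```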
4.3 Containers Are Complex
Docker is robust enough for many enterprises, but there are lingering security issues that have yet to be resolved. Compared to virtual machines, containers pose a much larger attack vector since they share the same kernel as the host. Some enterprises have worked around this by running Docker containers in a virtual environment, while some have taken the opposite approach by running virtual machines inside of Docker containers. Known as VM containers, these run just like normal containers except that they host a complete Kernel-based Virtual Machine (KVM) environment. This merges the flexibility of Docker containers with the security of virtual machines.
Unfortunately, both approaches come with drastic increases in logging complexity. Not only do we have to log the application, the Docker daemon, and the host OS, but we must also log the virtual machine and hypervisor. Logging, tagging, and associating all of these services is not just a feat of architectural engineering; it requires comprehensive solution development. It is important, though, to have the right logging strategy in place for whichever approach is chosen. Missing the opportunity to collect and aggregate the logs of one specific tier might prevent us from efficiently troubleshooting issues.
5. Methods of Logging in Docker
Like virtualization, containers add an extra layer between an application and the host OS. Logging Docker effectively means not only logging the application and the host OS, but also the Docker service.
5.1 Logging via the Application
This process is likely what most developers are familiar with. In this process, the application running inside the container handles its own logging using a logging framework. The application formats its logs and sends them to a remote destination. This is an easy and intuitive migration path for enterprises already using a logging framework in their existing applications. The logs are sent from the application to a remote centralized server, bypassing Docker and the OS. This gives developers the most control, but it also adds load to the application process.
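Since the application ships its own logs in this model, Docker's log collection can even be switched off for the container; a hedged sketch (the image name is an assumption):

```shell
# The application inside the container forwards its own logs via its
# logging framework, so Docker's logging driver is disabled entirely.
# myapp:latest is an illustrative image name.
docker run -d --log-driver=none myapp:latest
```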
5.2 Logging via Data Volumes
When dealing with Docker logs, there is one important caveat we must always keep in mind. Because containers are stateless by nature, any files created within a container will be lost when the container is removed. Instead, containers must either forward log events to a centralized logging service (such as Loggly) or store log events in a data volume.
With a data volume, we can store long-term data in our containers by mapping a directory in the container to a directory on the host machine. We can also share a single volume across multiple containers to centralize logging across multiple services. However, data volumes make it difficult to move these containers to different hosts without potentially losing log data.
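The shared-volume idea can be sketched with a named volume mounted into multiple containers (the volume, container, and image names are assumptions):

```shell
# Create a named volume and share it between two services so their
# log files land in one place.
docker volume create app-logs
docker run -d --name service-a -v app-logs:/var/log/app service-a:latest
docker run -d --name service-b -v app-logs:/var/log/app service-b:latest

# Inspect where Docker stores the volume's data on the host.
docker volume inspect app-logs
```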
5.2.1 When Should I Log via Data Volumes?
Data volumes are effective for centralizing and storing logs over an extended period. Because they link to a directory on the host machine, data volumes significantly reduce the chances of data loss due to a failed container. Because the data is now available to the host machine, we can make copies, perform backups, or even access the logs from other containers.
5.3 Logging via the Docker Logging Driver
a. One option is to forward log events from each container to the Docker service, which then sends the events to a syslog instance running on the host.
Note: With Loggly in place, we accomplish this by changing the Docker logging driver to log to syslog and then use the Configure-Syslog script to forward the events to Loggly.
b. Another option is to have the application forward its logs to a container dedicated solely to logging. That container, rather than the host OS, becomes responsible for forwarding each event to the right destination.
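Option (a) can be sketched by switching the logging driver to syslog, either per container or daemon-wide; the syslog address and image name are assumptions:

```shell
# Per-container: send this container's stdout/stderr to the host's syslog.
docker run -d \
  --log-driver=syslog \
  --log-opt syslog-address=udp://127.0.0.1:514 \
  myapp:latest

# Daemon-wide alternative: set the default driver in /etc/docker/daemon.json
#   {
#     "log-driver": "syslog",
#     "log-opts": { "syslog-address": "udp://127.0.0.1:514" }
#   }
# then restart the Docker service so the new default takes effect.
```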
5.3.1 When Should I Log via the Docker Logging Driver?
Unlike data volumes, the Docker logging driver reads log events directly from the container’s stdout and stderr output. This lets us quickly and effectively centralize our container logs by using just the Docker service. The benefit is that our containers will no longer need to write to and read from log files, resulting in a performance gain. Additionally, since log events are stored in the host machine’s syslog, they can be easily routed to Loggly.
5.4 Logging via a Dedicated Logging Container
While the two previous methods have several advantages, they share a common disadvantage: They rely on a service running on the host machine. Dedicated logging containers, on the other hand, let you manage logging from within the Docker environment. Dedicated logging containers can retrieve log events from other containers, aggregate them, then store or forward the events to a third-party service. This approach is more aligned with the microservices architecture since it eliminates your containers’ dependencies on the host machine without hindering your logging capabilities.
Dedicated logging containers can manage logs for specific containers, or they can act as a “log vacuum” for multiple containers. For example:
a. Option one - the Logspout container automatically captures stdout output from any containers running on the same host and forwards it to a remote syslog service.
b. Option two - we can use Joseph Feeney’s Logspout-Loggly container to send events from other containers directly to Loggly.
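Logspout itself runs as a container with access to the Docker socket; a sketch of option one, where the remote syslog endpoint is an assumption:

```shell
# Run Logspout: it attaches to the Docker socket, captures stdout/stderr
# from the other containers on this host, and forwards each event to the
# given syslog endpoint (logs.example.com:514 is illustrative).
docker run -d --name logspout \
  -v /var/run/docker.sock:/var/run/docker.sock \
  gliderlabs/logspout \
  syslog://logs.example.com:514
```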
5.4.1 When Should I Use a Dedicated Logging Container?
In addition to centralizing and aggregating logs, dedicated logging containers eliminate any dependencies on the host machine. Not only does this make it easier to move containers between hosts, but it lets us scale our logging infrastructure as needed by adding additional containers. Dedicated logging containers can retrieve logs through multiple streams (data volumes, stdout, etc.), making them at least as flexible as host-based logging solutions.
5.5 Logging via the Sidecar Approach
In the sidecar approach, each application container is paired with its own logging container. The first (or application) container saves its logs to a volume that can be accessed by the logging container. The second (or logging) container then uses file monitoring to tag and forward each event to the logging service. An example of this approach is the Loggly Docker container. Although similar to dedicated logging containers, sidecar containers can offer greater transparency into the origin of log events.
5.5.1 When Should I Use the Sidecar Approach?
As with dedicated logging, the key benefit of the sidecar approach is that it lets you manage logging the same way you manage your applications. Sidecar containers scale more easily than other logging methods, making them ideal for larger deployments. This approach also lets you incorporate additional tracking information specific to the logging container into each log event. By providing custom tags, we can more easily track where log events originate and which containers are actively generating logs.
The downside to this approach is that it can be complex and more difficult to set up. Both containers must work in tandem, or you may end up with incomplete or missing log data. In this case, it might be easier to use a tool such as Docker Compose to manage both containers as a single unit.
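A hedged sketch of the sidecar pairing using plain Docker commands, with a shared volume between the application and its logging container (volume, container, and image names are illustrative assumptions):

```shell
# The application container writes its log files to a shared volume.
docker volume create myapp-logs
docker run -d --name myapp \
  -v myapp-logs:/var/log/app \
  myapp:latest

# The sidecar mounts the same volume read-only, tails the files,
# and tags and forwards each event to the logging service.
# log-forwarder:latest stands in for the chosen forwarder image.
docker run -d --name myapp-logger \
  -v myapp-logs:/var/log/app:ro \
  log-forwarder:latest
```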
6. Log Contents
The Docker daemon logs two types of events:
- Commands sent to the daemon through Docker’s Remote API
- Events that occur as part of the daemon’s normal operation
6.1 Remote API Events
The Remote API lets you interact with the daemon using common commands. Commands passed to the Remote API are automatically logged along with any warning or error messages resulting from those commands. Each event contains:
- The current timestamp
- The log level (Info, Warning, Error, etc.)
- The request type (GET, PUT, POST, etc.)
- The Remote API version
- The endpoint (containers, images, data volumes, etc.)
- Details about the request, including the return type
6.2 Daemon Events
Daemon events are messages regarding the state of the Docker service itself. Each event displays:
- The current timestamp
- The log level
- Details about the event
The events themselves cover areas such as:
- Actions performed during the initialization process
- Features provided by the host kernel
- The status of commands sent to containers
- The overall state of the Docker service
- The state of active containers
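On a systemd-based host, these daemon events can typically be inspected with journalctl (the exact location of daemon logs varies by platform):

```shell
# Follow the Docker daemon's log stream on a systemd host.
journalctl -u docker.service -f

# Show only warnings and errors from the daemon since the last boot.
journalctl -u docker.service -b -p warning
```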
7. Proposed Design
One of the following two options can be implemented.
7.1 Live Logging
In this option the service writes logs directly to Azure File Storage using an asynchronous pattern.
7.2 Deferred Logging
In this option the service writes its logs locally inside the container, and a separate service (Log Service) monitors for completed log files and uploads them to Azure File Storage.
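A minimal sketch of the Log Service side using the Azure CLI; the storage account, share name, log directory, and the `.log.done` rename-on-completion convention are all assumptions for illustration:

```shell
# Upload each completed log file from the container's mounted log
# directory to an Azure File Storage share, then remove the local copy.
# Assumes the app renames finished files to *.log.done when complete.
for f in /var/log/myapp/*.log.done; do
  az storage file upload \
    --account-name mystorageaccount \
    --share-name app-logs \
    --source "$f" \
    --path "$(basename "$f")" \
  && rm "$f"
done
```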
NOTE: If we choose Loggly, the Log Service can be replaced by Loggly; if we choose Azure Log Analytics, it can be replaced by the Azure Log Analytics service. A proof of concept is required for both approaches to arrive at the appropriate solution.
8. References:
1. Andre Newman's article https://www.loggly.com/blog/top-5-docker-logging-methods-to-fit-your-container-deployment-strategy/
2. https://docs.docker.com/storage/volumes/
3. https://www.loggly.com/docs/about-loggly/
4. https://github.com/iamatypeofwalrus/logspout-loggly
5. https://www.loggly.com/blog/how-to-implement-logging-in-docker-with-a-sidecar-approach/
6. https://www.loggly.com/blog/what-does-the-docker-daemon-log-contain/
7. https://docs.docker.com/v1.10/engine/userguide/containers/dockervolumes/
8. https://dzone.com/articles/containers-5-docker-logging-best-practices
9. https://www.monitis.com/blog/containers-5-docker-logging-best-practices/
10. https://kubernetes.io/docs/concepts/cluster-administration/logging/
11. https://docs.microsoft.com/en-us/azure/log-analytics/log-analytics-overview