OpenTelemetry Tracing in .NET
Ittahad Uz Zaman Akash
Senior Software Engineer @ BRAC IT | Ex-Lead at SELISE Signature
OpenTelemetry is a project hosted by the CNCF. It was formed by merging two earlier projects, OpenTracing and OpenCensus, and now offers a set of APIs, SDKs, and tools focused on observability.
The W3C Trace Context specification is a standard, developed jointly by open-source and commercial tool providers, that defines a unified way to propagate context and correlate events within distributed systems such as microservices environments.
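For a concrete picture of what the specification standardizes, here is a minimal sketch that parses a `traceparent` header value with `ActivityContext.TryParse` from `System.Diagnostics`. The header literal is an illustrative sample in the W3C format (version, trace ID, parent span ID, flags), not data from the services in this article, and the sketch assumes .NET 6 or later.

```csharp
using System;
using System.Diagnostics;

// Illustrative traceparent value: version-traceId-parentSpanId-flags.
string traceparent = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";

// TryParse returns false instead of throwing when the header is malformed.
// The second argument is the optional tracestate header, omitted here.
bool parsed = ActivityContext.TryParse(traceparent, null, out ActivityContext context);

Console.WriteLine($"parsed:   {parsed}");
Console.WriteLine($"trace id: {context.TraceId}");
Console.WriteLine($"span id:  {context.SpanId}");
Console.WriteLine($"flags:    {context.TraceFlags}");
```

The trace ID is what every service taking part in a request shares; the span ID identifies the immediate caller. The rest of this article is about carrying those two values across service boundaries.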
In this article, I will demonstrate the process of implementing OpenTelemetry in your .NET application and discuss a real-life scenario.
As distributed systems scale, it becomes increasingly difficult for developers to see how their own services depend on or affect other services, especially after a deployment or during an outage, when speed and accuracy are critical.
So, in a distributed system where every component of a microservices ecosystem communicates independently, it is hard to find the cause of a failure or track down an issue. OpenTelemetry's tracing capabilities help in such scenarios. Moreover, its metrics and logging, combined with tracing, give developers a clear picture of a distributed system during an outage or other critical issue.
Let's start by deploying our observability back-end. Several open-source tools are available, such as Jaeger and Zipkin; we will use Zipkin in this article. Why do we need an observability back-end? Because our application emits telemetry data (traces, spans, and so on), and the back-end processes and visualizes that data in a meaningful way.
docker run -d -p 9411:9411 openzipkin/zipkin-slim
This command pulls the Zipkin Docker image and runs it in the background, exposing Zipkin on port 9411. The Zipkin UI is then available at http://localhost:9411.
Let's configure the Zipkin exporter in our .NET application codebase:
serviceCollection.AddOpenTelemetryTracing(tracerProviderBuilder =>
{
    tracerProviderBuilder
        .SetSampler(new AlwaysOnSampler())
        .AddHttpClientInstrumentation()
        .AddAspNetCoreInstrumentation()
        .AddMassTransitInstrumentation();

    // Consumers listen to MassTransit's diagnostic source; other services use their own name.
    string source = isConsumer ? DiagnosticHeaders.DefaultListenerName : serviceName;

    tracerProviderBuilder
        .AddSource(source)
        .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService(serviceName: serviceName))
        .AddZipkinExporter(o =>
        {
            // The Zipkin container started above serves plain HTTP on port 9411.
            o.Endpoint = new Uri("http://localhost:9411/api/v2/spans");
        });
});
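Newer releases of OpenTelemetry.Extensions.Hosting replace `AddOpenTelemetryTracing` with `AddOpenTelemetry().WithTracing(...)`. The following is a sketch of an equivalent registration in that style, not the code used in this article. It assumes the OpenTelemetry.Extensions.Hosting, OpenTelemetry.Instrumentation.AspNetCore, OpenTelemetry.Instrumentation.Http, and OpenTelemetry.Exporter.Zipkin packages are referenced; `TracingSetup` and `AddZipkinTracing` are hypothetical names, and the MassTransit source is left out for brevity.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

// Hypothetical helper class; the name is illustrative only.
public static class TracingSetup
{
    public static IServiceCollection AddZipkinTracing(
        this IServiceCollection services, string serviceName)
    {
        services.AddOpenTelemetry()
            // Attach the service name to every exported span.
            .ConfigureResource(resource => resource.AddService(serviceName))
            .WithTracing(tracing => tracing
                .SetSampler(new AlwaysOnSampler())
                // Listen to activities created by this service's own ActivitySource.
                .AddSource(serviceName)
                .AddHttpClientInstrumentation()
                .AddAspNetCoreInstrumentation()
                // Assumes Zipkin is reachable over plain HTTP on localhost:9411.
                .AddZipkinExporter(o => o.Endpoint = new Uri("http://localhost:9411/api/v2/spans")));

        return services;
    }
}
```

A service would then call `services.AddZipkinTracing("MyService")` during startup. A consumer service would additionally need `AddSource` for the MassTransit diagnostic source name, as in the article's registration.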
Now let's talk about a more challenging task. We are working with microservices, and each service listens to its own queue. I am using MassTransit with RabbitMQ for communication among the services. Naturally, the processes are separate and could be hosted on different cloud providers' infrastructure. So how do we ensure cross-platform observability and tracing? This is where the OpenTelemetry NuGet packages come in handy.
As the configuration shows, OpenTelemetry has instrumentation for both MassTransit and HTTP. Whether your microservices communicate over HTTP or through MassTransit, traces will flow regardless. Note that MassTransit supports RabbitMQ, Amazon SQS, Kafka, and other transports. So whichever technology your application is built on, and whichever medium it uses to communicate, you can have end-to-end traceability.
For RabbitMQ with MassTransit and MediatR, you have to pass the traceId and spanId from Service A to the consumer in Service B. Here I have added them as properties on the base message, so they travel with it to the consumer.
// Read the ambient activity without disposing it: this code does not own it.
var activity = Activity.Current;
if (activity != null && command is MinimalCommand baseMessage)
{
    baseMessage.SpanId = activity.SpanId.ToString();
    baseMessage.TraceId = activity.TraceId.ToString();
}
In the consumer, we just have to decode the base message and start an activity with the previous spanId as the parent.
public async Task Consume(ConsumeContext<TMessage> context)
{
    // StartActivity already starts the activity; disposing it stops it, even if Handle throws.
    using var activity = TryAddingObservabilityTrace(context);
    _ = await Handle(context.Message);
}

private Activity? TryAddingObservabilityTrace(ConsumeContext<TMessage> context)
{
    try
    {
        var baseMessage = context.Message as MinimalCommand;
        if (baseMessage == null) return null;

        // Rebuild the producer's context from the IDs carried in the message.
        var traceId = ActivityTraceId.CreateFromString(baseMessage.TraceId.AsSpan());
        var spanId = ActivitySpanId.CreateFromString(baseMessage.SpanId.AsSpan());
        var activityContext = new ActivityContext(
            traceId: traceId,
            spanId: spanId,
            traceFlags: ActivityTraceFlags.Recorded,
            isRemote: true);

        // Returns null when no listener is registered for this source.
        return TracingProvider.MyActivitySource.StartActivity(
            $"{context.Message.GetType().Name}-Handler",
            ActivityKind.Consumer,
            activityContext);
    }
    catch (Exception)
    {
        // Tracing is best-effort: malformed or missing IDs must not fail message handling.
        return null;
    }
}
That is all it takes. With this configuration, distributed traces are enabled across our microservices ecosystem.
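To see the hand-off in isolation, the sketch below runs both sides in one process using only `System.Diagnostics`. It is a simplified stand-in, not the article's code: the source name `Demo.Service` and the span names are hypothetical, a bare `ActivityListener` plays the role that the OpenTelemetry SDK plays after `AddSource`, and plain strings replace the message that would travel through RabbitMQ. It assumes .NET 6 or later with nullable reference types enabled.

```csharp
using System;
using System.Diagnostics;

// Hypothetical source name; in a real service it must match the name given to AddSource.
using var source = new ActivitySource("Demo.Service");

// Without a listener, StartActivity returns null. This listener samples everything.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Demo.Service",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllDataAndRecorded
};
ActivitySource.AddActivityListener(listener);

// "Service A": start a span and copy its IDs into strings, as the base message does.
string traceIdText;
string spanIdText;
using (Activity? producer = source.StartActivity("Publish", ActivityKind.Producer))
{
    traceIdText = producer!.TraceId.ToString();
    spanIdText = producer.SpanId.ToString();
}

// "Service B": rebuild the parent context from the strings and start a child span.
var parent = new ActivityContext(
    ActivityTraceId.CreateFromString(traceIdText.AsSpan()),
    ActivitySpanId.CreateFromString(spanIdText.AsSpan()),
    ActivityTraceFlags.Recorded,
    isRemote: true);

using Activity? consumer = source.StartActivity("Consume", ActivityKind.Consumer, parent);

// The consumer span joins the producer's trace and points at the producer span as its parent.
bool sameTrace = consumer!.TraceId.ToString() == traceIdText;
bool parentLinked = consumer.ParentSpanId.ToString() == spanIdText;
Console.WriteLine($"same trace: {sameTrace}, parent linked: {parentLinked}");
```

Because the consumer span carries the same trace ID, the back-end can stitch the two services into a single trace even though they never shared a process.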
Here is an image of such a scenario. You can see that I have three different applications running. At a glance I can see that a GET HTTP request failed within the TestingHost -- TestMessage2Handler resource, and I can inspect the exact reason for the failure: an HTTP request to the Test2/TestPing resource failed within the TestingHost service. Conventionally, we would have to go through several log files to find this specific error and reach a conclusion. Here, I just had to search by the request's traceId, and the complete lifecycle of the request was visible. That's the beauty of using tracing in our application.
For the complete codebase follow this link or feel free to contact me.