Is Distributed Tracing Really a Big Deal?

Posted by Papertrail Team on May 24, 2021

Microservice architectures are everywhere these days. Even internal enterprise applications—which have typically been structured as self-contained monoliths—are now being designed using a microservices architecture.

There are definite advantages to a microservices architecture. Breaking an application into discrete, independent chunks—basically mini apps—gives you enormous flexibility. But this flexibility dramatically increases complexity, especially when things go wrong.

“Something Went Wrong” System Errors

In a traditional application, if there’s an exception, you can enable debug logging, recreate the error condition, and, using a nifty tool like SolarWinds® Papertrail™, tail and search events until you identify the cause of the issue.

With applications built on microservices, a single request can traverse multiple independent microservices, each consisting of numerous independent and ephemeral services. This means you lose end-to-end visibility over the request. Your users may see a “something went wrong” message, and you may have no idea which service failed and even less understanding of why.

To troubleshoot errors like these in a microservices application, you need to be able to correlate multiple isolated events across different services as part of a single transaction. This is where distributed tracing comes in.

Enter Distributed Tracing

Distributed tracing lets you track a single transaction (a user-generated request, for example) as it traverses the numerous interdependent services making up the application. As with tracing a request inside a single application, you can walk the request from start to end and spot failures. The difference is that distributed tracing follows the request across the interconnected services instead of tracking its path within one application.

For example, say a customer makes a request to book a hotel room on a website, and the application updates the booking information and calls the payment application. Since the request spans multiple resources, distributed tracing follows the request across all the services, APIs, and resources with which it interacts.

Focus on Important Events

To see how distributed tracing can help, let’s look at the annoying “something went wrong” message in the hotel booking application I mentioned earlier. Let’s say the call to support came in around 8 p.m. on a Sunday. When I open the Papertrail event viewer and look at the events around 8 p.m., I can see an increase in log volume.

Figure: Papertrail log velocity view shows a spike in event volume at 8 p.m.

But when I load the events in this time frame, I still have more than 2,000 event messages to crawl through. With nothing more to go on than the error message and time stamp, it’s difficult to identify the cause of the problem. This is where distributed tracing comes in handy. With a trace ID, Papertrail can zoom in on all the events in a specific transaction, simplifying troubleshooting.

Figure: Filtering by trace ID narrows the events to a specific transaction.

To make this work, the system generates a unique identifier when the request is received, and this ID is passed to each service. You can instrument this manually, but an easier way is to look at an application performance management (APM) tool like SolarWinds AppOptics.
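If you did want to instrument this by hand, a minimal Python sketch might look like the one below. It generates an ID when a request arrives, keeps it in a context variable, and stamps it on every log line through a logging filter. The names `start_trace`, `TraceIdFilter`, and `build_logger`, along with the `trace_id=` log field, are made up for illustration and are not part of Papertrail or AppOptics.

```python
import contextvars
import logging
import uuid

# Holds the trace ID for the request currently being handled.
# "-" is a placeholder for log lines written outside any request.
_trace_id = contextvars.ContextVar("trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the current trace ID onto every log record."""

    def filter(self, record):
        record.trace_id = _trace_id.get()
        return True


def start_trace():
    """Generate a unique ID when a request is first received."""
    trace_id = uuid.uuid4().hex
    _trace_id.set(trace_id)
    return trace_id


def build_logger(name, stream=None):
    """Create a logger whose lines always include the trace ID.

    Call this once per logger name; calling it again adds a second handler.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)  # defaults to stderr
    handler.setFormatter(
        logging.Formatter("%(name)s trace_id=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("booking")
    start_trace()
    log.info("reservation updated")
```

Each service would still need to receive the ID from its caller and hand it on to the next one, which is the tedious part an APM library takes care of.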

AppOptics libraries use auto-instrumentation and propagate context through HTTP headers. These headers include the trace ID, which is also written to the logs, so you can use the trace ID to isolate the events related to a request in Papertrail.
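To make the idea of header-based context propagation concrete, here is a rough Python sketch of what each service does at its boundaries: reuse the trace ID from the incoming request if there is one, and attach it to every outgoing call. The header name `X-Trace-Id` and both function names are assumptions for illustration only. AppOptics defines its own header and value format, which this sketch does not try to reproduce.

```python
import uuid

# Hypothetical header name, chosen for illustration. Real APM libraries
# define their own header and value format.
TRACE_HEADER = "X-Trace-Id"


def extract_trace_id(incoming_headers):
    """Reuse the caller's trace ID, or start a new trace at the edge."""
    for name, value in incoming_headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == TRACE_HEADER.lower() and value:
            return value
    return uuid.uuid4().hex


def inject_trace_id(outgoing_headers, trace_id):
    """Return a copy of the outgoing headers carrying the trace ID."""
    headers = dict(outgoing_headers)
    headers[TRACE_HEADER] = trace_id
    return headers


if __name__ == "__main__":
    # The booking service receives a request and calls the payment service.
    trace_id = extract_trace_id({"x-trace-id": "abc123"})
    print(inject_trace_id({"Accept": "application/json"}, trace_id))
```

Because every hop copies the same ID forward, the booking and payment services end up logging under one shared identifier even though they never share a process.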

A Different Approach to Troubleshooting

With an APM tool like SolarWinds AppOptics, you can take most of the guesswork out of troubleshooting and start focusing on relevant events. In our example, instead of immediately opening Papertrail and combing through screens of logs and matching time stamps, you can first open AppOptics and visually explore response time, performance, and errors around a specific time.

Figure: AppOptics service view displays response time, request volume, and error rates.

If you see an error like the one in the screen above, you can drill into a trace for more details. The trace view provides a visual breakdown of a trace and shows you the duration of each span.

Figure: AppOptics trace view shows the end-to-end duration and breaks the request down into spans.

The root cause summary AppOptics provides is also much more meaningful than “something went wrong.”

Figure: Root cause summary indicates both the source and the type of error.

Follow the Trace to the Logs

If you integrate AppOptics with Papertrail, you can jump from the trace view into the logs by selecting Search Logs. The ao.traceID is embedded in every log line created by the services involved in the transaction. With one click, the Papertrail event viewer launches with a filter matching the trace ID already applied.

Figure: Papertrail event viewer filtered by trace ID, showing only the event messages related to the request.

You can see all the logs associated with the request, even as it crosses multiple services. You see the events leading up to an error and can put the error in context.
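Conceptually, that filter does something like the Python sketch below: keep only the lines tagged with one trace ID, then read them in time order. The `ao.traceID=<id>` key=value layout, the timestamp prefix, and the sample lines are all invented for illustration. The actual format of the field in your logs depends on your logging setup.

```python
def events_for_trace(lines, trace_id):
    """Return the log lines tagged with trace_id, oldest first.

    Assumes each line starts with an ISO 8601 UTC timestamp, so plain
    string sorting is also chronological sorting.
    """
    needle = f"ao.traceID={trace_id}"
    # Compare whole whitespace-separated tokens so that an ID which is
    # merely a prefix of another ID does not match.
    return sorted(line for line in lines if needle in line.split())


# Invented sample events from three services, deliberately out of order.
SAMPLE = [
    "2021-05-23T20:00:03Z payment ao.traceID=7f3a ERROR card processor timed out",
    "2021-05-23T20:00:01Z booking ao.traceID=7f3a INFO reservation updated",
    "2021-05-23T20:00:02Z booking ao.traceID=91bc INFO reservation updated",
    "2021-05-23T20:00:02Z search INFO cache refreshed",
]

if __name__ == "__main__":
    for event in events_for_trace(SAMPLE, "7f3a"):
        print(event)
```

Out of four events, only the two belonging to the failing request remain, and the booking update that preceded the payment error shows up right next to it.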

Wrapping Up

Pairing an APM tool like AppOptics with Papertrail can simplify troubleshooting applications built on microservices by letting you track a single request as it traverses multiple services. You can also use time series metrics, distributed traces, and the associated logs to pinpoint the root cause of performance problems.

Additionally, AppOptics offers performance monitoring for infrastructure (servers, virtual hosts, and containers) and applications, plus the ability to define custom metrics if the 150+ out-of-the-box integrations aren’t enough. If you’re curious about the tool, you can test drive AppOptics with a free trial and see how much easier troubleshooting can be.