Logging is crucial in software development, so it’s vital to get it right. That’s easier said than done, unfortunately. Application logging—Python logging included—is riddled with traps you might fall prey to if you aren’t careful.
But you can’t beware of what you don’t know, and that’s where this post comes in handy. We’re here to warn you about eight of the most common logging pitfalls in Python. You’ll learn about each problem, along with techniques and tools you can use to avoid it.
Using a Primitive Logging Approach
The first pitfall on our list is using an overly simplistic approach to logging. In Python, this usually amounts to writing log messages with print. While print is probably enough when the application is behaving well, it doesn’t work for detailed debugging and troubleshooting, let alone production use.
Escaping this trap in Python is easy. Unlike many languages that require the use of third-party libraries, the standard logging module Python provides is more than capable of handling sophisticated logging needs.
With the logging module, you can do things like:
- Log messages according to their severities by using the proper logging levels
- Write to different destinations (e.g., files on disk, a database table, etc.)
- Control the granularity of logs, filtering out less relevant messages (e.g., filtering out debug statements while in production)
Using Python’s logging facilities is easy. We can get from zero to logging in just a few lines:
```python
import logging  # importing the logging module

logging.basicConfig()  # performing the default basic configuration
log = logging.getLogger("test-logger")  # obtaining a logger object
log.info("My first log message in Python")  # attempting to log at the INFO level
log.warning("My second log message in Python")  # attempting to log at the WARNING level
```

Note that we use warning rather than the older warn, which is deprecated.
If you run the code above, you’ll see only the second message logged to the console, because WARNING is the default logging level for the logging module unless you explicitly configure it otherwise. Since INFO is lower than WARNING, the first message doesn’t get logged. We can easily fix the problem by configuring a lower level explicitly.
Poorly Formatted Log Entries
If you want to make the most of your log entries, they need to be both human-readable and machine-parsable. Log entries with poor formatting pose a serious challenge to this twofold goal. But how can you recognize a well-formatted log entry?
The metadata associated with the log message is as crucial as the log message itself. Since it’s not much use knowing an event happened but not when it happened, the first essential detail you want to include in your log entries is a timestamp, either in UTC or in your local time plus the offset from UTC. While you’re at it, use the ISO 8601 format, to spare everyone from the fact that we humans can’t make up our minds on how to represent time.
Also crucial are details that enable searching and filtering of log entries. Logging levels are a first example: besides allowing filtering and searching, you can use them to control the granularity of logging, separating the signal from the noise. Don’t forget to write the level in all caps so it stands out from the other elements in the entry and improves overall readability.
Messy Log Configuration
In the example we provided earlier, our logger was ready to go in just three lines of code. However, a more realistic scenario would likely require further configuration. Depending on the options you need to tinker with, the settings of your logger could grow unruly fast.
As soon as you have a relatively complex application, the best course of action is to move your logging configuration outside the code, and a JSON file is an excellent choice for that. Besides making the code cleaner, you can alter how your logging strategy works without having to edit Python source code, making this approach way more flexible.
To make things better, you don’t have to install third-party tools to deal with JSON when working in Python. First, import the standard json and logging.config modules. Then, create your logging configuration file (you can see an example here).
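To give a sense of what such a file might contain, here’s a hypothetical dictConfig-compatible configuration, shown as an inline dictionary for brevity (in practice it would live in the JSON file; all names here are illustrative):

```python
import logging.config

# Hypothetical contents of a logging configuration file, nested under a
# top-level "logging" key as in the loading snippet below
config = {
    "logging": {
        "version": 1,
        "formatters": {
            "default": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"}
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "formatter": "default",
                "level": "INFO",
            }
        },
        "root": {"handlers": ["console"], "level": "INFO"},
    }
}

logging.config.dictConfig(config["logging"])
logging.getLogger("configured").info("logger configured from a dict")
```
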
After that, it’s just a matter of loading the configuration:
```python
import json
import logging.config

with open("path/to/file.json", "r", encoding="utf-8") as fd:
    config = json.load(fd)

logging.config.dictConfig(config["logging"])
```
Entries That Lack Context
Log entries are often a helpful aid in troubleshooting. They offer developers a window into the past, so they can reconstruct how the application behaved, understand the user’s actions, and hopefully diagnose and fix whatever is wrong.
However, not all log entries are equally helpful for troubleshooting. If an entry doesn’t contain enough information, the reader will struggle to put it in perspective.
So, the advice here is simple: avoid generic log messages. Instead, put as much context as possible into your logs—with the caveat of not including PII or other user data that could get you in trouble with legislation such as GDPR.
An efficient way of creating context-rich log entries is to employ structured logging. Structured logging records information in a format resembling objects or key-value pairs. That makes the log entries easily parsable by machines in a way that doesn’t require regular expressions while still being easy to read by humans.
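Dedicated libraries exist for this (structlog and python-json-logger are popular choices), but as a stdlib-only sketch, a custom formatter can render each record as a JSON object, with extra context attached as key-value pairs (the "context" key used here is an illustrative convention, not part of the logging API):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S%z"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge any key-value context attached via the `extra` argument
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("structured")
log.addHandler(handler)
log.setLevel(logging.INFO)

# Context travels as structured fields instead of being buried in the message
log.info("payment processed", extra={"context": {"order_id": 1234, "amount_cents": 999}})
```
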
Error Logs With Insufficient Detail
This is sort of a sequel to the last entry, as it has everything to do with adding context to your log entries.
As already mentioned, generic log messages aren’t of much value. This is particularly true for logs that capture errors. In situations where an exception is raised, you don’t just want to know a specific error occurred. Instead, it’s valuable to have the complete traceback of the exception, so you can follow the problem down the hole until the root cause is identified.
To log an exception’s traceback, use the exception method inside your except block, like in the example below:
```python
import logging

try:
    invalid_function()  # we're calling a function that doesn't exist
except Exception:
    logging.exception("Houston, we've got a problem!")
```
Running the example above should result in an output like the following:
```
ERROR:root:Houston, we've got a problem!
Traceback (most recent call last):
  File ".\log-demo.py", line 4, in <module>
    invalid_function()
NameError: name 'invalid_function' is not defined
```
It prints the message we’ve supplied to the exception function along with the traceback for the exception. The ERROR level is used by default when you call the exception function. If you want another level instead, you can use the regular logging functions with the exc_info parameter set to True:
```python
log.critical("Houston, we've got a problem!", exc_info=True)
```
Unnecessary Calls to Expensive Functions
Logging can and does impact performance. One of the ways in which this can happen is when you perform expensive function calls that could be avoided. Consider the following line:
```python
logger.info('The result is %d', super_expensive_calculation())
```
Now imagine the configured logging level for this application is WARNING. This means the message won’t get written, but the super_expensive_calculation() function will be called nonetheless.
We can easily avoid this issue by first checking whether the configured logging level allows our message to get through:
```python
if logger.isEnabledFor(logging.INFO):
    logger.info('The result is %d', super_expensive_calculation())
```
This way, we ensure the expensive function will be called only when necessary.
Logging in the Hot Path
This item is also about performance and could be summarized in a single sentence: avoid logging in the hot path.
“Hot path” here means a portion of the code that’s critical and executed very often. Unless it’s indispensable, you want to avoid logging here because it could cause an I/O bottleneck.
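When some record of what happened in the hot path is still needed, one way to sidestep the bottleneck is to aggregate inside the loop and emit a single summary entry afterwards. A minimal sketch (the function and its failure condition are illustrative stand-ins for real work):

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("hot-path")

def process_items(items):
    failures = 0
    for item in items:
        # Hot loop: count failures instead of logging each one
        if item < 0:  # stand-in for real per-item work that can fail
            failures += 1
    # One summary entry, written outside the hot path
    log.info("processed %d items, %d failures", len(items), failures)
    return failures

process_items([1, -2, 3, -4, 5])
```
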
The “Not Invented Here” Syndrome
Last but not least, we have the NIH syndrome, which is a refusal to adopt things you didn’t create yourself. In software, this usually consists of solving problems that already have a solution, aka reinventing the wheel.
So, this item is mainly a consolidation of a recurring theme throughout the list: don’t be afraid of using existing resources. Creating everything from scratch is impractical and not what your employer hired you to do.
Speaking specifically about logging, don’t roll your own homegrown logging solution. Instead, stick to the state of the art, which in Python’s case is the standard logging module we’ve already mentioned.
Additionally, you can go a step further and use even more sophisticated tools to make your logging strategy more efficient. For instance, you can use a hosted cloud service to streamline your log management process and facilitate troubleshooting.
This post is all about Python, and SolarWinds® Papertrail™ is a solid tool of choice for this language. Among other benefits, Papertrail offers super quick setup and a fast search and tail feature. Even better, the solution is easy to use, so your developers won’t struggle with a steep learning curve. Give Papertrail a try today.
Stand on the Shoulders of Giants for Better Python Logging
We’ve covered eight of the most common hurdles you need to be aware of when logging in Python. One recurring theme throughout the list is the old “don’t reinvent the wheel” advice. Logging is a long-documented problem solved by many people.
And many of these people have put their solutions at your disposal. They are the many libraries, frameworks, and other tools you can use to make your work easier. Stand on the shoulders of giants by leveraging the resources at your disposal, leaving you free to focus on the problems that matter the most for you and your organization.
This post was written by Carlos Schults. Carlos is a .NET software developer with experience in both desktop and web development, and he’s now trying his hand at mobile. He has a passion for writing clean and concise code, and he’s interested in practices that help you improve app health, such as code review, automated testing, and continuous build.