This post is the first in a series on using the logging facilities in Python. Logging is an incredibly powerful tool for developers during early system development, and also during deployment.

One of the interesting features of the logging package is an optimization where rendering of the log message is delayed as long as possible (docs). What’s not obvious is how this optimization will impact execution time.

In order to understand the consequences of how we construct the arguments to a call to a logger, we’re going to run three experiments. The first experiment will explore logging performance with a single logger, the second will use two loggers, and the third will use three loggers.

Logging Configuration

In order to study only the effect of the arguments on logging performance, we use a custom logging.Handler  subclass that performs no IO, but does go through the motions of preparing the text for a log message.

Logging was configured as follows for the single handler experiment:

Experimental Code

The experimental code makes 2 logging calls in exercise_log . The first call, on line 19, passes an expensive_operation  object reference as an argument to the logging call, while the second call, on line 31, invokes the expensive_operation  object, passing the  result to the logging call. In both cases the logging call is made for each of the standard log levels. The exercise_log  function is called for each of the standard log levels. The expensive_operation  object is just a while loop that counts from 0 to expensive_iterations  by 1 when it is invoked or converted to a string.

Tests

The initial test uses one log handler and has the following results:

Logging times for single handler

Level FilterObj RefAvg. TimeMethod CallAvg. Time
CRITICAL10.258651.2218
ERROR20.491451.1988
WARNING30.741651.2148
INFO40.990351.2087
DEBUG51.226351.2093

The first item of note is the Obj Ref column, which is the count of the number of times the object was converted to a string. In keeping with the behavior that message formatting is deferred as long as possible, we see that the string conversion only happened when the message would be rendered under the logging level configuration. When we passed the result of invoking the object, we performed the expensive operation every time, resulting in longer execution times.

 

The second test uses two handlers associated with a single logger and had the following results:

Logging time for two handlers

Level FilterObj RefAvg. TimeMethod CallAvg. Time
CRITICAL20.514751.2737
ERROR40.990151.2715
WARNING61.575451.2777
INFO81.901451.2739
DEBUG102.358051.2027

The third test uses two handlers associated with a single logger and had the following results:

Logging Time for Three Handlers

Level FilterObj RefAvg. TimeMethod CallAvg. Time
CRITICAL30.701551.1775
ERROR61.412751.1779
WARNING92.133051.1690
INFO122.816151.1706
DEBUG153.794351.1806

Looking at the second and third tests we see that the number of times we convert an object to a string increases with the number of handlers. The initial test might have given the impression that object references are better, but we can see that the situation is more complicated.

Discussion

As we can see from the test results, minimizing the amount of time we spend in logging calls is not simple. We need to be mindful of the number of handlers, the logging level, and the cost of preparing the arguments for the logger.  In general, it will be better to have the arguments created once for a logging call, avoiding the need to perform expensive operations for each handler.

The logging cookbook recommends using the  logger.isEnabledFor(level) function to decide if a logging call should be made. This practice is unsatisfying as it encourages the use of expensive functions as arguments to logging calls. A better option would be to refactor your code so that the generation of arguments for logging calls is inexpensive, using the   logger.isEnabledFor(level) method only when inexpensive arguments cannot be used.

An important caveat for logging calls, regardless of how computational expense is managed, is that they should not alter system state. If it is necessary to alter system state via functions that generate input for logging arguments, then those functions should be idempotent.