Adventures in Python Logging – Part 1 Avoiding expensive logging call parameters

This post is the first in a series on using the logging facilities in Python. Logging is an incredibly powerful tool for developers during early system development, and also during deployment.

One of the interesting features of the logging package is an optimization where rendering of the log message is delayed as long as possible (docs). What’s not obvious is how this optimization will impact execution time.

In order to understand the consequences of how we construct the arguments to a call to a logger, we’re going to run three experiments. The first experiment will explore logging performance with a single logger, the second will use two loggers, and the third will use three loggers.

Logging Configuration

In order to study only the effect of the arguments on logging performance, we use a custom logging.Handler  subclass that performs no IO, but does go through the motions of preparing the text for a log message.

Logging was configured as follows for the single handler experiment:

Experimental Code

The experimental code makes 2 logging calls in exercise_log . The first call, on line 19, passes an expensive_operation  object reference as an argument to the logging call, while the second call, on line 31, invokes the expensive_operation  object, passing the  result to the logging call. In both cases the logging call is made for each of the standard log levels. The exercise_log  function is called for each of the standard log levels. The expensive_operation  object is just a while loop that counts from 0 to expensive_iterations  by 1 when it is invoked or converted to a string.

Tests

The initial test uses one log handler and has the following results:

Logging times for single handler

Level FilterObj RefAvg. TimeMethod CallAvg. Time
CRITICAL10.258651.2218
ERROR20.491451.1988
WARNING30.741651.2148
INFO40.990351.2087
DEBUG51.226351.2093

The first item of note is the Obj Ref column, which is the count of the number of times the object was converted to a string. In keeping with the behavior that message formatting is deferred as long as possible, we see that the string conversion only happened when the message would be rendered under the logging level configuration. When we passed the result of invoking the object, we performed the expensive operation every time, resulting in longer execution times.

 

The second test uses two handlers associated with a single logger and had the following results:

Logging time for two handlers

Level FilterObj RefAvg. TimeMethod CallAvg. Time
CRITICAL20.514751.2737
ERROR40.990151.2715
WARNING61.575451.2777
INFO81.901451.2739
DEBUG102.358051.2027

The third test uses two handlers associated with a single logger and had the following results:

Logging Time for Three Handlers

Level FilterObj RefAvg. TimeMethod CallAvg. Time
CRITICAL30.701551.1775
ERROR61.412751.1779
WARNING92.133051.1690
INFO122.816151.1706
DEBUG153.794351.1806

Looking at the second and third tests we see that the number of times we convert an object to a string increases with the number of handlers. The initial test might have given the impression that object references are better, but we can see that the situation is more complicated.

Discussion

As we can see from the test results, minimizing the amount of time we spend in logging calls is not simple. We need to be mindful of the number of handlers, the logging level, and the cost of preparing the arguments for the logger.  In general, it will be better to have the arguments created once for a logging call, avoiding the need to perform expensive operations for each handler.

The logging cookbook recommends using the  logger.isEnabledFor(level) function to decide if a logging call should be made. This practice is unsatisfying as it encourages the use of expensive functions as arguments to logging calls. A better option would be to refactor your code so that the generation of arguments for logging calls is inexpensive, using the   logger.isEnabledFor(level) method only when inexpensive arguments cannot be used.

An important caveat for logging calls, regardless of how computational expense is managed, is that they should not alter system state. If it is necessary to alter system state via functions that generate input for logging arguments, then those functions should be idempotent.

What flow control?

The assert()  macro is not a flow control mechanism (and NULL != 0 ).

An interesting vulnerability in systemd appears to be a consequence of the failure to understand that assert() is not and should not be used as flow control. In the code the authors had the following function:

In line 7 of the snippet there is an assertion to check if n > 0 , in this case n  is the size of  a buffer.  Since a buffer of size 0 causes systemd to crash on some systems, this is a good check. There are however two serious problems with the code as written:

  1. If systemd is compiled with the -DNDEBUG  compiler flag, then the assert()  macro will not generate any code so the check doesn’t happen.
  2. If systemd is compiled without the -DNDEBUG  flag, then if the assertion fails the program will abort.

For a critical software component, both problems are extremely undesirable. The core of the problem appears to be the use of the assert()  macro as a flow control mechanism. If it critically important for the value of n  to be checked, then use a real comparison with explicit error handling.

An additional item of concern is the use of assert()  macros where there is an implicit assumption that   NULL == 0 . At a minimum, comparisons of a pointer to NULL  should be done explicitly.

Assuming that the conditions implied by the assert()  macro usage is correct, the code should be rewritten as:

In keeping with good practice, this code will check important conditions regardless of the presence of the  -DNDEBUG  compiler flag. Additionally, in keeping with the C11 standard (ISO/IEC 9899:201xNULL is not assumed to be a zero of any type.

Exception Handling Patterns

Living with Exceptions

In Java, exceptions occur when some non-planned issue occurs that interrupts the normal flow of execution. Exception handling is accomplished in one of two general ways. The first is to simply allow the exception to propagate up the call stack. This is accomplished by adding the throws  keyword and a list of exception types to the method declaration. The second approach is to execute code that may throw  an exception inside of a try  block. Immediately following the try  block will be one or more catch  blocks that catch the different exception types that may occur in the preceding try block. The catch  block(s) may be followed by a finally  block that is used for cleanup.

Continue reading

Start getting up and running with Upstart

Recently, I needed to write a collection of upstart jobs to start the backend for a project at work. Upstart is a great tool for starting jobs in a particular sequence, but getting everything just right was challenging.

Continue reading

Welcome to Bespoke Bytes

This blog is an experiment in coding and writing. Writing has been a challenge during most of my career, despite writing of one form or another being a major professional activity. Please stay tuned for code examples, opinions, and the occasional rant.

Cheers — Hackworth.

© 2017 Bespoke Bytes

Theme by Anders NorénUp ↑