This post is the first in a series on using the logging facilities in Python. Logging is an incredibly powerful tool for developers during early system development, and also during deployment.
One of the interesting features of the logging package is an optimization where rendering of the log message is delayed as long as possible. What's not obvious is how this optimization will impact execution time.
In order to understand the consequences of how we construct the arguments to a call to a logger, we're going to run three experiments. The first experiment will explore logging performance with a single logger, the second will use two loggers, and the third will use three loggers.
Logging Configuration
In order to study only the effect of the arguments on logging performance, we use a custom logging.Handler
subclass that performs no I/O, but does go through the motions of preparing the text for a log message.
1 2 3 4 5 6 7 | class FormattingNullHandler(logging.Handler): """Custom logging.Handler for performance measurements.""" def emit(self, record): """Ensure that the record is formatted, but avoid doing I/O. This ensures that our expensive argument processing always occurs.""" ignored_result = self.format(record) |
Logging was configured as follows for the single handler experiment:
1 # Configure logging
2 logger = logging.getLogger("")
3 logging._srcfile = None # Disable file lookup.
4 logging.logThreads = 0 # Disable logging thread information
5 logging.logProcesses = 0 # Disable logging process information
6 logging.logMultiprocessing = 0 # Disable logging multiprocess information
7
8 # Configure a single log handler with a minimalist formatter.
9 handler = FormattingNullHandler()
10 formatter = logging.Formatter('%(message)s')
11 handler.setFormatter(formatter)
12 logger.addHandler(handler)
Experimental Code
1 def exercise_driver():
2 """Exercise the logging system, with different reporting levels."""
3 results = {}
4 for log_level in LOGGING_LEVELS:
5 level = logging.getLevelName(log_level)
6 logger.setLevel(log_level)
7 results[level] = exercise_log()
8 return results
9
10
11 def exercise_log(iteration_limit=10, expensive_iterations=2500000):
12 """Run the experiment with a logger that has been externally configured."""
13 timer = Timer()
14 running_total = 0
15 iteration = 0
16 expensive_operation = ExpensiveCounter(expensive_iterations=expensive_iterations)
17
18 while iteration < iteration_limit:
19 with timer, expensive_operation:
20 for level in LOGGING_LEVELS:
21 logger.log(level, "%s", expensive_operation)
22 running_total += timer.time()
23 iteration += 1
24
25 object_counter = expensive_operation.count
26 object_time = running_total / iteration_limit
27
28 running_total = 0
29 iteration = 0
30 while iteration < iteration_limit:
31 with timer, expensive_operation:
32 for level in LOGGING_LEVELS:
33 logger.log(level, "%s", expensive_operation())
34 running_total += timer.time()
35 iteration += 1
36
37 invoked_counter = expensive_operation.count
38 invoked_time = running_total / iteration_limit
39
40 return {'invoked_count': invoked_counter,
41 'invoked_time': invoked_time,
42 'obj_count': object_counter,
43 'obj_time': object_time}
The experimental code makes 2 logging calls in exercise_log
. The first call, on line 21, passes an expensive_operation
object reference as an argument to the logging call, while the second call, on line 33, invokes the expensive_operation
object, passing the result to the logging call. In both cases the logging call is made for each of the standard log levels. The exercise_log
function is called for each of the standard log levels. The expensive_operation
object is just a while loop that counts from 0 to expensive_iterations
by 1s when it is invoked or converted to a string.
Tests
The initial test uses one log handler and has the following results:
Level Filter | Obj Ref | Avg. Time | Method Call | Avg. Time |
---|---|---|---|---|
CRITICAL | 1 | 0.0434 | 5 | 0.2182 |
ERROR | 2 | 0.0861 | 5 | 0.2177 |
WARNING | 3 | 0.1294 | 5 | 0.2177 |
INFO | 4 | 0.1724 | 5 | 0.2179 |
DEBUG | 5 | 0.2427 | 5 | 0.2450 |
The first item of note is the Obj Ref column, which is the count of the number of times the object was converted to a string. In keeping with the behavior that message formatting is deferred as long as possible, we see that the string conversion only happened when the message would be rendered under the logging level configuration. When we passed the result of invoking the object, we performed the expensive operation every time, resulting in longer execution times.
The second test uses two handlers associated with a single logger and had the following results:
Level Filter | Obj Ref | Avg. Time | Method Call | Avg. Time |
---|---|---|---|---|
CRITICAL | 2 | 0.0864 | 5 | 0.2175 |
ERROR | 4 | 0.1732 | 5 | 0.2188 |
WARNING | 6 | 0.2600 | 5 | 0.2202 |
INFO | 8 | 0.3466 | 5 | 0.2186 |
DEBUG | 10 | 0.4310 | 5 | 0.2181 |
The third test uses three handlers associated with a single logger and had the following results:
Level Filter | Obj Ref | Avg. Time | Method Call | Avg. Time |
---|---|---|---|---|
CRITICAL | 3 | 0.1295 | 5 | 0.2174 |
ERROR | 6 | 0.2595 | 5 | 0.2174 |
WARNING | 9 | 0.3890 | 5 | 0.2174 |
INFO | 12 | 0.5198 | 5 | 0.2180 |
DEBUG | 15 | 0.6494 | 5 | 0.2183 |
Looking at the second and third tests we see that the number of times we convert an object to a string increases with the number of handlers. The initial test might have given the impression that object references are better, but we can see that the situation is more complicated.
Discussion
As we can see from the test results, minimizing the amount of time we spend in logging calls is not simple. We need to be mindful of the number of handlers, the logging level, and the cost of preparing the arguments for the logger. In general it will be better to have the arguments created once for a logging call, avoiding the need to perform expensive operations for each handler.
The logging cookbook recommends using the logger.isEnabledFor(level)
function to decide if a logging call should be made. This practice is unsatisfying as it encourages the use of expensive functions as arguments to logging calls. A better option would be to refactor your code so that the generation of arguments for logging calls is inexpensive, using the logger.isEnabledFor(level)
method only when inexpensive arguments cannot be used.
An important caveat for logging calls, regardless of how computational expense is managed, is that they should not alter system state. If it is necessary to alter system state via functions that generate input for logging arguments, then those functions should be idempotent.