If you receive a defect report with a log file and instructions how to reproduce the issue on a customer system, you take a look at the log file first. If the log file is big searching manually doesn’t cut the mustard. For large enough log files even a decent log file viewer might not be good enough, e.g. if you need to identify lines with identical message IDs. So what to do about it?
Python’s generator expressions could be a tool worth adding to your belt. In a nutshell a generator expression is a way to iterate over lazily-evaluated data and to apply map-reduce operations. Let’s just have a look at an example:
def f(line):
return line.replace('foo', 'bar')
def g(line):
return line[::-1] # reverses the line
with open('some logfile') as logfile:
lines = (line for line in logfile) #(1)
reduced_lines = (line for line in lines if 'foo' in line) #(2)
mapped_lines = (f(line) for line in reduced_lines) #(3)
lines = (g(line) for line in mapped_lines if 'non' in line) #(4)
for line in lines: #(5)
print(line)
1 | The basic structure is like a slightly rearranged and compacted for-loop:(<iterator-based return value> for <iterator> in <range>) |
2 | To reduce simply add if <condition> after the for-loop |
3 | Mapping is done by modifying the <iterator-based return value> , e.g. by calling a function on it |
4 | Map and reduce combined |
5 | You can iterate once over a generator expression |
Generator expressions can be chained, i.e. a generator expression can be fed with elements of another generator expression. Thanks to lazy evaluation only the currently used elements must be loaded to memory. That’s why this approach works reasonably fast for large files.
When applying generator expressions on log files you can extract specific information and — for instance — generate a diagram to visualize matching lines with identical message IDs. Two caveats I’d like to point out: You can consume a generator expression once and only once. And you need to pick up functional programming.
To learn more about generator expressions check out David Beazley’s excellent Generator Tricks. And instead of reinventing the functional wheel check out the documentation.