Data Compression with Brotli

Brotli + Python

You may have heard of Brotli and wondered what exactly it is. "Brotli is a generic-purpose lossless compression algorithm ..." The sentence goes on, of course, but it gets fairly technical after that. The important point is that you can use Brotli to compress data.

Brotli is now supported by all major browsers and by the two most widely used web servers, Nginx and Apache (see Wikipedia). It is also open source and available under the MIT License, so you can look at the entire implementation in Google's public GitHub repository.

I created a GitHub repository with a complete example of compressing and decompressing data with Brotli in Python.

GitHub - Niecke/brotli_example

Compressing with Brotli

I would like to show the basic procedure of compressing with Brotli using a bit of Python code.

import brotli

test = "Hello world!"
test_compressed = brotli.compress(test.encode())

In fact, you don't need more than these 3 lines. Brotli's compress() function expects a byte string as its first argument. Therefore, we call encode() on our string and pass the result to compress(). We get back a compressed byte string; depending on the length of the input, this might take some time.

This byte string can then be written to a file or sent over the network. For example, I use Brotli to compress data that I send to Apache Kafka. Since Kafka is accessed over the internet in my case, I save bandwidth and also storage space inside Kafka.
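
As a minimal sketch of the file case, the compressed bytes can be written in binary mode and read back later. The file name payload.br and the repeated sample text are assumptions made only for this illustration; they do not come from the repository mentioned above.

```python
import os
import tempfile

import brotli

# Sample data; repeated text is used only so that the size difference is visible.
payload = ("Hello world! " * 100).encode("utf-8")
compressed = brotli.compress(payload)

# Hypothetical file name inside a temporary directory, chosen for this example.
path = os.path.join(tempfile.mkdtemp(), "payload.br")

# Compressed data is binary, so the file must be opened in "wb" / "rb" mode.
with open(path, "wb") as f:
    f.write(compressed)

with open(path, "rb") as f:
    restored = brotli.decompress(f.read())

print(len(payload), len(compressed))
```

The same bytes object could just as well be handed to a network client instead of a file; only the binary handling matters here.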

Decompressing with Brotli

Decompressing data with Brotli is just as easy as compressing it. Assuming we still have the compressed byte string from the first example, the following code is all we need.

test_decompressed = brotli.decompress(test_compressed)

Note that test_decompressed now contains a byte string, so you have to convert it to a string with the decode() function if necessary. That's all, and we get back the original data.
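
Put together, the full round trip including the decode() step might look like the following sketch. The variable names are illustrative, and UTF-8 is spelled out explicitly here although it is also Python's default for encode() and decode().

```python
import brotli

original = "Hello world!"

# str -> bytes -> compressed bytes
compressed = brotli.compress(original.encode("utf-8"))

# compressed bytes -> bytes -> str
decompressed = brotli.decompress(compressed)
text = decompressed.decode("utf-8")

print(type(decompressed).__name__, text == original)
```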

Additional Notes

What this short example does not show is that you can pass Brotli some additional parameters. For example, Brotli has 12 quality levels, numbered 0 to 11 (the quality parameter in the Python bindings), with level 0 being the fastest at compressing but having the lowest compression ratio.
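
A rough sketch of choosing a compression level, assuming the quality keyword argument of brotli.compress() in the Python bindings, which accepts values from 0 (fastest) to 11 (densest). The sample text is made up for this illustration, and the actual sizes depend on the input.

```python
import brotli

# Made-up, repetitive sample data so that compression has something to work with.
data = ("Brotli compresses repetitive text well. " * 200).encode("utf-8")

# quality=0 favours speed, quality=11 favours compression ratio.
fast = brotli.compress(data, quality=0)
best = brotli.compress(data, quality=11)

# Both results decompress to the same original bytes.
print(len(data), len(fast), len(best))
```

Which level is appropriate depends on whether CPU time or output size matters more in your setup, so it is worth measuring with your own data.
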
In addition, Brotli's functions can raise an exception of type brotli.error, which should always be caught and handled. Unfortunately, this is a very general error; the documentation describes it only as "If arguments are invalid, or compressor fails."
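
One way to handle this exception is sketched below. The helper safe_decompress() is hypothetical and not part of the brotli package, and returning None on failure is just one possible design. A deliberately truncated stream is used to trigger the error.

```python
import brotli


def safe_decompress(blob):
    """Hypothetical helper: return the decompressed bytes, or None on failure."""
    try:
        return brotli.decompress(blob)
    except brotli.error as exc:
        # The error is very general, so we can only report the message as-is.
        print(f"Brotli decompression failed: {exc}")
        return None


valid = brotli.compress(b"Hello world!")

# Cutting the stream in half leaves it incomplete, so decompression fails.
truncated = valid[: len(valid) // 2]

print(safe_decompress(valid))
print(safe_decompress(truncated))
```

In a real application you would probably log the failure or re-raise a more specific exception instead of returning None.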

As usual with compression algorithms, many different benchmarks exist, and you should check carefully beforehand whether compression makes sense for your use case.

Benchmarks