Go Library for Parallel Compression and Decompression

Go API for programmatically generating and reading standard GZIP files. Compress large files by splitting them into blocks and compressing or decompressing those blocks in parallel.

pgzip is an open source library that provides parallel compression and decompression for the Go language. It is most useful for large amounts of data, which it divides into blocks and compresses or decompresses in parallel. The pgzip library is popular among the developer community and lets Go apps read compressed files directly with just a few lines of code.

The library is stable and allows developers to programmatically generate and read standard GZIP files. To get the best out of the library, it is recommended to compress or decompress a large amount of data (more than 2MB at a time). The library supports several important features, such as compressing files, decompressing files, and opening and reading GZIP files.


Getting Started with pgzip

The recommended way to install pgzip is from GitHub. Use the following command for a smooth installation.

Install pgzip via command

go get github.com/klauspost/pgzip/...

Compress Large Files via Go API

The open source pgzip library can compress large amounts of data with a couple of lines of Go code. The API splits the input into blocks (by default the block size is 1MB) and compresses them in parallel, up to the number of CPU threads. You can control both the block size and the number of blocks processed in parallel to suit your needs. For meaningful performance gains, compress more than 1 megabyte of data at a time.

Decompressing Files via Go API

The free pgzip library enables software developers to decompress files inside their own Go applications. As with compression, decompression can be tuned by customizing the block size. You can create your own reader and specify your own readahead by defining the block size and the maximum number of blocks to be decoded ahead.

Performance Improvement

pgzip can outperform gzip when you have large amounts of data. Because pgzip processes blocks in parallel, it has a speed advantage over compressors that handle the stream on a single thread. It is well suited to high-throughput, highly compressible material such as logs, JSON, and CSV data. Another advantage of pgzip during decompression is that it lets you do other work while the decompression is taking place.