Analysis and Comparison of Data Compression Algorithms
Keywords: Compression, Compression Ratio, MATLAB, Compression Speed, Decompression Speed, Canterbury Corpus, Silesia Corpus.
The amount of data shared over the internet is increasing exponentially. In this digital age, where even devices such as refrigerators are connected, data needs to be stored in compressed form. Compressed data must be retrievable without loss of information; otherwise the data is deemed corrupt. As we approach 5G communication, data needs to be transferred over the internet at higher rates. This cannot be achieved with older compression algorithms, which have lower compression ratios and even lower compression and decompression speeds. In this paper, modern compression algorithms are analyzed alongside some older ones, and their implementations are compared. The comparison was done with the help of graphs plotted using MATLAB software. The compression algorithms compared were Deflate, bzip2, Zstandard, Brotli, LZ4 and LZO. The files used for compression were taken from the Canterbury and Silesia corpora.
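The three metrics named in the keywords (compression ratio, compression speed, decompression speed) can be illustrated with a minimal Python sketch. It is not the paper's MATLAB workflow. It assumes the ratio is defined as original size divided by compressed size, covers only the two compared algorithms that ship with the Python standard library (Deflate through `zlib`, which adds a small zlib wrapper around the Deflate stream, and bzip2 through `bz2`), and uses a hypothetical in-memory sample rather than corpus files. The `measure` helper is an illustrative name.

```python
import bz2
import time
import zlib


def measure(name, compress, decompress, data):
    """Return ratio and timings for one codec on one in-memory buffer."""
    start = time.perf_counter()
    packed = compress(data)
    compress_seconds = time.perf_counter() - start

    start = time.perf_counter()
    restored = decompress(packed)
    decompress_seconds = time.perf_counter() - start

    # Lossless check: the round trip must reproduce the input exactly.
    if restored != data:
        raise ValueError(f"{name}: round trip did not reproduce the input")

    return {
        "codec": name,
        # Assumed definition: original size / compressed size.
        "ratio": len(data) / len(packed),
        "compress_seconds": compress_seconds,
        "decompress_seconds": decompress_seconds,
    }


# Hypothetical sample input; a real run would read corpus files instead.
sample = b"the quick brown fox jumps over the lazy dog\n" * 2000

results = [
    measure("deflate (zlib)", zlib.compress, zlib.decompress, sample),
    measure("bzip2", bz2.compress, bz2.decompress, sample),
]

for row in results:
    print(row)
```

Timings from a single run on a small buffer are noisy, so a fair comparison would repeat each measurement and use files of realistic size, such as those in the Canterbury and Silesia corpora.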