Accelerating General-Purpose Lossless Compression via Simple and Scalable Parameterization

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review


Detail(s)

Original language: English
Title of host publication: MM '22 - Proceedings of the 30th ACM International Conference on Multimedia
Place of publication: New York
Publisher: Association for Computing Machinery
Pages: 3205–3213
ISBN (print): 978-1-4503-9203-7
Publication status: Published - 2022

Conference

Title: 30th ACM International Conference on Multimedia (MM 2022)
Location: Lisbon, Portugal
Period: 10-14 October 2022

Abstract

The storage of multi-media data can benefit from advancements in general-purpose lossless compression. The explosive growth of multi-media data volume in data centers demands both a higher compression ratio and faster compressor run-time speed. However, recent deep-learning-based compressors with a high compression ratio usually build complicated dependencies on history symbols, leading to long compression times. This paper investigates the behavior of history symbols and finds an approximate order of importance: recent symbols have a substantially larger influence on the probability estimation of the next unknown symbol. This observation guides the design of an interpretable structure for data compression, rather than learning dependencies implicitly from data as Recurrent Neural Networks (RNNs) and attention do. Based on this observation, we disentangle the compression model into order learning and feature learning, which were fused into a single large module in previous works. A parameterized ordered mask unit is established to learn the ordered importance of history symbols, and a fast Multi-Layer Perceptron (MLP) network is designed for efficient feature learning. The proposed compressor improves both compression performance and computational efficiency compared with transformer-based and RNN-based compressors. To further enhance computational efficiency, we propose a branch-MLP block to replace the original MLP layer; it halves the parameters and FLOPs of the original MLP without sacrificing compression performance. Experiments on multi-media data demonstrate that our model improves the compression ratio by 10% on average across data domains while doubling compression speed compared with the state of the art. The source code and appendix are released at https://github.com/mynotwo/compressor_via_simple_and_scalable_parameterization.git.
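The abstract's core mechanism can be illustrated with a short sketch. The code below is a minimal, hypothetical reading of the "parameterized ordered mask" idea, not the authors' implementation (see the linked repository for that): a learnable, monotonically non-decreasing weight over history positions encodes the prior that recent symbols matter more, and a small MLP maps the masked history to a probability distribution over the next byte, which would then drive an entropy coder. All names (OrderedMaskUnit, MLPPredictor, hist_len) and sizes are illustrative assumptions.

```python
# Illustrative sketch only; the paper's real model is in the linked repository.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OrderedMaskUnit(nn.Module):
    """Learnable importance weights over history positions.

    A softplus keeps per-position increments non-negative, so their cumulative
    sum is monotonically non-decreasing toward the most recent symbol,
    encoding the "recent symbols matter more" observation from the abstract.
    """
    def __init__(self, hist_len: int):
        super().__init__()
        self.raw = nn.Parameter(torch.zeros(hist_len))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, hist_len, dim); oldest symbol first, newest last.
        inc = F.softplus(self.raw)            # non-negative increments
        mask = torch.cumsum(inc, dim=0)       # non-decreasing weights
        mask = mask / (mask[-1] + 1e-8)       # normalize into (0, 1]
        return h * mask.view(1, -1, 1)        # scale each history position

class MLPPredictor(nn.Module):
    """Small MLP mapping masked history features to next-symbol logits."""
    def __init__(self, hist_len: int, dim: int, hidden: int, vocab: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.mask = OrderedMaskUnit(hist_len)
        self.net = nn.Sequential(
            nn.Linear(hist_len * dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, vocab),
        )

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (batch, hist_len) byte values in [0, 255]
        h = self.mask(self.embed(history))
        return self.net(h.flatten(1))         # logits over the next byte

# Usage: the predicted distribution would feed an arithmetic coder.
model = MLPPredictor(hist_len=32, dim=16, hidden=256)
history = torch.randint(0, 256, (4, 32))
probs = F.softmax(model(history), dim=-1)     # (4, 256) next-byte probabilities
```

Because the mask and the MLP are separate modules, the order prior stays interpretable (the learned mask can be inspected directly) while feature learning remains a cheap feed-forward pass, which is one plausible reading of how the design avoids the cost of RNN or attention dependencies.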

Research Area(s)

  • multi-layer perceptron, general-purpose compressor, lossless data compression, neural networks, ordered importance, computational efficiency

Citation Format(s)

Accelerating General-Purpose Lossless Compression via Simple and Scalable Parameterization. / Mao, Yu; Cui, Yufei; Kuo, Tei-Wei et al.
MM '22 - Proceedings of the 30th ACM International Conference on Multimedia. New York: Association for Computing Machinery, 2022. p. 3205–3213.
