TRACE: A Fast Transformer-based General-Purpose Lossless Compressor

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review


Author(s)

Mao, Yu; Cui, Yufei; Kuo, Tei-Wei et al.
Related Research Unit(s)

Detail(s)

Original language: English
Title of host publication: WWW '22
Subtitle of host publication: Proceedings of the ACM Web Conference 2022
Editors: Frédérique Laforest, Raphaël Troncy, Elena Simperl, Deepak Agarwal, Aristides Gionis, Ivan Herman, Lionel Médini
Publisher: Association for Computing Machinery
Pages: 1829-1838
ISBN (print): 978-1-4503-9096-5
Publication status: Published - Apr 2022

Publication series

Name: WWW - Proceedings of the ACM Web Conference

Conference

Title: 31st ACM World Wide Web Conference, WWW 2022
Place: France
City: Virtual, Online
Period: 25 - 29 April 2022

Abstract

Deep-learning-based compressors have received interest recently due to their much-improved compression ratios. However, modern approaches suffer from long execution times. To ease this problem, this paper targets cutting down the execution time of deep-learning-based compressors. Building history dependencies sequentially (e.g., with recurrent neural networks) is responsible for the long inference latency. Instead, we introduce the transformer into deep-learning compressors to build history dependencies in parallel. However, the existing transformer is too computationally heavy and is incompatible with compression tasks.
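To make the sequential-versus-parallel contrast concrete, the following is a minimal sketch, assuming PyTorch and not taken from the paper's code, of why causally masked attention can build history dependencies for a whole window in one pass while an RNN must advance one step at a time:

    import torch
    import torch.nn as nn

    seq = torch.randint(0, 256, (1, 512))      # a window of 512 input bytes
    emb = nn.Embedding(256, 64)                # byte embeddings
    x = emb(seq)                               # (1, 512, 64)

    # Sequential (RNN-style): each step consumes the previous hidden state,
    # so inference latency grows linearly with the window length.
    rnn = nn.GRU(64, 64, batch_first=True)
    h_seq, _ = rnn(x)

    # Parallel (transformer-style): a causal mask lets every position attend
    # to its own history, and all 512 positions are computed together.
    attn = nn.MultiheadAttention(64, num_heads=4, batch_first=True)
    mask = nn.Transformer.generate_square_subsequent_mask(512)
    h_par, _ = attn(x, x, x, attn_mask=mask)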

This paper proposes a fast general-purpose lossless compressor, TRACE, by designing a compression-friendly structure based on a single-layer transformer. We first design a new metric to guide the selection of compression model structures. Byte-grouping and shared-FFN schemes are further proposed to fully utilize the capacity of the single-layer transformer. These features allow TRACE to achieve a competitive compression ratio at a much faster speed. In addition, we further accelerate the compression procedure by designing a controller to reduce the parameter-updating overhead. Experiments show that TRACE achieves an overall 3x speedup while keeping a compression ratio comparable to that of state-of-the-art compressors. The source code for TRACE and links to the datasets are available at https://github.com/mynotwo/A-Fast-Transformer-based-General-Purpose-LosslessCompressor.
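As a rough illustration of the ideas named in the abstract, here is a hedged sketch in PyTorch of a single-layer transformer backbone with byte-grouping and one FFN head shared across the byte slots of each group. The class name, group size, and layer widths are illustrative assumptions and may differ from the released TRACE code:

    import torch
    import torch.nn as nn

    GROUP = 4            # consecutive bytes folded into one token (illustrative)
    DIM = 256            # model width (illustrative)

    class GroupedByteBackbone(nn.Module):
        def __init__(self):
            super().__init__()
            # Byte-grouping: each byte gets a small embedding, and GROUP of
            # them are concatenated into one DIM-wide token, shortening the
            # sequence the attention layer processes by a factor of GROUP.
            self.byte_emb = nn.Embedding(256, DIM // GROUP)
            self.layer = nn.TransformerEncoderLayer(
                d_model=DIM, nhead=4, dim_feedforward=4 * DIM,
                batch_first=True)
            # Per-slot projections feed ONE shared FFN head, rather than
            # GROUP independent heads, to reuse the layer's capacity.
            self.slot_proj = nn.ModuleList(
                [nn.Linear(DIM, DIM) for _ in range(GROUP)])
            self.shared_ffn = nn.Sequential(
                nn.Linear(DIM, DIM), nn.GELU(), nn.Linear(DIM, 256))

        def forward(self, raw_bytes):                 # (B, T*GROUP) byte ids
            b, n = raw_bytes.shape
            x = self.byte_emb(raw_bytes).reshape(b, n // GROUP, DIM)
            mask = nn.Transformer.generate_square_subsequent_mask(n // GROUP)
            h = self.layer(x, src_mask=mask)          # causal, single pass
            # One 256-way next-byte distribution per slot per token.
            return torch.stack(
                [self.shared_ffn(p(h)) for p in self.slot_proj], dim=2)

    probs = GroupedByteBackbone()(torch.randint(0, 256, (1, 64))).softmax(-1)

In a full compressor, each 256-way distribution would drive an entropy coder such as an arithmetic coder; the paper's controller for reducing parameter-update overhead is omitted from this sketch.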

Research Area(s)

  • byte stream, computationally efficient model, general-purpose compressor, lossless data compression, neural networks, transformer

Citation Format(s)

TRACE: A Fast Transformer-based General-Purpose Lossless Compressor. / Mao, Yu; Cui, Yufei; Kuo, Tei-Wei et al.
WWW '22: Proceedings of the ACM Web Conference 2022. ed. / Frédérique Laforest; Raphaël Troncy; Elena Simperl; Deepak Agarwal; Aristides Gionis; Ivan Herman; Lionel Médini. Association for Computing Machinery, 2022. p. 1829-1838 (WWW - Proceedings of the ACM Web Conference).