Adaptive Label Smoothing for Classifier-based Mutual Information Neural Estimation

Xu Wang, Ali Al-Bashabsheh, Chao Zhao, Chung Chan

Research output: Chapters, Conference Papers, Creative and Literary Works > RGC 32 - Refereed conference paper (with host publication) > peer-review

6 Citations (Scopus)

Abstract

Estimating the mutual information (MI) by neural networks has achieved significant practical success, especially in representation learning. Recent results further reduced the variance of the neural estimate by training a probabilistic classifier. However, the trained classifier tends to be overly confident about some of its predictions, which results in an overestimated MI that fails to capture the desired representation. To soften the classifier, we propose a novel scheme that smooths the labels adaptively according to how extreme the probability estimates are. The resulting MI estimate is unbiased under a mild assumption on the model. Experimental results on the MNIST and CIFAR10 datasets confirm that our method yields better representations and achieves higher classification test accuracy than existing approaches in self-supervised representation learning.
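
The abstract does not spell out the estimator, so the following is only a minimal sketch of the generic classifier-based (density-ratio) MI estimate, not the adaptive label smoothing scheme proposed in the paper. It assumes a binary classifier trained with equal class priors to tell samples of the joint distribution from samples of the product of marginals; under that assumption the classifier's log-odds on a joint sample approximate the log density ratio, and their average approximates the MI in nats. The function name `mi_from_classifier_probs` and the clipping constant `eps` are hypothetical choices made for illustration.

```python
import numpy as np


def mi_from_classifier_probs(p_joint, eps=1e-7):
    """Density-ratio MI estimate (in nats) from classifier outputs.

    p_joint: predicted probabilities that each sample came from the joint
             distribution, evaluated on samples drawn from the joint.
    eps:     hypothetical clipping constant that keeps the log-odds finite.
    Assumes equal class priors (joint vs. product of marginals).
    """
    p = np.clip(np.asarray(p_joint, dtype=float), eps, 1.0 - eps)
    # Average log-odds: log(p / (1 - p)), computed stably with log1p.
    return float(np.mean(np.log(p) - np.log1p(-p)))


# Illustrative inputs only (not data from the paper).
calibrated = np.array([0.6, 0.7, 0.8])
overconfident = np.array([0.6, 0.7, 0.999999])

print(mi_from_classifier_probs(calibrated))
print(mi_from_classifier_probs(overconfident))
```

In this sketch a single near-certain prediction dominates the average, which illustrates the kind of overestimation the abstract attributes to overconfident classifiers.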
Original language: English
Title of host publication: 2021 IEEE International Symposium on Information Theory
Subtitle of host publication: Proceedings
Publisher: IEEE
Pages: 1035-1040
ISBN (Electronic): 978-1-5386-8209-8
ISBN (Print): 978-1-5386-8210-4
Publication status: Published - 2021
Event: 2021 IEEE International Symposium on Information Theory (ISIT 2021) - Virtual, Melbourne, Australia
Duration: 12 Jul 2021 to 20 Jul 2021
https://2021.ieee-isit.org/TechnicalProgram.asp
https://2021.ieee-isit.org/default.asp

Publication series

Name: IEEE International Symposium on Information Theory - Proceedings
ISSN (Print): 2157-8095

Conference

Conference: 2021 IEEE International Symposium on Information Theory (ISIT 2021)
Country/Territory: Australia
City: Melbourne
Period: 12/07/21 to 20/07/21