Causally Denoise Word Embeddings Using Half-Sibling Regression

Zekun Yang, Tianlin Liu*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

Distributional representations of words, also known as word vectors, have become crucial for modern natural language processing tasks due to their wide applications. Recently, a growing body of word vector postprocessing algorithms has emerged, aiming to render off-the-shelf word vectors even stronger. In line with these investigations, we introduce a novel word vector postprocessing scheme under a causal inference framework. Concretely, the postprocessing pipeline is realized by Half-Sibling Regression (HSR), which allows us to identify and remove confounding noise contained in word vectors. Compared to previous work, our proposed method has the advantages of interpretability and transparency due to its causal inference grounding. Evaluated on a battery of standard lexical-level evaluation tasks and downstream sentiment analysis tasks, our method reaches state-of-the-art performance.
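
The abstract names Half-Sibling Regression without giving details, so the following is only a minimal sketch of the generic idea, not the authors' pipeline. It assumes a set of "target" measurements and a set of "sibling" measurements that share confounding noise but not signal; the noise in the targets is predicted from the siblings by ridge regression and subtracted. The names `ridge_fit` and `hsr_denoise`, the regularization strength `alpha`, the toy data, and the choice to treat rows as samples are all assumptions made for illustration.

```python
import numpy as np


def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression.

    Returns W minimizing ||Y - X W||_F^2 + alpha * ||W||_F^2.
    """
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(k), X.T @ Y)


def hsr_denoise(targets, siblings, alpha=1.0):
    """Hypothetical helper: subtract from `targets` the part that is
    predictable from `siblings` (the estimated shared noise)."""
    W = ridge_fit(siblings, targets, alpha)
    return targets - siblings @ W


# Toy data (an assumption of this sketch): a single shared noise
# direction contaminates both the targets and the siblings.
rng = np.random.default_rng(0)
n_samples, n_targets, n_siblings = 50, 8, 5

noise = rng.normal(size=(n_samples, 1))           # shared confounder
signal = rng.normal(size=(n_samples, n_targets))  # what we want to keep

targets = signal + noise @ (2.0 * np.ones((1, n_targets)))
siblings = noise @ np.ones((1, n_siblings)) \
    + 0.1 * rng.normal(size=(n_samples, n_siblings))

denoised = hsr_denoise(targets, siblings, alpha=0.1)

# Distance to the clean signal before and after denoising.
err_before = np.linalg.norm(targets - signal)
err_after = np.linalg.norm(denoised - signal)
print(f"error before: {err_before:.2f}, after: {err_after:.2f}")
```

In this toy setting the siblings carry almost no information about the signal, so regressing the targets on them mostly captures the shared noise. How the word vectors are split into targets and siblings is not stated in the abstract and would need to be taken from the paper itself.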
Original language: English
Title of host publication: The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
Place of Publication: California
Publisher: AAAI Press
Pages: 9426-9433
ISBN (Print): 9781577358350 (set)
DOIs
Publication status: Published - Feb 2020
Event: 34th AAAI Conference on Artificial Intelligence (AAAI-20) - New York, United States
Duration: 7 Feb 2020 to 12 Feb 2020
https://aaai.org/Conferences/AAAI-20/
https://aaai.org/ojs/index.php/AAAI/index

Publication series

Name: Proceedings of the AAAI Conference on Artificial Intelligence
Publisher: AAAI Press
Number: 5
Volume: 34
ISSN (Print): 2159-5399
ISSN (Electronic): 2374-3468

Conference

Conference: 34th AAAI Conference on Artificial Intelligence (AAAI-20)
Place: United States
City: New York
Period: 7/02/20 to 12/02/20
Internet address
