Learning Bilingual Distributed Phrase Representations for Statistical Machine Translation

Chaochao Wang, Deyi Xiong, Min Zhang, Chun Yu KIT

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-reviewed

Abstract

Following the idea of using distributed semantic representations to facilitate the computation of semantic similarity between translation equivalents, we propose a novel framework to learn bilingual distributed phrase representations for machine translation. We first induce vector representations for words in the source and target language respectively, each in its own semantic space. These word vectors are then used to create phrase representations via composition methods. In order to compute the semantic similarity of phrase pairs in the same semantic space, we project phrase representations from the source-side semantic space onto the target-side semantic space via a neural network that is able to conduct a nonlinear transformation between the two spaces. We integrate the learned bilingual distributed phrase representations into a hierarchical phrase-based translation system to validate the effectiveness of our proposed framework. Experiment results show that our method is able to significantly improve translation quality and outperform previous methods that use only word representations or a linear semantic space transformation.
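The pipeline described in the abstract can be sketched in three steps: compose word vectors into a phrase vector, project the source-side phrase vector into the target-side space through a nonlinear neural network, and score the phrase pair by similarity in that shared space. The sketch below is a minimal illustration of this idea, not the paper's implementation: the vectors, dimensions, additive composition, single tanh hidden layer, and random weights are all illustrative stand-ins (in the paper the projection would be trained on aligned phrase pairs).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy word vectors in separate source/target semantic spaces
# (dimensions and values are illustrative, not the paper's).
src_words = {"ji": rng.normal(size=4), "qi": rng.normal(size=4)}
tgt_words = {"machine": rng.normal(size=5), "translation": rng.normal(size=5)}

def compose(vectors):
    """Additive composition: phrase vector = sum of its word vectors."""
    return np.sum(vectors, axis=0)

# One-hidden-layer nonlinear projection from the source space (dim 4)
# to the target space (dim 5). Weights are random placeholders for
# parameters that would be learned from bilingual phrase pairs.
W1 = rng.normal(size=(6, 4))
b1 = rng.normal(size=6)
W2 = rng.normal(size=(5, 6))
b2 = rng.normal(size=5)

def project(src_vec):
    hidden = np.tanh(W1 @ src_vec + b1)  # nonlinear transformation
    return W2 @ hidden + b2              # map into target-side space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

src_phrase = compose(list(src_words.values()))
tgt_phrase = compose(list(tgt_words.values()))

# Semantic similarity of the phrase pair, computed in the target space;
# such a score could serve as an extra feature in a phrase-based decoder.
sim = cosine(project(src_phrase), tgt_phrase)
print(round(sim, 3))
```

With trained weights, the score would be high for genuine translation equivalents and low otherwise, which is what makes it usable as a translation feature.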
Original language: English
Title of host publication: Proceedings of MT Summit XV, vol.1: MT Researchers' Track
Pages: 32-43
Publication status: Published - 30 Oct 2015
Event: MT Summit XV, vol.1: MT Researchers' Track - Miami, United States
Duration: 30 Oct 2015 – 3 Nov 2015

Conference

Conference: MT Summit XV, vol.1: MT Researchers' Track
Place: United States
City: Miami
Period: 30/10/15 – 3/11/15
