Abstract
Following the idea of using distributed semantic representations to facilitate the computation of semantic similarity between translation equivalents, we propose a novel framework for learning bilingual distributed phrase representations for machine translation. We first induce vector representations for words in the source and target languages respectively, each in its own semantic space. These word vectors are then used to create phrase representations via composition methods. In order to compute the semantic similarity of phrase pairs in a shared semantic space, we project phrase representations from the source-side semantic space onto the target-side semantic space via a neural network that can perform a nonlinear transformation between the two spaces. We integrate the learned bilingual distributed phrase representations into a hierarchical phrase-based translation system to validate the effectiveness of the proposed framework. Experimental results show that our method significantly improves translation quality and outperforms previous methods that use only word representations or a linear semantic-space transformation.
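The pipeline the abstract describes can be sketched in three steps: compose source-side word vectors into a phrase vector, project it into the target-side space with a small nonlinear network, and score phrase pairs by cosine similarity in that shared space. The sketch below uses additive composition, a one-hidden-layer tanh network, and randomly initialized toy vectors; all dimensions, weights, and vectors are illustrative assumptions, not the paper's actual model or trained parameters.

```python
import numpy as np

# Toy dimensions and random data; the paper's actual embedding sizes and
# trained weights are not given here (assumptions for illustration only).
rng = np.random.default_rng(0)
dim_src, dim_tgt, hidden = 4, 4, 8

# Source-side word vectors for a two-word phrase (assumed values).
word_vecs = rng.normal(size=(2, dim_src))

# Step 1: additive composition of word vectors into a phrase vector,
# one simple instance of the "composition methods" mentioned above.
phrase_src = word_vecs.sum(axis=0)

# Step 2: nonlinear projection from the source space onto the target space
# via a one-hidden-layer network with a tanh activation, standing in for
# the neural transformation described in the abstract.
W1, b1 = rng.normal(size=(hidden, dim_src)), rng.normal(size=hidden)
W2, b2 = rng.normal(size=(dim_tgt, hidden)), rng.normal(size=dim_tgt)
projected = W2 @ np.tanh(W1 @ phrase_src + b1) + b2

# Step 3: cosine similarity against a target-side phrase vector (assumed),
# now computable because both vectors live in the same semantic space.
phrase_tgt = rng.normal(size=dim_tgt)
cos_sim = float(projected @ phrase_tgt
                / (np.linalg.norm(projected) * np.linalg.norm(phrase_tgt)))
print(projected.shape, -1.0 <= cos_sim <= 1.0)
```

In the actual framework this similarity score would be added as a feature to the hierarchical phrase-based decoder, with the projection network trained on bilingual phrase pairs rather than random weights.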
| Original language | English |
|---|---|
| Title of host publication | Proceedings of MT Summit XV, vol.1: MT Researchers' Track |
| Pages | 32-43 |
| Publication status | Published - 30 Oct 2015 |
| Event | MT Summit XV, vol.1: MT Researchers' Track - Miami, United States (30 Oct 2015 → 3 Nov 2015) |
Conference
| Conference | MT Summit XV, vol.1: MT Researchers' Track |
|---|---|
| Place | United States |
| City | Miami |
| Period | 30/10/15 → 3/11/15 |