Cost weighting for neural machine translation domain adaptation

Type: Article
Proceedings title: Proceedings of the First Workshop on Neural Machine Translation
Conference: The First Workshop on Neural Machine Translation, August 4, 2017, Vancouver, BC, Canada
Pages: 40-46
Abstract: In this paper, we propose a new domain adaptation technique for neural machine translation called cost weighting, which is appropriate for adaptation scenarios in which a small in-domain data set and a large general-domain data set are available. Cost weighting incorporates a domain classifier into the neural machine translation training algorithm, using features derived from the encoder representation in order to distinguish in-domain from out-of-domain data. Classifier probabilities are used to weight sentences according to their domain similarity when updating the parameters of the neural translation model. We compare cost weighting to two traditional domain adaptation techniques developed for statistical machine translation: data selection and sub-corpus weighting. Experiments on two large-data tasks show that both the traditional techniques and our novel proposal lead to significant gains, with cost weighting outperforming the traditional methods.
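The sentence-weighting idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the use of per-token probabilities as input, and in particular the weighting form 1 + p (where p is the classifier's in-domain probability for a sentence) are assumptions chosen for illustration. The abstract states only that classifier probabilities weight sentences when the translation model's parameters are updated.

```python
import math


def sentence_nll(token_probs):
    """Negative log-likelihood of one target sentence.

    token_probs: model probabilities assigned to each reference token
    (hypothetical input; a real system would get these from the decoder).
    """
    return -sum(math.log(p) for p in token_probs)


def cost_weighted_loss(batch_token_probs, in_domain_probs):
    """Mean batch cost with each sentence's NLL scaled by a domain weight.

    Assumption: weight = 1 + p_in_domain, so a sentence judged fully
    out-of-domain still contributes with weight 1, and a sentence judged
    fully in-domain contributes with weight 2.
    """
    if len(batch_token_probs) != len(in_domain_probs):
        raise ValueError("need one in-domain probability per sentence")
    if not batch_token_probs:
        raise ValueError("batch must not be empty")
    weights = [1.0 + p for p in in_domain_probs]
    losses = [sentence_nll(tp) for tp in batch_token_probs]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Two sentences with identical model probabilities; the classifier
# (hypothetical outputs) rates the first as in-domain, the second as not.
batch = [[0.5, 0.5], [0.5, 0.5]]
weighted = cost_weighted_loss(batch, [1.0, 0.0])
unweighted = cost_weighted_loss(batch, [0.0, 0.0])
```

In this sketch the in-domain sentence contributes twice as much to the cost as the out-of-domain one, so gradient updates computed from the weighted cost would lean toward the in-domain data without discarding the general-domain sentences.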
Publisher: Association for Computational Linguistics
Language: English
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: Yes
NPARC number: 23002215
Record identifier: 328f63b3-c8d0-4c4a-bd21-47ef78e5e696
Record created: 2017-09-06
Record modified: 2017-09-06