Transferring markup tags in statistical machine translation: a two-stream approach

Proceedings title: Proceedings of the 2nd Workshop on Post-editing Technology and Practice (WPTP-2)
Conference: MT Summit XIV Workshop on Post-editing Technology and Practice, September 2, 2013, Nice, France
Abstract: Translation agencies are introducing statistical machine translation (SMT) into the workflow of human translators. Typically, SMT produces a first-draft translation, which is then post-edited by a person. SMT has met much resistance from translators, partly because of professional conservatism, but partly because the SMT community has often neglected some practical aspects of translation. Our paper discusses one of these: transferring formatting tags such as bold or italic from the source to the target document with a low error rate, thus freeing the post-editor from having to reformat SMT-generated text. In our “two-stream” approach, tags are stripped from the input to the decoder, then reinserted into the resulting target-language text. Tag transfer has been tackled by other SMT teams, but only a few have published descriptions of their work. This paper contributes to understanding tag transfer by explaining our approach in detail.
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: Yes
NPARC number: 23000927
Record identifier: a7b782a4-ee8d-4e93-bc8a-ea188154fc35
Record created: 2016-11-16
Record modified: 2016-11-16