Stabilizing Minimum Error Rate Training

Proceedings title: Proceedings of the Fourth Workshop on Statistical Machine Translation
Conference: The 4th Workshop on Statistical Machine Translation (EACL 2009), Athens, Greece, March 30–31, 2009
Pages: 242–249; # of pages: 8
Abstract: The most commonly used method for training feature weights in statistical machine translation (SMT) systems is Och’s minimum error rate training (MERT) procedure. A well-known problem with Och’s procedure is that it tends to be sensitive to small changes in the system, particularly when the number of features is large. In this paper, we quantify the stability of Och’s procedure by supplying different random seeds to a core component of the procedure (Powell’s algorithm). We show that for systems with many features, there is extensive variation in outcomes, both on the development data and on the test data. We analyze the causes of this variation and propose modifications to the MERT procedure that improve stability while helping performance on test data.
Publication date:
Affiliation: National Research Council Canada (NRC-CNRC); NRC Institute for Information Technology
Peer reviewed: Yes
NRC number: 50757
NPARC number: 16335066
Record identifier: fffb669c-87f6-4a2c-8bba-e292e723abe8
Record created: 2010-11-10
Record modified: 2016-05-09
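The instability described in the abstract can be illustrated with a minimal, hypothetical Python sketch. This is not the paper's MERT implementation: a greedy line search stands in for Powell's algorithm, and a toy one-dimensional surface with two local minima stands in for the piecewise-constant error surface MERT optimizes. Different random seeds pick different starting points and so can converge to different minima, mirroring the seed-dependent variation the paper quantifies.

```python
import random

def error(w):
    """Toy error surface with two local minima: one near w = -1
    (value 0.5) and a deeper one near w = 2 (value 0)."""
    return min((w + 1) ** 2 + 0.5, (w - 2) ** 2)

def greedy_line_search(w, step=0.1, iters=200):
    """Crude local optimizer standing in for Powell's algorithm:
    move to the best neighbouring point; shrink the step when stuck."""
    for _ in range(iters):
        best = min((w - step, w, w + step), key=error)
        if best == w:
            step /= 2  # no neighbour improves: refine the search
            if step < 1e-6:
                break
        w = best
    return w

def mert_outcome(seed):
    """One MERT-style run: a seed-dependent random starting point
    followed by local optimization."""
    rng = random.Random(seed)
    w0 = rng.uniform(-3, 3)
    return error(greedy_line_search(w0))

# Runs that start in different basins converge to different optima,
# so the final error depends on the random seed.
outcomes = [round(mert_outcome(s), 3) for s in range(10)]
print(outcomes)  # each entry is either 0.5 (local minimum) or 0.0 (global)
```

In a real SMT system with many features the surface has vastly more local optima, which is why the paper observes extensive variation on both development and test data.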