Using WeBiText to Search Multilingual Web Sites

Conference: The Ninth Conference of the Association for Machine Translation in the Americas (AMTA), October 21 - November 4, 2010, Denver, Colorado, USA
Subject: WeBiText; multilingual concordancer; Translation Memory; web parallel corpora
Abstract: In this paper, we describe WeBiText and how it is being used. WeBiText is a concordancer that allows translators to search large, high-quality multilingual web sites in order to find solutions to translation problems. After a quick overview of the system, we present results from an analysis of its logs, which provides a picture of how the tool is being used and how well it performs. We show that it is mostly used to find solutions for short, two- or three-word translation problems. The system produces at least one hit for 58% of the queries, and hits from at least five different web pages in 41% of cases. We show that 36% of the queries correspond to specialized language problems, which is much higher than what was previously reported for a similar concordancer based on the Canadian Hansard (TransSearch). We also provide a back-of-the-envelope calculation of the current economic impact of the tool, which we estimate at $1 million per year and growing rapidly.
Affiliation: NRC Institute for Information Technology; National Research Council Canada
Peer reviewed: Yes
NPARC number: 16959071
Record identifier: 0ef63443-5a4e-442f-8944-118cc9ddd266
Record created: 2011-03-03
Record modified: 2016-05-09