Computing lexical contrast

Journal title: Computational Linguistics
Pages: 556-589; # of pages: 34
Abstract: Knowing the degree of semantic contrast between words has widespread application in natural language processing, including machine translation, information retrieval, and dialogue systems. Manually created lexicons focus on opposites, such as hot and cold. Opposites are of many kinds, such as antipodals, complementaries, and gradables. Existing lexicons often do not classify opposites into the different kinds, however. They also do not explicitly list word pairs that are not opposites yet have some degree of contrast in meaning, such as warm and cold or tropical and freezing. We propose an automatic method to identify contrasting word pairs that is based on the hypothesis that if a pair of words, A and B, are contrasting, then there is a pair of opposites, C and D, such that A and C are strongly related and B and D are strongly related. (For example, there exists the pair of opposites hot and cold such that tropical is related to hot, and freezing is related to cold.) We call this the contrast hypothesis. We begin with a large crowdsourcing experiment to determine the amount of human agreement on the concept of oppositeness and its different kinds. In the process, we flesh out key features of different kinds of opposites. We then present an automatic and empirical measure of lexical contrast that relies on the contrast hypothesis, corpus statistics, and the structure of a Roget-like thesaurus. Using four different data sets, we evaluate our approach on two tasks: solving "most contrasting word" questions and distinguishing synonyms from opposites. The results are analyzed across four parts of speech and across five different kinds of opposites. We show that the proposed measure of lexical contrast obtains high precision and large coverage, outperforming existing methods.
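The contrast hypothesis stated in the abstract can be illustrated with a minimal sketch. This is not the authors' measure, which also draws on corpus statistics and thesaurus structure. The seed pair and relatedness lists are toy data built from the abstract's own examples, the names `SEED_OPPOSITES`, `RELATED`, `strongly_related`, and `is_contrasting` are hypothetical, and treating a word as strongly related to itself is an assumption made here for simplicity.

```python
# Toy illustration of the contrast hypothesis; all data and names are
# hypothetical and chosen only to mirror the examples in the abstract.
SEED_OPPOSITES = {("hot", "cold")}  # known opposite pairs (C, D)
RELATED = {                         # hand-made "strongly related" lists
    "hot": {"tropical", "warm"},
    "cold": {"freezing"},
}


def strongly_related(x, y):
    """Return True if x and y are identical (an assumption) or listed
    as related in either direction."""
    return x == y or y in RELATED.get(x, set()) or x in RELATED.get(y, set())


def is_contrasting(a, b):
    """A and B contrast if some opposite pair (C, D) exists such that
    A is related to C and B is related to D, in either order."""
    for c, d in SEED_OPPOSITES:
        if strongly_related(a, c) and strongly_related(b, d):
            return True
        if strongly_related(a, d) and strongly_related(b, c):
            return True
    return False


if __name__ == "__main__":
    print(is_contrasting("tropical", "freezing"))
    print(is_contrasting("warm", "cold"))
    print(is_contrasting("tropical", "warm"))
```

A binary check like this only shows the shape of the hypothesis; a graded measure would replace the boolean relatedness test with a numeric score.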
Publication date
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: Yes
NPARC number: 21270401
Record identifier: 7cfdb17a-8fca-4cdc-a63a-083d58385d6d
Record created: 2014-02-07
Record modified: 2016-05-09