Capturing reliable fine-grained sentiment associations by crowdsourcing and best–worst scaling

DOI: http://doi.org/10.18653/v1/N16-1095
Type: Article
Proceedings title: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Conference: 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, June 2016, San Diego, California, USA
Pages: 811–817
Abstract: Access to word-sentiment associations is useful for many applications, including sentiment analysis, stance detection, and linguistic analysis. However, manually assigning fine-grained sentiment association scores to words has many challenges with respect to keeping annotations consistent. We apply the annotation technique of Best-Worst Scaling to obtain real-valued sentiment association scores for words and phrases in four different domains: English Twitter, Arabic Twitter, English sentiment modifiers, and English opposing polarity phrases. We show that on all four domains the ranking of words by sentiment remains remarkably consistent even when the annotation process is repeated with a different set of annotators. We use these fine-grained word-sentiment associations in three ways. First, we analyze human perception of sentiment and calculate the minimal difference in sentiment that is detectable by native speakers. This least perceptible difference helps in our second objective: studying sentiment composition in phrases that include common sentiment modifiers (such as negators, modals, and degree adverbs) and in phrases that include words of opposing polarities. Changes in sentiment incurred in phrases are considered significant only if they exceed the least perceptible difference. Finally, as part of a SemEval-2016 shared task, we use the manually determined sentiment associations to evaluate automatically generated sentiment lexicons.
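The abstract does not spell out how Best-Worst Scaling annotations become real-valued scores. A minimal sketch follows, assuming the commonly used counting procedure: each item's score is the proportion of times it was chosen as best minus the proportion of times it was chosen as worst, giving a value between -1 and 1. The function name bws_scores, the tuple layout, and the example words and judgments are hypothetical illustrations, not data or code from the paper.

```python
from collections import Counter


def bws_scores(annotations):
    """Score items from Best-Worst Scaling judgments by simple counting.

    annotations: iterable of (items, best, worst) tuples, where `items` is
    the tuple of terms shown to an annotator and `best` / `worst` are the
    terms picked as most and least positive. (Hypothetical layout.)
    """
    appearances = Counter()
    best_counts = Counter()
    worst_counts = Counter()
    for items, best, worst in annotations:
        appearances.update(items)   # each shown item counts once per judgment
        best_counts[best] += 1
        worst_counts[worst] += 1
    # score = fraction of times chosen best - fraction of times chosen worst
    return {
        item: (best_counts[item] - worst_counts[item]) / appearances[item]
        for item in appearances
    }


# Made-up judgments for illustration only.
annotations = [
    (("great", "good", "okay", "awful"), "great", "awful"),
    (("great", "good", "bad", "awful"), "great", "awful"),
    (("good", "okay", "bad", "awful"), "good", "awful"),
    (("great", "okay", "bad", "good"), "great", "bad"),
]

scores = bws_scores(annotations)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Under this assumption, sorting items by score yields the kind of sentiment ranking whose consistency across annotator groups the abstract reports.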
Publisher: Association for Computational Linguistics
Language: English
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: Yes
NPARC number: 23001911
Record identifier: 18bb0659-18c8-434d-a4ff-b2279a2c76ec
Record created: 2017-05-24
Record modified: 2017-12-15