Sentiment analysis of short informal texts

Journal title: Journal of Artificial Intelligence Research
Pages: 723–762; # of pages: 40
Subject: Classification (of information); Semantics; Text processing; Ablation experiments; Automatically generated; Percentage points; Sentiment analysis; Sentiment features; Sentiment lexicons; State-of-the-art
Abstract: We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task). The system is based on a supervised statistical text classification approach leveraging a variety of surface-form, semantic, and sentiment features. The sentiment features are primarily derived from novel high-coverage tweet-specific sentiment lexicons. These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons. To adequately capture the sentiment of words in negated contexts, a separate sentiment lexicon is generated for negated words. The system ranked first in the SemEval-2013 shared task "Sentiment Analysis in Twitter" (Task 2), obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task. Post-competition improvements boost the performance to an F-score of 70.45 (message-level task) and 89.50 (term-level task). The system also obtains state-of-the-art performance on two additional datasets: the SemEval-2013 SMS test set and a corpus of movie review excerpts. The ablation experiments demonstrate that the use of the automatically generated lexicons results in performance gains of up to 6.5 absolute percentage points.
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: Yes
NPARC number: 21275945
Record identifier: f3c48029-99e0-48c7-9aaf-271e9715465b
Record created: 2015-08-12
Record modified: 2016-05-09