The unreasonable effectiveness of word representations for Twitter named entity recognition

DOI: http://doi.org/10.3115/v1/N15-1075
Type: Article
Proceedings title: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Conference: 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, May 31-June 5, 2015, Denver, Colorado, USA
ISBN: 978-1-941643-49-5
Article number: N15-1075
Pages: 735-745
Abstract: Named entity recognition (NER) systems trained on newswire perform very badly when tested on Twitter. Signals that were reliable in copy-edited text disappear almost entirely in Twitter’s informal chatter, requiring the construction of specialized models. Using well-understood techniques, we set out to improve Twitter NER performance when given a small set of annotated training tweets. To leverage unlabeled tweets, we build Brown clusters and word vectors, enabling generalizations across distributionally similar words. To leverage annotated newswire data, we employ an importance weighting scheme. Taken all together, we establish a new state of the art on two common test sets. Though it is well known that word representations are useful for NER, supporting experiments have thus far focused on newswire data. We emphasize the effectiveness of representations on Twitter NER, and demonstrate that their inclusion can improve performance by up to 20 F1.
Publisher: Association for Computational Linguistics
Language: English
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: Yes
NPARC number: 23000026
Record identifier: e5c6c417-b6ac-46e9-bbb5-9b51f3ed233b
Record created: 2016-05-30
Record modified: 2016-05-30