Unsupervised Modeling of Twitter Conversations

Conference: Human Language Technologies: The 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics, June 1-6, 2010, Los Angeles, California
Abstract: We propose the first unsupervised approach to the problem of modeling dialogue acts in an open domain. Trained on a corpus of noisy Twitter conversations, our method discovers dialogue acts by clustering raw utterances. Because it accounts for the sequential behaviour of these acts, the learned model can provide insight into the shape of communication in a new medium. We address the challenge of evaluating the emergent model with a qualitative visualization and an intrinsic conversation ordering task. This work is inspired by a corpus of 1.3 million Twitter conversations, which will be made publicly available. This huge amount of data, available only because Twitter blurs the line between chatting and publishing, highlights the need to be able to adapt quickly to a new medium.
Affiliation: NRC Institute for Information Technology; National Research Council Canada
Peer reviewed: Yes
NPARC number: 16885300
Record identifier: 041d8899-ab44-4d85-b754-53ea717d199f
Record created: 2011-02-22
Record modified: 2016-05-09