Keyphrase Extraction: Enhancing Lists

Type: Article
Conference: Proceedings of Computational Linguistics in the North-East (CLINE'2004), August 30, 2004, Montréal, Québec, Canada
Subject: keyphrase extraction; clustering; semantic similarity; corpus linguistics; keyphrase evaluation
Abstract: This paper proposes modest improvements to Extractor, a state-of-the-art keyphrase extraction system, using a terabyte-sized corpus to estimate the informativeness and semantic similarity of keyphrases. We present two techniques for improving the organization of keyphrase lists and removing outliers from them. The first orders keyphrases by their number of occurrences in the corpus; the second clusters them by semantic similarity. We also discuss evaluation issues and present a novel technique for comparing extracted keyphrases to a gold standard, one that relies on semantic similarity rather than on string matching or human judges.
Publication date
Language: English
Affiliation: NRC Institute for Information Technology; National Research Council Canada
Peer reviewed: No
NRC number: 48079
NPARC number: 5765134
Record identifier: bbdcb1d3-d36b-4f4f-9f56-2a613f0f4310
Record created: 2009-03-29
Record modified: 2016-05-09
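
The abstract names three ideas: ordering keyphrases by corpus occurrences, clustering them by semantic similarity, and scoring extracted keyphrases against a gold standard by similarity instead of string matching. The following is a minimal sketch of how such steps could fit together, not the paper's implementation. Everything in it is an assumption made for illustration: the counts are invented, pointwise mutual information (PMI) stands in for the unspecified similarity measure, rarest-first is an assumed ordering direction, and the single-link clustering, the threshold, and the scoring rule are hypothetical choices.

```python
import math

# Hypothetical corpus statistics (invented numbers, not from the paper).
TOTAL_DOCS = 1_000_000
DOC_COUNTS = {
    "keyphrase extraction": 120,
    "semantic similarity": 900,
    "clustering": 15_000,
    "weather": 60_000,
}
# Hypothetical counts of documents that contain both phrases of a pair.
PAIR_COUNTS = {
    frozenset(["keyphrase extraction", "semantic similarity"]): 40,
    frozenset(["keyphrase extraction", "clustering"]): 30,
    frozenset(["semantic similarity", "clustering"]): 300,
}


def order_by_corpus_count(phrases, rarest_first=True):
    """Sort phrases by corpus count (assumption: rarer means more informative)."""
    return sorted(phrases, key=lambda p: DOC_COUNTS.get(p, 0),
                  reverse=not rarest_first)


def pmi(a, b):
    """PMI from document counts; a stand-in for the paper's similarity measure."""
    count_a, count_b = DOC_COUNTS.get(a, 0), DOC_COUNTS.get(b, 0)
    joint = count_a if a == b else PAIR_COUNTS.get(frozenset([a, b]), 0)
    if count_a == 0 or count_b == 0 or joint == 0:
        return float("-inf")  # never seen together: treat as unrelated
    return math.log2(joint * TOTAL_DOCS / (count_a * count_b))


def cluster(phrases, threshold):
    """Single-link clustering: a phrase joins every cluster it is similar to."""
    clusters = []
    for phrase in phrases:
        linked = [c for c in clusters
                  if any(pmi(phrase, other) >= threshold for other in c)]
        merged = [phrase]
        for c in linked:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(merged)
    return clusters


def drop_outliers(clusters):
    """Treat singleton clusters as outliers and discard them."""
    return [c for c in clusters if len(c) > 1]


def similarity_score(extracted, gold, threshold):
    """Fraction of extracted phrases similar enough to some gold phrase."""
    hits = sum(1 for e in extracted
               if max(pmi(e, g) for g in gold) >= threshold)
    return hits / len(extracted)


phrases = ["weather", "clustering", "keyphrase extraction", "semantic similarity"]
ordered = order_by_corpus_count(phrases)
kept = drop_outliers(cluster(ordered, threshold=3.0))
score = similarity_score(["keyphrase extraction", "weather"],
                         ["semantic similarity", "clustering"], threshold=3.0)
```

With these invented counts, "weather" never co-occurs with the other phrases, so it ends up alone in its cluster and is dropped as an outlier. The similarity-based score credits "keyphrase extraction" against the gold list even though no gold phrase matches it as a string.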