Learning aspect models with partially labeled data

DOI: http://doi.org/10.1016/j.patrec.2010.09.004
Journal title: Pattern Recognition Letters
Pages: 297-304; # of pages: 8
Abstract: In this paper, we address the problem of learning aspect models with partially labeled data for the task of document categorization. The motivation of this work is to use the available unlabeled data together with the set of labeled examples to learn latent models whose structure and underlying hypotheses account for the document generation process more accurately than other mixture-based generative models. We present a semi-supervised variant of the Probabilistic Latent Semantic Analysis (PLSA) model (Hofmann, 2001). In our approach, we try to capture the data mislabeling errors that may occur during the training of our model. This is done by iteratively assigning class labels to unlabeled examples using the current aspect model and re-estimating the probabilities of the mislabeling errors. We perform experiments on the 20Newsgroups, WebKB and Reuters document collections, as well as on a real-world dataset from a Business Group of Xerox, and show the effectiveness of our approach compared to a semi-supervised version of Naive Bayes, another semi-supervised version of PLSA, and transductive Support Vector Machines.
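For orientation, the sketch below shows plain, unsupervised PLSA fitted with EM, i.e. the base aspect model the abstract builds on. It is not the paper's semi-supervised variant: the label-assignment step and the mislabeling-error probabilities are not specified in this record, so they are left out. The function names (`plsa_em`, `log_likelihood`), the asymmetric parameterisation P(w|d) = sum_z P(w|z) P(z|d), and the toy count matrix in the usage note are all assumptions made for illustration.

```python
import math
import random


def plsa_em(counts, n_topics, n_iters=50, seed=0):
    """Fit P(w|z) and P(z|d) to a document-word count matrix with plain EM.

    Illustrative sketch of standard PLSA only; names and defaults are
    hypothetical and not taken from the paper.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def random_dist(size):
        # Random strictly positive vector, normalised to sum to 1.
        raw = [rng.random() + 1e-3 for _ in range(size)]
        total = sum(raw)
        return [v / total for v in raw]

    p_w_z = [random_dist(n_words) for _ in range(n_topics)]  # P(w|z)
    p_z_d = [random_dist(n_topics) for _ in range(n_docs)]   # P(z|d)

    for _ in range(n_iters):
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(w|z) * P(z|d).
                joint = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                norm = sum(joint)
                if norm == 0.0:
                    continue
                for z in range(n_topics):
                    resp = n_dw * joint[z] / norm
                    new_w_z[z][w] += resp
                    new_z_d[d][z] += resp
        # M-step: renormalise the expected counts into distributions.
        for z in range(n_topics):
            total = sum(new_w_z[z])
            if total > 0:
                p_w_z[z] = [v / total for v in new_w_z[z]]
        for d in range(n_docs):
            total = sum(new_z_d[d])
            if total > 0:
                p_z_d[d] = [v / total for v in new_z_d[d]]
    return p_w_z, p_z_d


def log_likelihood(counts, p_w_z, p_z_d):
    """Sum over (d, w) of n(d,w) * log sum_z P(w|z) P(z|d)."""
    total = 0.0
    for d, row in enumerate(counts):
        for w, n_dw in enumerate(row):
            if n_dw == 0:
                continue
            mix = sum(p_w_z[z][w] * p_z_d[d][z] for z in range(len(p_w_z)))
            total += n_dw * math.log(mix)
    return total
```

As a usage example, calling `plsa_em([[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 3, 4], [0, 1, 2, 5]], n_topics=2)` on this made-up count matrix returns two lists of probability rows, one P(w|z) row per latent aspect and one P(z|d) row per document. A semi-supervised variant in the spirit of the abstract would wrap a loop around such a fit to label unlabeled documents and re-estimate mislabeling probabilities, but the details would have to come from the paper itself.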
Publication date
Affiliation: NRC Institute for Information Technology; National Research Council Canada
Peer reviewed: Yes
NPARC number: 16623731
Record identifier: e9279301-b218-4871-aeb7-ecb392422944
Record created: 2011-02-04
Record modified: 2017-03-23