Learning from Multiple Partially Observed Views - an Application to Multilingual Text Categorization

Proceedings title: Advances in neural information processing systems 22 : 23rd Annual Conference on Neural Information Processing Systems 2009
Series title: Advances in neural information processing systems; Volume 22
Conference: The 23rd Annual Conference on Neural Information Processing Systems, Vancouver, B.C., Canada, December 07-10, 2009
Pages: 2836; # of pages: 9
Subject: Information and Communications Technologies
Abstract: We address the problem of learning classifiers when observations have multiple views, some of which may not be observed for all examples. We assume the existence of view generating functions which may complete the missing views in an approximate way. This situation corresponds, for example, to learning text classifiers from multilingual collections where documents are not available in all languages. In that case, Machine Translation (MT) systems may be used to translate each document into the missing languages. We derive a generalization error bound for classifiers learned on examples with multiple artificially created views. Our result uncovers a trade-off between the size of the training set, the number of views, and the quality of the view generating functions. As a consequence, we identify situations where it is more interesting to use multiple views for learning instead of classical single view learning. An extension of this framework is a natural way to leverage unlabeled multi-view data in semi-supervised learning. Experimental results on a subset of the Reuters RCV1/RCV2 collections support our findings by showing that additional views obtained from MT may significantly improve the classification performance in the cases identified by our trade-off.
Publisher: Curran Associates, Incorporated
Affiliation: National Research Council Canada (NRC-CNRC); NRC Institute for Information Technology
Peer reviewed: Yes
NPARC number: 16067306
Record identifier: 1c904446-c414-45b1-a948-d4600a7356df
Record created: 2010-11-03
Record modified: 2016-05-09