Evaluation Robust but Robust to What?

Conference: AAAI-07 Workshop on Methods for Machine Learning II, July 22, 2007, Vancouver, British Columbia, Canada
Abstract: Generalization is at the core of evaluation: we estimate the performance of a model on data we have never seen but expect to encounter later on. Our current evaluation procedures assume that the data already seen is a random sample of the domain from which all future data will be drawn. Unfortunately, in practical situations this is rarely the case. Changes in the underlying probabilities will occur, and we must evaluate how robust our models are to such differences. This paper takes the position that models should be robust in two senses. Firstly, any small changes in the joint probabilities should not cause large changes in performance. Secondly, when the dependencies between attributes and the class are constant and only the marginals change, simple adjustments should be sufficient to restore a model's performance. This paper is intended to generate debate on how measures of robustness might become part of our normal evaluation procedures. Certainly some clear demonstrations of robustness would improve our confidence in our models' practical merits.
Affiliation: NRC Institute for Information Technology; National Research Council Canada
Peer reviewed: No
NRC number: 49344
NPARC number: 8914274
Record identifier: d372f9ff-e1b2-4547-8197-d11ba4fa57fe
Record created: 2009-04-22
Record modified: 2016-05-09