Severe Class Imbalance: Why Better Algorithms Aren't the Answer

Conference: Proceedings of the 16th European Conference on Machine Learning (ECML 2005), Porto, Portugal, October 3-7, 2005
Abstract: This paper argues that severe class imbalance is not just an interesting technical challenge that improved learning algorithms will address; it is a much more serious problem. To be useful, a classifier must appreciably outperform a trivial solution, such as always choosing the majority class. Any application that is inherently noisy limits the error rate, and cost, that is achievable. When data are normally distributed, even a Bayes optimal classifier achieves only a vanishingly small reduction in the majority classifier's error rate, and cost, as imbalance increases. For fat-tailed distributions, and when practical classifiers are used, often no reduction is achieved.
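
The following is a minimal numerical sketch, not code from the paper, illustrating the abstract's claim for the Gaussian case: with two univariate Gaussian class conditionals, the Bayes-optimal error rate approaches the error rate of the trivial majority classifier as the minority-class prior shrinks, so the relative reduction in error becomes vanishingly small. The class means and variance used here are illustrative assumptions, not values taken from the paper.

```python
# Sketch (assumed setup): two Gaussian class conditionals, varying class priors.
# Compares the Bayes-optimal error rate against the trivial majority classifier.
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated on the grid x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def bayes_error(prior_minority, mu_major=0.0, mu_minor=2.0, sigma=1.0):
    """Bayes error rate = integral over x of min(pi0*f0(x), pi1*f1(x)),
    approximated by a Riemann sum on a fine grid."""
    x = np.linspace(-10.0, 12.0, 200_001)
    dx = x[1] - x[0]
    pi1 = prior_minority          # minority class prior
    pi0 = 1.0 - pi1               # majority class prior
    integrand = np.minimum(pi0 * gaussian_pdf(x, mu_major, sigma),
                           pi1 * gaussian_pdf(x, mu_minor, sigma))
    return float(np.sum(integrand) * dx)

for pi1 in (0.5, 0.1, 0.01, 0.001, 0.0001):
    trivial_error = pi1           # majority classifier errs only on the minority class
    optimal_error = bayes_error(pi1)
    reduction = 1.0 - optimal_error / trivial_error
    print(f"minority prior {pi1:<7g}  majority error {trivial_error:.6f}  "
          f"Bayes error {optimal_error:.6f}  relative reduction {reduction:.1%}")
```

With these assumed parameters the relative reduction falls from roughly two-thirds at balanced priors to well under one percent once the minority prior reaches 0.001, which is the qualitative behaviour the abstract describes.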
Publication date: 2005
Affiliation: NRC Institute for Information Technology, National Research Council Canada
Peer reviewed: No
NRC number: 48258
NPARC number: 9190916
Record identifier: f0b7a37b-d7c5-470c-b6e5-eb7321281478
Record created: 2009-06-30
Record modified: 2016-05-09