Warning: statistical benchmarking is addictive, kicking the habit in machine learning

DOI: http://doi.org/10.1080/09528130903010295
Type: Article
Journal title: Journal of Experimental and Theoretical Artificial Intelligence
ISSN: 0952-813X; 1362-3079
Volume: 22
Issue: 1
Pages: 67–80
Subject: machine learning; algorithm evaluation; benchmarking; null hypothesis tests
Abstract: Algorithm performance evaluation is so entrenched in the machine learning community that one could call it an addiction. Like most addictions, it is harmful and very difficult to give up. It is harmful because it has serious limitations. Yet, we have great faith in practicing it in a ritualistic manner: we follow a fixed set of rules telling us the measure, the data sets and the statistical test to use. When we read a paper, even as reviewers, we are not sufficiently critical of results that follow these rules. Here, we will debate what are the limitations and how to best address them. This article may not cure the addiction but hopefully it will be a good first step along that road.
Publisher: Taylor & Francis
Language: English
Affiliation: National Research Council Canada; NRC Institute for Information Technology
Peer reviewed: Yes
NPARC number: 23002090
Record identifier: a579582d-6412-4e35-b434-e89e008e276a
Record created: 2017-08-10
Record modified: 2017-08-10