An active learning approach for ensemble-based data stream mining

DOI: http://doi.org/10.5220/0006047402750282
Type: Article
Proceedings title: Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Conference: 8th International Conference on Knowledge Discovery and Information Retrieval, November 9-11, 2016, Porto, Portugal
ISBN: 978-989-758-203-5
Pages: 275-282
Abstract: Data streams, where an instance is only seen once and where a limited amount of data can be buffered for processing at a later time, are omnipresent in today’s real-world applications. In this context, adaptive online ensembles that are able to learn incrementally have been developed. However, the issue of handling data that arrives asynchronously has not received enough attention. Often, the true class label arrives with a time lag, which is problematic for existing adaptive learning techniques. It is not realistic to require that all class labels be made available at training time. This issue is further complicated by the presence of late-arriving, slowly changing dimensions (i.e., late-arriving descriptive attributes). The aim of active learning is to construct accurate models when few labels are available. Thus, active learning has been proposed as a way to obtain such missing labels in a data stream classification setting. To this end, this paper introduces an active online ensemble (AOE) algorithm that extends online ensembles with an active learning component. Our experimental results demonstrate that our AOE algorithm builds accurate models against much smaller ensemble sizes, when compared to traditional ensemble learning algorithms. Further, our models are constructed against small, incremental data sets, thus reducing the number of examples that are required to build accurate ensembles.
Publication date
Publisher: SCITEPRESS - Science and Technology Publications
Language: English
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: Yes
NPARC number: 23001168
Record identifier: e1406891-1212-456c-b4b0-5a02d2443351
Record created: 2016-12-21
Record modified: 2016-12-21