Feature selection for high-dimensional class-imbalanced data sets using Support Vector Machines

DOI: http://doi.org/10.1016/j.ins.2014.07.015
Type: Article
Journal title: Information Sciences
ISSN: 0020-0255
Volume: 286
Pages: 228-246; # of pages: 19
Subject: Feature selection; Imbalanced data set; Dimensionality reduction; Support vector machine; Data mining
Abstract: Feature selection and classification of imbalanced data sets are two of the most interesting machine learning challenges, attracting growing attention from both industry and academia. Feature selection addresses the dimensionality reduction problem by determining a subset of available features to build a good model for classification or prediction, while the class-imbalance problem arises when the class distribution is too skewed. Both issues have been studied independently in the literature, and a plethora of methods have been proposed to address high dimensionality as well as class imbalance. The aim of this work is to explore both issues simultaneously, proposing a family of methods that select those attributes that are relevant for the identification of the target class in binary classification. We propose a backward elimination approach based on successive holdout steps, whose contribution measure is based on a balanced loss function obtained on an independent subset. Our experiments are based on six highly imbalanced microarray data sets, comparing our methods with well-known feature selection techniques, and obtaining better prediction with consistently fewer relevant features.
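The backward elimination idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the contribution measure, so the sketch assumes the balanced loss is the mean of the per-class error rates, draws a fresh stratified holdout split at each step, and removes the feature whose removal leaves the lowest balanced error on the holdout set. All function names are hypothetical, and the nearest-centroid classifier is only a dependency-free stand-in for the Support Vector Machine used in the paper; any classifier with the same call signature could be passed as fit_predict.

```python
import random


def balanced_error(y_true, y_pred):
    """Mean of the per-class error rates for labels in {0, 1} (assumed balanced loss)."""
    rates = []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        if not idx:
            continue  # class absent from this split
        wrong = sum(1 for i in idx if y_pred[i] != cls)
        rates.append(wrong / len(idx))
    return sum(rates) / len(rates)


def centroid_fit_predict(X_tr, y_tr, X_te, feats):
    """Stand-in classifier (nearest class centroid), NOT an SVM."""
    cents = {}
    for cls in (0, 1):
        rows = [x for x, y in zip(X_tr, y_tr) if y == cls]
        cents[cls] = [sum(r[f] for r in rows) / len(rows) for f in feats]

    def dist(x, c):
        return sum((x[f] - m) ** 2 for f, m in zip(feats, c))

    return [min((0, 1), key=lambda cls: dist(x, cents[cls])) for x in X_te]


def stratified_split(y, holdout_frac, rng):
    """Random train/holdout index split that keeps both classes in each part."""
    train, hold = [], []
    for cls in (0, 1):
        idx = [i for i, v in enumerate(y) if v == cls]
        rng.shuffle(idx)
        k = max(1, int(len(idx) * holdout_frac))
        hold += idx[:k]
        train += idx[k:]
    return train, hold


def backward_elimination(X, y, n_keep, fit_predict=centroid_fit_predict,
                         holdout_frac=0.3, seed=0):
    """Drop one feature per holdout step until n_keep features remain."""
    rng = random.Random(seed)
    feats = list(range(len(X[0])))
    while len(feats) > n_keep:
        tr, ho = stratified_split(y, holdout_frac, rng)
        X_tr, y_tr = [X[i] for i in tr], [y[i] for i in tr]
        X_ho, y_ho = [X[i] for i in ho], [y[i] for i in ho]
        # Balanced holdout error obtained when each candidate feature is left out.
        scores = {}
        for f in feats:
            rest = [g for g in feats if g != f]
            scores[f] = balanced_error(y_ho, fit_predict(X_tr, y_tr, X_ho, rest))
        # Remove the feature whose absence hurts the balanced error the least.
        feats.remove(min(feats, key=lambda f: scores[f]))
    return feats
```

Calling backward_elimination(X, y, n_keep=10) on a list of feature rows X and binary labels y returns the indices of the retained features. Scoring with a balanced error rather than plain accuracy keeps the minority class from being ignored, which is the motivation the abstract gives for a balanced loss. The sketch assumes each class has at least two samples so that both the training and holdout parts contain both classes.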
Publication date
Language: English
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: Yes
NPARC number: 21272943
Record identifier: b2f7ea62-d0cc-4a85-b613-3f6a3d43e1eb
Record created: 2014-12-03
Record modified: 2016-05-09