Semi-supervised consensus clustering for gene expression data analysis

DOI: http://doi.org/10.1186/1756-0381-7-7
Type: Article
Journal title: BioData Mining
ISSN: 1756-0381
Volume: 7
Issue: 1
Article number: 7
Pages: 1-13; # of pages: 13
Subject: Semi-supervised clustering; Consensus clustering; Semi-supervised consensus clustering; Gene expression
Abstract: Background: Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis, but they are unable to deal with the noise and high dimensionality associated with microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge into the clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and domain knowledge. Methods: We proposed semi-supervised consensus clustering (SSCC) to integrate consensus clustering with semi-supervised clustering for analyzing gene expression data. We investigated the roles of consensus clustering and prior knowledge in improving the quality of clustering. SSCC was compared with one semi-supervised clustering algorithm, one consensus clustering algorithm, and k-means. Experiments on eight gene expression datasets were performed using h-fold cross-validation. Results: Using prior knowledge improved the clustering quality by reducing the impact of noise and high dimensionality in microarray data. Integrating consensus clustering with semi-supervised clustering improved performance compared to using either consensus clustering or semi-supervised clustering alone. Our SSCC method outperformed the other methods tested in this paper.
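The consensus idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' SSCC method: the abstract does not say how base clusterings are produced, how prior knowledge enters them, or which consensus function is used. The sketch assumes a simple co-association (evidence accumulation) scheme, and the function names `co_association` and `consensus_labels`, the 0.5 threshold, and the toy partitions are all hypothetical choices made for illustration.

```python
def co_association(partitions):
    """Fraction of base partitions in which each pair of samples shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_labels(matrix, threshold=0.5):
    """Group samples connected by co-association above the threshold.

    Assumption: connected components stand in for the consensus function;
    real consensus methods are usually more elaborate.
    """
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy base partitions of six samples (hypothetical data).
# The second run misplaces sample 2; the third uses swapped label ids.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
matrix = co_association(partitions)
print(consensus_labels(matrix))  # [0, 0, 0, 1, 1, 1]
```

Because the co-association matrix only records whether two samples share a cluster, it is insensitive to how each run names its clusters, and a single noisy assignment (sample 2 in the second run) is outvoted by the other runs.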
Publication date
Language: English
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: Yes
NPARC number: 21272892
Record identifier: 4944bde4-ab82-453a-8eed-4442ede4f74d
Record created: 2014-12-03
Record modified: 2016-05-09