K-means+: an autonomous clustering algorithm

DOI: http://doi.org/10.13140/RG.2.1.1113.7365
Type: Technical Report
Physical description: 27 p.
Abstract: K-means, the traditional clustering algorithm, is known for its simplicity and low time complexity. However, its usability is limited because the clustering result depends heavily on user-defined parameters, namely the choice of initial centroid seeds and the number of clusters (k). A new clustering algorithm, K-means+, is proposed to extend K-means. K-means+ automatically determines a semi-optimal number of clusters from the statistical nature of the data; moreover, the initial centroid seeds are not critical to the clustering result. Experimental results on the Iris and KDD-99 data sets illustrate the robustness of K-means+, especially for large amounts of data in a high-dimensional space.
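The seed dependence described in the abstract can be illustrated with a minimal sketch of plain K-means (Lloyd's iteration). This is not the K-means+ algorithm from the report, whose details are not given in this record. The function names, the four-point data set, and the seed choices are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for each point (ties go to the lower index)."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, seeds, max_iter=100):
    """Plain K-means (illustrative only); k is implied by the number of seeds."""
    centroids = [tuple(float(c) for c in s) for s in seeds]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid is the coordinate-wise mean of the cluster members.
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


# Hypothetical data: corners of a 4-by-1 rectangle, clustered with k = 2.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Seeds on opposite short sides recover the left/right split.
_, labels_a, sse_a = kmeans(points, seeds=[(0, 0), (4, 0)])

# Seeds on the same short side get stuck in a top/bottom split.
_, labels_b, sse_b = kmeans(points, seeds=[(0, 0), (0, 1)])

print(labels_a, sse_a)  # [0, 0, 1, 1] 1.0
print(labels_b, sse_b)  # [0, 1, 0, 1] 16.0
```

Both runs converge, but the second settles in a local minimum with a sum of squared errors sixteen times larger, which is the kind of sensitivity to initial seeds that the abstract says K-means+ is designed to reduce.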
Publication date
Affiliation: NRC Institute for Information Technology; National Research Council Canada
Peer reviewed: No
NPARC number: 21277103
Record identifier: 33a96c39-4891-4815-bf19-1135a8984501
Record created: 2015-12-01
Record modified: 2016-10-03