A Practical Data-Driven Framework for Parallel Data Mining

Type: Article
Conference: Proceedings of the 9th World Multi-Conference on Systemics, Cybernetics and Informatics (WMSCI 2005), July 10-13, 2005, Orlando, Florida, USA
Subject: parallel data mining; feature extraction; model evaluation or testing; JavaParty
Abstract: In many practical applications, data mining results must be delivered quickly. To achieve the required efficiency without sacrificing the quality of the results, practitioners are now looking at ways to parallelize the most computationally expensive steps of the data mining process. Realizing that completely rewriting existing sequential programs as parallel ones is often too tedious and expensive, we propose a framework that reuses existing sequential programs to perform parallel data mining on a computer cluster. The proposed framework relies on the JavaParty system and can be used to parallelize both Java and non-Java programs. This paper details the framework, illustrates the implementation, and presents early experimental results showing the benefits of the approach.
Language: English
Affiliation: NRC Institute for Information Technology; National Research Council Canada
Peer reviewed: No
NRC number: 47440
NPARC number: 5764990
Record identifier: 7fc04822-e48f-4a7c-aa10-6c38d4fc218e
Record created: 2009-03-29
Record modified: 2016-05-09