BE/BTech & ME/MTech Final Year Projects for Computer Science | Information Technology | ECE Engineer | IEEE Projects Topics, PHD Projects Reports, Ideas and Download | Sai Info Solution | Nashik |Pune |Mumbai
director@saiinfo settings_phone02536644344 settings_phone02048614848 settings_phone+919270574718 +919096813348 settings_phone+919028924212
logo


SAI INFO SOLUTION


Diploma | BE |B.Tech |ME | M.Tech |PHD

Project Development and Training

Search Project by Domainwise


NC-Link A New Linkage Method for Efficient Hierarchical Clustering of large scale Data


Prediction of Students Perform

Privacy-Assured Outsourcing of
Abstract


In various disciplines, hierarchical clustering (HC) has been an effective tool for data analysis due to its ability to summarize hierarchical structures of data in an intuitive and interpretable manner. A run of HC requires multiple iterations, each of which needs to compute and update the pairwise distances between all intermediate clusters. This makes the exact algorithm for HC inevitably suffer from quadratic time and space complexities. To address large-scale data, various approximate/parallel algorithms have been proposed to reduce the computational cost of HC. However, such algorithms still rely on conventional linkage methods (such as single, centroid, average, complete, or Ward's) for dening pairwise distances, mostly focusing on the approximation/parallelization of linkage computations. Given that the choice of linkage profoundly affects not only the quality but also the efciency of HC, we propose a newlinkage method named NC-link and design an exact algorithm for NC-link-based HC. To guarantee the exactness, the proposed algorithm maintains the quadratic nature in time complexity but exhibits only linear space complexity, thereby allowing us to address million-object data on a personal computer. To underpin the extensibility of our approach, we showthat the algorithmic nature of NC-link enables single instruction multiple data (SIMD) parallelization and subquadratic-time approximation of HC. To verify our proposal, we thoroughly tested it with a number of large-scale real and synthetic data sets. In terms of efciency, NC-link allowed us to perform HC substantially more space efciently or faster than conventional methods: compared with average and complete linkages, using NC-link incurred only 0.7Percent1.75Percent of the memory usage, and the NC-linkbased implementation delivered speedups of approximately 3.5 times over the centroid andWard's linkages. With regard to clustering quality, the proposed method was able to retrieve hierarchical structures from input data as faithfully as in the popular average and centroid linkage methods. We anticipate that the existing approximation/parallel algorithms will be able to benet from adopting NC-link as their linkage method for obtaining better clustering results and reduced time and space demands.

KeyWords
Clustering algorithm, data mining, machine learning.



Share
Share via WhatsApp
BE/BTech & ME/MTech Final Year Projects for Computer Science | Information Technology | ECE Engineer | IEEE Projects Topics, PHD Projects Reports, Ideas and Download | Sai Info Solution | Nashik |Pune |Mumbai
Call us : 09096813348 / 02536644344
Mail ID : developer.saiinfo@gmail.com
Skype ID : saiinfosolutionnashik