cluster analysis


cluster analysis

[′kləs·tər ə′nal·ə·səs]
(statistics)
A general approach to multivariate problems whose aim is to determine whether the individuals fall into groups or clusters.

cluster analysis

a technique used to identify groups of objects or people that can be shown to be relatively distinct within a data set. The characteristics of those people within each cluster can then be explored. In market research, for example, cluster analysis has been used to identify groups of people for whom different marketing approaches would be appropriate.

A rich variety of clustering methods is available. A common method is hierarchical clustering, which can work either ‘bottom up’ or ‘top down’. In ‘agglomerative hierarchical clustering’ (i.e. bottom up), the process begins with as many ‘clusters’ as cases. Using a mathematical criterion such as the standardized Euclidean distance, objects or people are successively joined together into clusters. In ‘divisive hierarchical clustering’ (i.e. top down), the process starts with one single cluster containing all cases, which is then broken down into smaller clusters.
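The bottom-up procedure can be illustrated with a minimal sketch. It assumes single linkage (two clusters are as far apart as their two closest members) as the joining criterion, which is only one of several possible choices, and the function names and sample data are hypothetical.

```python
import math
from statistics import pstdev


def standardized_distance(a, b, scales):
    """Euclidean distance after dividing each variable by its standard deviation."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales)))


def agglomerate(points, n_clusters):
    """Join the two closest clusters repeatedly until n_clusters remain.

    Assumption: single linkage is used to measure the distance between clusters.
    Returns a list of clusters, each a list of indices into points.
    """
    # Per-variable standard deviation; fall back to 1.0 for a constant variable.
    scales = [pstdev(column) or 1.0 for column in zip(*points)]
    # Start with as many clusters as cases.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > max(n_clusters, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(
                    standardized_distance(points[p], points[q], scales)
                    for p in clusters[i]
                    for q in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i and drop cluster j.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical data: five cases measured on two variables.
points = [(1.0, 10.0), (1.2, 11.0), (0.9, 9.5), (5.0, 50.0), (5.2, 52.0)]
result = agglomerate(points, 2)
```

Here `result` holds two groups of case indices. The exhaustive pairwise search is kept for readability and would be too slow for large data sets.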

There are many practical problems involved in the use of cluster analysis. The selection of variables to be included in the analysis, the choice of distance measure and the criteria for combining cases into clusters are all crucial. Because the selected clustering method can itself impose a certain amount of structure on the data, it is possible for spurious clusters to be obtained. In general, several different methods should be used and their results compared, since clusters that appear under only one method may simply reflect the structure that method imposes. (See Anderberg, 1973, and Everitt, 1974, for full discussions of methods.)
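As a small illustration of why the choice of distance measure matters, the sketch below compares raw and standardized Euclidean distance on hypothetical data in which one variable has a much larger range than the other. The data and function names are assumptions made for the example.

```python
import math
from statistics import pstdev


def euclidean(a, b, scales=None):
    """Euclidean distance; each variable is divided by its scale if scales are given."""
    scales = scales or [1.0] * len(a)
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales)))


def nearest_to_first(points, scales=None):
    """Index of the case closest to the first case."""
    return min(
        range(1, len(points)),
        key=lambda i: euclidean(points[0], points[i], scales),
    )


# Hypothetical cases: variable 1 spans about 0 to 1, variable 2 spans 0 to 1000.
cases = [(0.0, 0.0), (1.0, 10.0), (0.05, 200.0), (0.9, 1000.0)]
scales = [pstdev(column) for column in zip(*cases)]

raw_neighbor = nearest_to_first(cases)          # dominated by variable 2
std_neighbor = nearest_to_first(cases, scales)  # both variables weighted comparably
```

With these values the nearest neighbour of the first case changes when the variables are standardized, so any cluster built from that first join would change as well.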

References in periodicals archive
Firstly, it provides a comprehensive literature overview focusing on a few frequently cited approaches [116] that have been applied for PSO-based data clustering.
Computer and information scientists and practitioners in application fields explore various aspects of multidimensional data clustering and analysis for graduate students and researchers who deal with advanced statistics.
Cluster analysis is another notation used for data clustering.
In this special issue, the selected papers focus on the topics of theory and applications of data clustering.
In this problem, of data clustering lies within five subgroups so the length of the element equals to 5.
This paper proposes an innovative method for data clustering to improve predictive performance.
3) Video data clustering using the extracted features: using clustering algorithm to efficiently build "visual words".
Gaussian mixture models are often used for data clustering.
To best knowledge this is the first effort toward a building block solution for the problem of privacy preserving data clustering.
Pseudo code for adaptive great deluge algorithm for medical data clustering problems Step--1: Initialization Phase Determine initial candidate solution [S.
Such a measure can be used to compare the performance of different data clustering algorithms on different real life datasets.
Contributors from fields where data clustering is applied explore three aspects of the phenomenon: core methods for data clustering, different problem domains and scenarios, and detailed insights from the clustering process.