cluster analysis

cluster analysis

[′kləs·tər ə′nal·ə·səs]
(statistics)
A general approach to multivariate problems whose aim is to determine whether the individuals fall into groups or clusters.

cluster analysis

a technique used to identify groups of objects or people that can be shown to be relatively distinct within a data set. The characteristics of those people within each cluster can then be explored. In market research, for example, cluster analysis has been used to identify groups of people for whom different marketing approaches would be appropriate.

There is a rich variety of clustering methods available. A common method is hierarchical clustering, which can work either ‘bottom up’ or ‘top down’. In ‘agglomerative hierarchical clustering’ (i.e. bottom up), the process begins with as many ‘clusters’ as cases. Using a mathematical criterion such as the standardized Euclidean distance, objects or people are successively joined together into clusters. In ‘divisive hierarchical clustering’ (i.e. top down), the process starts with a single cluster containing all cases, which is then broken down into smaller clusters.
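
The bottom-up procedure described above can be illustrated with a minimal sketch in plain Python. It is not a reference implementation: the single-linkage joining rule, the function names and the sample cases are all assumptions made for illustration, and the code presumes that no variable is constant (a zero variance would break the standardization).

```python
import math
from statistics import pvariance


def standardized_euclidean(a, b, variances):
    """Euclidean distance with each squared difference divided by that variable's variance."""
    return math.sqrt(sum((x - y) ** 2 / v for x, y, v in zip(a, b, variances)))


def agglomerative(points, n_clusters):
    """Bottom-up clustering; single linkage is an assumed joining criterion."""
    # Variance of each variable across all cases, used to standardize distances.
    variances = [pvariance(column) for column in zip(*points)]
    # Begin with as many clusters as cases; each cluster holds case indices.
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Single linkage: distance between the two closest members.
        return min(
            standardized_euclidean(points[i], points[j], variances)
            for i in c1
            for j in c2
        )

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and join them.
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical cases measured on two variables with very different scales.
cases = [(1.0, 200.0), (1.2, 210.0), (0.9, 190.0), (5.0, 900.0), (5.3, 950.0)]
result = agglomerative(cases, 2)
print(result)
```

Standardizing by variance keeps the second variable, which has much larger raw values, from dominating the distance. A divisive method would run in the opposite direction, starting from one cluster and splitting it.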

There are many practical problems involved in the use of cluster analysis. The selection of variables to be included in the analysis, the choice of distance measure and the criteria for combining cases into clusters are all crucial. Because the selected clustering method can itself impose a certain amount of structure on the data, it is possible for spurious clusters to be obtained. In general, several different methods should be used. (See Anderberg, 1973, and Everitt, 1974, for full discussions of methods.)
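
The way a clustering method can impose structure may be seen in a small sketch, again in plain Python and under stated assumptions: the one-dimensional values are invented, plain absolute difference is used as the distance, and the two joining criteria compared (single and complete linkage) are chosen only as examples of ‘different methods’.

```python
def cluster(values, n_clusters, linkage):
    """Agglomerative clustering of 1-D values.

    `linkage` is `min` for single linkage or `max` for complete linkage.
    """
    clusters = [[i] for i in range(len(values))]

    def dist(c1, c2):
        # Distance between clusters under the chosen joining criterion.
        return linkage(abs(values[i] - values[j]) for i in c1 for j in c2)

    while len(clusters) > n_clusters:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        a, b = min(pairs, key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical data: a chain of values with slowly widening gaps.
values = [0.0, 1.0, 2.1, 3.3, 4.6, 6.0]
single = cluster(values, 2, min)
complete = cluster(values, 2, max)
print(single)
print(complete)
```

On these values the two criteria split the same chain at different places, although the data contain no obvious gap. Agreement between several methods is therefore some evidence that clusters are not an artefact of one method.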

References in periodicals archive
Data clustering is an unsupervised process of classifying patterns into clusters, aiming at discovering structures hidden in a data set.
based Yodlee, a provider of digital financial solutions including personal financial management, presented YodleeSense, which utilizes behavioral psychology and data clustering.
Data clustering is a key technique in data mining, pattern recognition, bioinformatics and machine learning, concerned with organizing huge amounts of data and uncovering the unexplored relationships within them.
To the best of our knowledge, this is the first effort toward a building-block solution for the problem of privacy-preserving data clustering.
We have used the Vortex package to quickly analyse data obtained from individual program series, helping to unearth trends not easily observable without the advanced visualisation, data clustering and data pivoting capabilities.
Contributors from fields where data clustering is applied explore three aspects of the phenomenon: core methods for data clustering, different problem domains and scenarios, and detailed insights from the clustering process.
Clustering spatial data with dozens to hundreds of dimensions in order to discover knowledge has become a challenging and difficult direction in current research.
muKMeans: Offers easy-to-use R functions for data clustering on Big Data using the K-means algorithm
Based on a general definition, data clustering algorithms can be classified into four categories: (1) partitioning, (2) hierarchical, (3) density-based and (4) grid-based.
Content covers such issues as: modeling and analysis of biological systems, including human crowds; the application of swarm intelligence models to problems associated with distributed computing, data clustering, and decision making; and, research in insect colonies, human behavior and swarm robotics.
The Database Utility for SQL Server integrates PolyServe's innovative shared data clustering technology to deliver the industry's first real consolidation solution for mission-critical SQL Server applications.
"Matrix Manager puts the power of our shared data clustering solutions at the fingertips of systems managers," said Mike Stankey, CEO of PolyServe.