cluster analysis


cluster analysis

[′kləs·tər ə′nal·ə·səs]
(statistics)
A general approach to multivariate problems whose aim is to determine whether the individuals fall into groups or clusters.

cluster analysis

A technique used to identify groups of objects or people that can be shown to be relatively distinct within a data set. The characteristics of the objects or people within each cluster can then be explored. In market research, for example, cluster analysis has been used to identify groups of people for whom different marketing approaches would be appropriate.

There is a rich variety of clustering methods available. A common method is hierarchical clustering, which can work either from ‘bottom up’ or from ‘top down’. In ‘agglomerative hierarchical clustering’ (i.e. bottom up), the process begins with as many ‘clusters’ as cases. Using a mathematical criterion such as the standardized Euclidean distance, the closest objects or people are successively joined together into larger clusters. In ‘divisive hierarchical clustering’ (i.e. top down), the process starts with one single cluster containing all cases, which is then successively broken down into smaller clusters.
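The bottom-up procedure can be illustrated with a minimal sketch in Python. It is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, the standardized Euclidean distance is taken to mean dividing each variable's difference by that variable's standard deviation, and 'single linkage' (the distance between two clusters is the distance between their closest members) is assumed as the joining rule, since the text above does not name one.

```python
import math

def column_stdevs(points):
    """Population standard deviation of each variable (column)."""
    n = len(points)
    stdevs = []
    for column in zip(*points):
        mean = sum(column) / n
        variance = sum((v - mean) ** 2 for v in column) / n
        # Fall back to 1.0 for a constant variable to avoid dividing by zero.
        stdevs.append(math.sqrt(variance) or 1.0)
    return stdevs

def standardized_euclidean(a, b, stdevs):
    """Euclidean distance with each difference scaled by its variable's stdev."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, stdevs)))

def agglomerative(points, n_clusters):
    """Start with one cluster per case and merge the closest pair of
    clusters until only n_clusters remain (single linkage, assumed)."""
    stdevs = column_stdevs(points)
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(standardized_euclidean(points[p], points[q], stdevs)
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Made-up cases measured on two variables; the result lists case indices.
cases = [(1.0, 2.0), (1.2, 1.9), (8.0, 9.0), (8.3, 9.1), (0.9, 2.2)]
print(agglomerative(cases, 2))  # [[0, 1, 4], [2, 3]]
```

The sketch recomputes every pairwise distance on each pass, which is adequate for a handful of cases but far too slow for real data sets; it is meant only to make the merge sequence visible.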

There are many practical problems involved in the use of cluster analysis. The selection of variables to be included in the analysis, the choice of distance measure and the criteria for combining cases into clusters are all crucial. Because the selected clustering method can itself impose a certain amount of structure on the data, it is possible for spurious clusters to be obtained. In general, several different methods should be used. (See Anderberg, 1973, and Everitt, 1974, for full discussions of methods.)
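The point that the chosen method can impose its own structure can be made concrete with a small sketch in Python. Everything here is an illustrative assumption: the data are invented, the function name is hypothetical, plain (unstandardized) Euclidean distance is used, and the two joining rules compared are 'single linkage' (distance between the closest members of two clusters) and 'complete linkage' (distance between the farthest members), neither of which is named in the text above.

```python
import math

def hierarchical(points, n_clusters, linkage):
    """Agglomerative clustering. `linkage` reduces the member-to-member
    distances between two clusters to one number: min gives single
    linkage, max gives complete linkage."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(math.dist(points[p], points[q])
                            for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters

# Six invented cases on one variable, spaced as a slowly widening chain.
chain = [(0.0,), (1.0,), (2.1,), (3.3,), (4.6,), (6.5,)]

single = hierarchical(chain, 2, linkage=min)
complete = hierarchical(chain, 2, linkage=max)
print(single)    # [[0, 1, 2, 3, 4], [5]]
print(complete)  # [[0, 1, 2, 3], [4, 5]]
```

The same six cases yield two different two-cluster solutions: single linkage follows the chain and isolates the last case, while complete linkage prefers compact groups. Neither result is evidence of real groups on its own, which is one reason for comparing several methods.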

References in periodicals archive
Firstly, it provides a comprehensive literature overview focusing on a few frequently cited approaches [116] that have been applied for PSO-based data clustering. Secondly, the reported results and the performances of different techniques against existing clustering techniques were pinpointed.
In order to compare the performance of the IT2FPCM algorithm against the IT2FCM algorithm, we cluster the datasets mentioned above with both algorithms and measure their performance using the validation indices mentioned in the previous section.
Jang, "Data Clustering and Pattern Recognition," http://mirlab.org/jang/.
After the completion of the data clustering, the cluster is characterized by using the four types of reference data which are shown in Figure 6(d) with different colors.
Data clustering starts by extracting the time-domain features from the data.
Data clustering is a data mining and data analysis method that produces refined views of the inherent structure of a data set by separating it into a number of disjoint or overlapping classes.
While this paper is focused on the application of data clustering methodology and drive cycle creation, the researchers felt it was valuable to include the distribution of diesel fuel consumed by trips in each cluster.
The overview of the data clustering and the functionality
Data clustering is the process of identifying clusters or natural groups based on similarity measures.
The Redwood City, Calif.-based Yodlee, a provider of digital financial solutions including personal financial management, presented YodleeSense, which utilizes behavioral psychology and data clustering. The solution helps automate chores, makes personalized recommendations and provides advice.
The characteristic of this article is that every detail is analyzed, from the generation of customer power load to data clustering.
Existing studies have pointed to data clustering as a potential solution to reduce heterogeneity, and therefore increase prediction accuracy.