cluster analysis



cluster analysis

[′kləs·tər ə′nal·ə·səs]
(statistics)
A general approach to multivariate problems whose aim is to determine whether the individuals fall into groups or clusters.

cluster analysis

a technique used to identify groups of objects or people that can be shown to be relatively distinct within a data set. The characteristics of those people within each cluster can then be explored. In market research, for example, cluster analysis has been used to identify groups of people for whom different marketing approaches would be appropriate.

There is a rich variety of clustering methods available. A common method is hierarchical clustering, which can work either ‘bottom up’ or ‘top down’. In ‘agglomerative hierarchical clustering’ (i.e. bottom up), the process begins with as many ‘clusters’ as cases. Using a mathematical criterion such as the standardized Euclidean distance, the closest objects or people are successively joined together into clusters. In ‘divisive hierarchical clustering’ (i.e. top down), the process starts with one single cluster containing all cases, which is then broken down into smaller clusters.
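
The bottom-up procedure can be illustrated with a minimal sketch. It assumes single linkage (the distance between two clusters is taken as the distance between their closest members) and stops at a requested number of clusters; both choices, along with the function names and the sample data, are illustrative assumptions and not part of the definition above.

```python
import math
from statistics import pstdev


def standardized_euclidean(a, b, scales):
    """Euclidean distance after dividing each variable by its standard deviation."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales)))


def agglomerative(cases, n_clusters):
    """Bottom-up clustering: start with one cluster per case and repeatedly
    merge the two closest clusters until n_clusters remain.
    Single linkage is an assumption made for this sketch."""
    # One scale per variable; fall back to 1.0 if a variable is constant.
    scales = [pstdev(column) or 1.0 for column in zip(*cases)]
    clusters = [[i] for i in range(len(cases))]  # as many clusters as cases

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(
            standardized_euclidean(cases[i], cases[j], scales)
            for i in c1
            for j in c2
        )

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        _, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical data: six cases measured on two variables.
cases = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (8.0, 9.0), (8.3, 9.1), (7.9, 8.8)]
result = agglomerative(cases, 2)
print(result)  # [[0, 1, 2], [3, 4, 5]]
```

The result lists case indices per cluster. A divisive method would run in the opposite direction, starting from the single cluster `[0, 1, 2, 3, 4, 5]` and splitting it.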

There are many practical problems involved in the use of cluster analysis. The selection of variables to be included in the analysis, the choice of distance measure and the criteria for combining cases into clusters are all crucial. Because the selected clustering method can itself impose a certain amount of structure on the data, it is possible for spurious clusters to be obtained. In general, several different methods should be used. (See Anderberg, 1973, and Everitt, 1974, for full discussions of methods.)
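
The influence of the combining criterion can be seen in a small sketch. The data below are hypothetical: six points on a line with steadily widening gaps and no obvious grouping. The two criteria compared, single linkage (closest members) and complete linkage (farthest members), are chosen here only as examples of the choices described above.

```python
def cluster(points, n_clusters, link):
    """Agglomerative clustering of one-dimensional points.
    `link` is min for single linkage or max for complete linkage."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest linkage distance.
        _, a, b = min(
            (
                link(
                    abs(points[i] - points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                ),
                a,
                b,
            )
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] += clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical data: gaps of 1.0, 1.1, 1.2, 1.3 and 1.4 between neighbours.
points = [0.0, 1.0, 2.1, 3.3, 4.6, 6.0]

single = cluster(points, 2, min)
complete = cluster(points, 2, max)
print(single)    # [[0, 1, 2, 3, 4], [5]]
print(complete)  # [[0, 1, 2, 3], [4, 5]]
```

Both runs return two clusters because two were requested, yet they disagree on the membership. Disagreement of this kind is one sign that the grouping may owe more to the method than to the data.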

References in periodicals archive
Cluster analysis with all couples for restaurant task.
Factor analysis has identified the inter-correlation of the variables, bringing them into a closed group having correlatable numeric responses, and in a similar fashion, cluster analysis has established the linkage tree as per the similarity level of the responses.
The CASH methodology relies on several key techniques, namely the normality test, cluster analysis, and ANOVA.
procedure called Cluster Analysis was used to group people into four
As stated previously, a cluster analysis was performed on the recoded SHR data in order to discern any patterns inherent in the data.
Cluster analysis following UPGMA was conducted to assess the genetic similarity (GS) among genotypes, while Principal Coordinate Analysis was conducted to estimate the genetic variation and to confirm the results generated with cluster analysis.
More recent publications applying cluster analysis to convergence issues of the European Economic Area countries include Boreiko (International Journal of Finance and Economics, 2003), who uses a fuzzy cluster analysis approach to estimate the readiness of central and eastern European countries for European monetary union accession.
Principal component analysis overcomes the collinearity between the original indexes while keeping most of the main information of the original index system, but when the variance contribution rates of the principal components are large, ignoring the differing importance of the individual principal components will inevitably affect the accuracy of the principal component cluster analysis.
Nonhierarchical cluster analysis was then performed on these two groups using all measures of persistency proposed, partial periods, peak yield, total milk yield and measures obtained at intervals of 30 days to certify the results of the correlations discussed above (Figure 2).
A cluster analysis requires researchers to make a series of independent decisions, which, in turn, require knowledge of its properties, the choice of similarity or dissimilarity of the various methods and a manner of validity, which may represent different groups.
Cluster analysis also allows us to minimize variability within clusters and maximize variability between clusters.
This consultation concerns the purchase, installation and commissioning of a cluster analysis and assimilation of complex data by the institute of computing and simulation (ics).