multivariate analysis



multivariate analysis

[¦məl·tē′ver·ē·ət ə′nal·ə·səs]
(statistics)
The study of multidimensional random variables.

multivariate analysis

the analysis of data collected on several different VARIABLES. For example, in a study of housing provision, data may be collected on the age, income and family size (the ‘variables’) of the population being studied. In analysing the data, the effect of each of these variables can be examined, as can the interactions between them.

There is a wide range of multivariate techniques available, but most aim to simplify the data in some way in order to clarify relationships between variables. The choice of method depends on the nature of the data, the type of problem and the objectives of the analysis. FACTOR ANALYSIS and principal component analysis are exploratory, and are used to find new underlying variables. CLUSTER ANALYSIS seeks to find natural groupings of objects or individuals. Discriminant analysis is designed to clarify how groups are differentiated on the basis of the independent variable(s). Other techniques, e.g. multiple REGRESSION ANALYSIS, aim to explain the variation in one variable by means of the variation in two or more independent variables. MANOVA (multivariate analysis of variance), an extension of the univariate ANALYSIS OF VARIANCE, is used when there are several dependent variables, as in the example above. An example of a multivariate technique for analysing categorical data is LOG-LINEAR ANALYSIS.
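As a rough illustration of two of these exploratory techniques, the sketch below runs a principal component analysis and a k-means cluster analysis on synthetic data loosely modelled on the housing example above. It assumes Python with NumPy and scikit-learn; the variable names (age, income, family_size) and all numbers are invented for the example and do not come from the entry itself.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Synthetic "housing study" data: three variables per household.
    rng = np.random.default_rng(0)
    n = 200
    age = rng.normal(45, 12, n)
    income = 20000 + 400 * age + rng.normal(0, 5000, n)   # income loosely tied to age
    family_size = rng.poisson(2, n) + 1
    X = np.column_stack([age, income, family_size])

    # Put the variables on a common scale before applying the techniques.
    Xz = StandardScaler().fit_transform(X)

    # Exploratory: principal component analysis finds new underlying axes.
    pca = PCA(n_components=2).fit(Xz)
    print("explained variance ratios:", pca.explained_variance_ratio_)

    # Cluster analysis seeks natural groupings of households.
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Xz)
    print("cluster sizes:", np.bincount(labels))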

Multivariate Analysis

 

the branch of mathematical statistics dealing with methods of studying statistical data concerning objects for which more than one quantitative or qualitative characteristic is measured. In the area of multivariate analysis that has been most intensively investigated, it is assumed that the results of individual observations are independent and obey the same multivariate normal distribution. The term “multivariate analysis” is sometimes applied in a narrow sense to this area.

In more precise language, multivariate analysis deals with data in which the result Xj of the jth observation can be expressed as the vector Xj = (Xj1, Xj2, . . . , Xjs). Here, the random variable Xjk has the mathematical expectation μk and the variance σk², and the correlation coefficient between Xjk and Xjl is ρkl. Of great importance are the mathematical expectation vector μ = (μ1, . . . , μs) and the covariance matrix Σ with elements σkσlρkl, where k, l = 1, . . . , s. This vector and matrix completely define the distribution of the vectors X1, . . . , Xn, which are the results of n independent observations. The choice of the multivariate normal distribution as the principal mathematical model for multivariate analysis can be justified in part by the following considerations: on the one hand, this model can be used in a great number of applications; on the other hand, only within the framework of this model can exact distributions of sample characteristics be calculated. The sample mean

X̄ = (1/n)(X1 + X2 + . . . + Xn)

and the sample covariance matrix

S = (1/n)[(X1 − X̄)(X1 − X̄)′ + . . . + (Xn − X̄)(Xn − X̄)′]

are maximum likelihood estimators of the corresponding parameters of the population; here, (Xj)′ is the transpose of the vector Xj (see MATRIX). The distribution of X̄ is normal with mean μ and covariance matrix Σ/n. The joint distribution of the elements of the sample covariance matrix S, known as the Wishart distribution, is a natural generalization of the chi-square distribution and plays an important role in multivariate analysis.
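The following sketch (an illustration added here, assuming NumPy; the particular μ and Σ are arbitrary) computes the sample mean vector and the maximum-likelihood covariance estimate directly from the formulas above, and compares the latter with the unbiased estimate that divides by n − 1.

    import numpy as np

    rng = np.random.default_rng(1)
    mu = np.array([0.0, 2.0, -1.0])                      # true mean vector (arbitrary)
    Sigma = np.array([[1.0, 0.5, 0.2],
                      [0.5, 2.0, 0.3],
                      [0.2, 0.3, 1.5]])                  # true covariance matrix (arbitrary)
    n = 500
    X = rng.multivariate_normal(mu, Sigma, size=n)       # rows are X_1, ..., X_n

    X_bar = X.mean(axis=0)                               # sample mean vector
    centered = X - X_bar
    S_ml = centered.T @ centered / n                     # maximum likelihood: divisor n, not n-1

    print("sample mean:", X_bar)
    print("ML covariance estimate:\n", S_ml)
    # For comparison, np.cov uses the unbiased divisor n-1:
    print("unbiased estimate:\n", np.cov(X, rowvar=False))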

A number of problems in multivariate analysis are more or less analogous to the corresponding univariate problems—for example, the problem of testing hypotheses on the equality of the means in two independent samples. Examples of other problems are the testing of hypotheses on the independence of particular groups of components of the vectors Xj and the testing of such special hypotheses as the spherical symmetry of the distribution of the Xj. The need to understand the complicated relationships between the components of the random vectors Xj leads to new problems. The method of principal components and the method of canonical correlations are used to reduce the number of random characteristics—that is, the number of dimensions—under consideration or to reduce the characteristics to independent random variables.
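One such problem, testing the equality of mean vectors in two independent samples, can be illustrated with Hotelling's two-sample T² statistic, the multivariate analogue of the two-sample t test. The sketch below is an added illustration, assuming NumPy and SciPy, synthetic data, and equal covariance matrices in the two populations; none of the numbers come from the article.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    p = 3
    X = rng.multivariate_normal(np.zeros(p), np.eye(p), size=60)
    Y = rng.multivariate_normal(np.array([0.2, 0.0, 0.4]), np.eye(p), size=80)

    n1, n2 = len(X), len(Y)
    d = X.mean(axis=0) - Y.mean(axis=0)                  # difference of sample means
    S_pooled = ((n1 - 1) * np.cov(X, rowvar=False) +
                (n2 - 1) * np.cov(Y, rowvar=False)) / (n1 + n2 - 2)

    # Hotelling's two-sample T^2 and its exact F conversion under normality.
    T2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.solve(S_pooled, d)
    F = T2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
    p_value = stats.f.sf(F, p, n1 + n2 - p - 1)
    print(f"T^2 = {T2:.3f}, F = {F:.3f}, p-value = {p_value:.4f}")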

In the method of principal components, the vectors Xj are carried by a linear transformation into the vectors Yj = (Yj1, . . . , Yjr). The components of Yj are chosen such that Yj1 has the maximum variance among normalized linear combinations of the components of Xj, Yj2 has the maximum variance among the linear functions of the components of Xj that are uncorrelated with Yj1, and so on.
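This construction amounts to an eigendecomposition of the sample covariance matrix. The sketch below (an added example assuming NumPy, with arbitrary synthetic data) verifies that the resulting components have decreasing variances and are mutually uncorrelated.

    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.multivariate_normal([0, 0, 0],
                                [[4.0, 2.0, 0.5],
                                 [2.0, 3.0, 0.8],
                                 [0.5, 0.8, 1.0]], size=1000)

    S = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(S)            # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]               # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    Y = (X - X.mean(axis=0)) @ eigvecs              # Y_1 has the largest variance
    print("component variances:", np.var(Y, axis=0, ddof=1))
    print("components are (nearly) uncorrelated:\n",
          np.corrcoef(Y, rowvar=False).round(3))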

In the method of canonical correlations, two sets of random variables (components of Xj) are replaced by smaller sets. First, linear combinations, one from each set of variables, are constructed so as to have maximum simple correlation with each other. These linear combinations are called the first pair of canonical variables, and their correlation is the first canonical correlation. The process is continued with the construction of further pairs of linear combinations. It is required, however, that each new canonical variable be uncorrelated with all previous ones. The method of canonical correlations indicates the maximum correlation between linear functions of two groups of components of the observation vector.
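As a sketch of this construction, the canonical correlations can be computed as the singular values of S11^(-1/2) S12 S22^(-1/2), where the Sij are the sample covariance blocks of the two groups of components. The example below is an added illustration assuming NumPy; the synthetic data give the two groups a shared two-dimensional latent structure, and the helper inv_sqrt is introduced here purely for the example.

    import numpy as np

    def inv_sqrt(A):
        """Inverse symmetric square root of a positive-definite matrix."""
        w, V = np.linalg.eigh(A)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    rng = np.random.default_rng(4)
    n = 500
    Z = rng.normal(size=(n, 2))                                        # shared latent structure
    X1 = Z @ rng.normal(size=(2, 3)) + 0.5 * rng.normal(size=(n, 3))   # first group of components
    X2 = Z @ rng.normal(size=(2, 4)) + 0.5 * rng.normal(size=(n, 4))   # second group of components

    X = np.hstack([X1, X2])
    S = np.cov(X, rowvar=False)
    p = X1.shape[1]
    S11, S12, S22 = S[:p, :p], S[:p, p:], S[p:, p:]

    M = inv_sqrt(S11) @ S12 @ inv_sqrt(S22)
    canonical_correlations = np.linalg.svd(M, compute_uv=False)
    print("canonical correlations:", canonical_correlations.round(3))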

The results of the method of principal components and the method of canonical correlations contribute to an understanding of the structure of the multivariate population under consideration. Also of use in this regard is factor analysis, in which the components of the random vectors Xj are assumed to be linear functions of some unobserved factors that are to be studied.
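A minimal sketch of the factor-analysis idea, assuming scikit-learn's FactorAnalysis (one possible implementation, not one prescribed by the entry): the observed coordinates are generated as linear functions of two unobserved factors plus noise, and the fitted model recovers loadings and noise variances.

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(5)
    n = 400
    factors = rng.normal(size=(n, 2))                       # unobserved factors
    loadings = rng.normal(size=(2, 6))                      # linear structure (arbitrary)
    X = factors @ loadings + 0.3 * rng.normal(size=(n, 6))  # observed vectors

    fa = FactorAnalysis(n_components=2).fit(X)
    print("estimated loadings:\n", fa.components_.round(2))
    print("noise variances:", fa.noise_variance_.round(3))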

Multivariate analysis also deals with the problem of differentiating two or more populations from the results of observations. One aspect of this problem is known as the discrimination problem and consists in the assignment of a new element to one of several populations on the basis of an analysis of samples of the populations. Another aspect involves dividing the elements of a population into groups that differ maximally, in some sense, from each other.
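A hedged sketch of the discrimination problem, using scikit-learn's LinearDiscriminantAnalysis as one common assignment rule (the entry does not prescribe a particular method): samples from two known populations are used to assign a new element to one of them. The data and the new element are invented for the example.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(6)
    pop_a = rng.multivariate_normal([0, 0], np.eye(2), size=100)
    pop_b = rng.multivariate_normal([2, 1], np.eye(2), size=100)

    X = np.vstack([pop_a, pop_b])
    y = np.array([0] * 100 + [1] * 100)              # population labels

    rule = LinearDiscriminantAnalysis().fit(X, y)    # assignment rule from the samples
    new_element = np.array([[1.5, 0.4]])
    print("assigned to population:", rule.predict(new_element)[0])
    print("posterior probabilities:", rule.predict_proba(new_element).round(3))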


A. V. PROKHOROV
