# factor analysis


## factor analysis

[′fak·tər ə‚nal·ə·səs]

## factor analysis

a multivariate statistical technique in which the covariances (or correlations) between a large set of observed variables are explained in terms of a small number of new variables called factors. The ideas originated in the work on correlation by Galton and Spearman, and were developed primarily in studies of intelligence. Most applications are found in psychology and sociology.

The technique is ‘variable directed’, with no distinction between independent and dependent variables in the data set. There are four steps to the analysis. The first is to derive a correlation matrix in which each variable in the data set is correlated with all the other variables. The next step is to extract the factors. The aim of this stage is to determine the minimum number of factors that can account adequately for the observed correlations between the original variables. If the number of factors identified is close to the number of original variables, there is little point to the factor analysis. Sometimes it is difficult to assign a meaningful name to the factors. The purpose of the third (optional) step, rotation, is to find simpler and more easily interpretable factors. If a satisfactory model has been derived, the fourth step is to compute scores for each factor for each case in the data set. The factor scores can then be used in subsequent analyses.
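The steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the data are made up, only the standard library is used, the function names are hypothetical, and a single factor is extracted by the principal-component method (power iteration on the correlation matrix), which is only one of several extraction methods. With one factor there is nothing to rotate, so the optional third step is skipped.

```python
import math

# Hypothetical data: 6 cases (rows) measured on 3 variables (columns).
data = [
    [2.0, 3.0, 1.0],
    [4.0, 5.0, 2.0],
    [5.0, 4.0, 4.0],
    [7.0, 8.0, 5.0],
    [8.0, 7.0, 7.0],
    [10.0, 11.0, 8.0],
]

def standardize(rows):
    """Centre each variable and scale it to unit sample variance."""
    n_obs, n_var = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n_obs for j in range(n_var)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n_obs - 1))
           for j in range(n_var)]
    return [[(r[j] - means[j]) / sds[j] for j in range(n_var)] for r in rows]

def correlation_matrix(z):
    """Step 1: product-moment correlations from standardized data."""
    n_obs, n_var = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n_obs - 1) for j in range(n_var)]
            for i in range(n_var)]

def leading_eigenpair(R, iterations=500):
    """Step 2: largest eigenvalue and unit eigenvector by power iteration."""
    v = [1.0] * len(R)
    for _ in range(iterations):
        w = [sum(r_ij * v_j for r_ij, v_j in zip(row, v)) for row in R]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v'Rv for the unit vector v.
    lam = sum(v[i] * sum(R[i][j] * v[j] for j in range(len(R)))
              for i in range(len(R)))
    return lam, v

z = standardize(data)
R = correlation_matrix(z)
lam, v = leading_eigenpair(R)

# Loadings of each variable on the single extracted factor.
loadings = [x * math.sqrt(lam) for x in v]

# Step 4: one score per case, scaled to unit variance.
scores = [sum(z_ij * v_j for z_ij, v_j in zip(row, v)) / math.sqrt(lam)
          for row in z]

print("share of variance explained:", lam / len(R))
print("loadings:", loadings)
print("scores:", scores)
```

The ratio `lam / len(R)` is the share of total standardized variance attributed to the factor; if several factors were needed to reach an adequate share in data with few variables, the point made above about the analysis being of little use would apply.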

Factor analysis attracts a lot of criticism (Chatfield and Collins, 1980). The observed correlation matrix is generally assumed to have been constructed using product moment correlations. Hence, the usual assumptions of interval-level measurement, normal distributions and homogeneity of variance are needed. Against this, it is argued that the technique is fairly robust. Another problem is that the different methods of extraction and rotation tend to produce different solutions. Further, although factors may be clearly identified from the analysis, it may be difficult to give them a meaningful interpretation. Despite the need for so many judgmental decisions in its use, factor analysis remains a useful exploratory tool.

## Factor Analysis

a branch of multivariate analysis embracing methods for estimating the dimensions of a set of observed variables by studying the structure of the covariance or correlation matrices.

The basic assumption underlying factor analysis is that the correlations between a large number of observable variables are determined by the existence of a smaller number of hypothetical unobservable variables, or factors. A general model for factor analysis is provided in terms of the random variables $X_1, \ldots, X_n$, which are the observation results, by the following linear model:

$$X_i = \sum_{j=1}^{k} a_{ij} f_j + b_i U_i + \varepsilon_i, \qquad i = 1, \ldots, n. \tag{*}$$

Here, the random variables $f_j$ are common factors, the random variables $U_i$ are factors specific to the variables $X_i$ and are not correlated with the $f_j$, and the $\varepsilon_i$ are random errors. It is assumed that $k < n$, that the random variables $\varepsilon_i$ are independent of each other and of the $f_j$ and $U_i$, and that $\mathsf{E}\varepsilon_i = 0$ and $\mathsf{D}\varepsilon_i = \sigma_i^2$. The constant coefficients $a_{ij}$ are called loadings (weights): $a_{ij}$ is the loading of the $i$th variable on the $j$th factor. The quantities $a_{ij}$, $b_i$, and $\sigma_i^2$ are taken as unknown parameters that have to be estimated.

In the form given above, the model for factor analysis is characterized by some indeterminacy, since $n$ variables are expressed in terms of $n + k$ other variables. Equations (*), however, imply a hypothesis regarding the covariance matrix that can be tested. For example, if the factors $f_j$ are uncorrelated, $\mathsf{D}f_j = 1$, $b_i = 0$, and $c_{ij}$ are the elements of the matrix of covariances between the $X_i$, then there follows from equation (*) an expression for the $c_{ij}$ in terms of the loadings and the error variances $\sigma_i^2 = \mathsf{D}\varepsilon_i$:

$$c_{ij} = \sum_{l=1}^{k} a_{il} a_{jl} \quad (i \neq j), \qquad c_{ii} = \sum_{l=1}^{k} a_{il}^2 + \sigma_i^2.$$

The general model for factor analysis is thus equivalent to a hypothesis regarding the covariance matrix: the covariance matrix can be represented as the sum of the matrix $AA'$ and the diagonal matrix with elements $\sigma_i^2$, where $A = \{a_{ij}\}$.
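The covariance hypothesis can be checked numerically. The sketch below is illustrative only: the loading matrix `A` and the error variances `sigma2` are made-up values, the function names are hypothetical, and the factors and errors are drawn from normal distributions, which is a convenience of the simulation rather than something the covariance identity requires. It takes the special case with no specific factors (all specific-factor coefficients zero) and unit-variance uncorrelated common factors, builds the matrix $AA'$ plus the diagonal matrix of error variances, and compares it with the sample covariance of simulated observations.

```python
import random

def implied_covariance(A, sigma2):
    """Return A A' + diag(sigma2) for an n x k loading matrix A."""
    n = len(A)
    C = [[sum(a * b for a, b in zip(A[i], A[j])) for j in range(n)]
         for i in range(n)]
    for i in range(n):
        C[i][i] += sigma2[i]
    return C

def simulate(A, sigma2, n_obs, seed=0):
    """Draw n_obs observations X_i = sum_j a_ij f_j + eps_i (no specific factors)."""
    rng = random.Random(seed)
    k = len(A[0])
    rows = []
    for _ in range(n_obs):
        # Assumption: uncorrelated common factors with mean 0 and variance 1.
        f = [rng.gauss(0.0, 1.0) for _ in range(k)]
        rows.append([sum(a * f_j for a, f_j in zip(row, f))
                     + rng.gauss(0.0, s2 ** 0.5)
                     for row, s2 in zip(A, sigma2)])
    return rows

def sample_covariance(rows):
    """Unbiased sample covariance matrix of the columns of rows."""
    n_obs, n = len(rows), len(rows[0])
    means = [sum(r[i] for r in rows) / n_obs for i in range(n)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n_obs - 1)
             for j in range(n)] for i in range(n)]

# Hypothetical parameters: n = 3 observed variables, k = 2 common factors.
A = [[0.9, 0.0],
     [0.8, 0.1],
     [0.1, 0.7]]
sigma2 = [0.19, 0.35, 0.50]

C = implied_covariance(A, sigma2)
S = sample_covariance(simulate(A, sigma2, n_obs=20000, seed=1))

for implied_row, sample_row in zip(C, S):
    print([round(x, 3) for x in implied_row], [round(x, 3) for x in sample_row])
```

With these values every diagonal element of `C` works out to 1, so `C` is also the implied correlation matrix; the sample matrix `S` should agree with it up to sampling error.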

The estimation procedure in factor analysis consists of two steps. First, the factor structure (that is, the number of factors required to account for the correlations between the $X_i$) is determined, and the loadings are estimated. Second, the factors are estimated on the basis of the observation results. The fundamental obstacle to the interpretation of the set of factors is that for $k > 1$ neither the loadings nor the factors can be determined uniquely, since the factors $f_j$ in equations (*) can be replaced by means of any orthogonal transformation. This property of the model is made use of to transform (rotate) the factors; the transformation is chosen so that the observed variables have the maximum possible loadings on one factor and minimum possible loadings on the remaining factors.
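The rotational indeterminacy described above is easy to demonstrate for two factors. In the sketch below, which uses made-up loadings and hypothetical function names, the loading matrix is multiplied by a 2 x 2 rotation matrix: the product of the loading matrix with its transpose, and hence the fit to the covariance matrix, does not change, but the pattern of loadings does. To pick one rotation, the example uses the raw varimax criterion (the sum over factors of the variance of the squared loadings) with a coarse one-degree grid search. Varimax is one common criterion for this purpose and is an assumption of this example; the text above does not name a specific criterion, and practical implementations use an iterative algorithm rather than a grid.

```python
import math

def rotate(A, theta):
    """Right-multiply an n x 2 loading matrix by a rotation through angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[a1 * c - a2 * s, a1 * s + a2 * c] for a1, a2 in A]

def common_part(A):
    """Return A A', the part of the covariance matrix due to common factors."""
    return [[sum(x * y for x, y in zip(ri, rj)) for rj in A] for ri in A]

def raw_varimax(A):
    """Sum over factors of the variance of the squared loadings."""
    n = len(A)
    total = 0.0
    for j in range(len(A[0])):
        sq = [row[j] ** 2 for row in A]
        mean = sum(sq) / n
        total += sum(x * x for x in sq) / n - mean * mean
    return total

# Hypothetical loadings with a clean pattern: each variable loads on one factor.
A0 = [[0.8, 0.0],
      [0.7, 0.0],
      [0.0, 0.6],
      [0.0, 0.9]]

# The same solution seen after an arbitrary 30-degree rotation: it fits
# equally well, but every variable now loads on both factors.
A = rotate(A0, math.radians(30))

# Grid search over whole degrees for the rotation maximizing raw varimax.
best_deg = max(range(90), key=lambda d: raw_varimax(rotate(A, math.radians(d))))
rotated = rotate(A, math.radians(best_deg))

print("best rotation (degrees):", best_deg)
for row in rotated:
    print([round(x, 3) for x in row])
```

The recovered pattern matches `A0` only up to the order and sign of the factors, which is a further reminder that factor labels come from interpretation, not from the algebra.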

Various practical methods are known for estimating loadings. The methods assume that $X_1, \ldots, X_n$ obey a multivariate normal distribution with covariance matrix $C = \{c_{ij}\}$. The maximum likelihood method is noteworthy. It leads to a unique set of estimates of the $c_{ij}$, but for the estimates of the $a_{ij}$ it yields equations that are satisfied by an infinite set of solutions with equally good statistical properties.

Factor analysis is regarded as dating from 1904. Although it was originally developed for problems in psychology, the range of its applications is much broader, and it is now used to solve various practical problems in such fields as medicine, economics, and chemistry. A rigorous theoretical grounding, however, has not yet been provided for many results and methods of factor analysis that are widely used in practice. The mathematical description of modern factor analysis in a rigorous manner is an extremely difficult task and remains uncompleted.

### REFERENCES

Lawley, D., and A. Maxwell. *Faktornyi analiz kak statisticheskii metod*. Moscow, 1967. (Translated from English.)

Harman, H. *Sovremennyi faktornyi analiz*. Moscow, 1972. (Translated from English.)

A. V. PROKHOROV