# statistics and statistical analysis

the assembly and mathematical analysis of numerical data (e.g. CENSUS or SOCIAL SURVEY data). This involves both description and inference. *Descriptive statistics* involve:

- organizing the data using MEASURES OF CENTRAL TENDENCY and MEASURES OF DISPERSION, and graphical representations (e.g. HISTOGRAMS and PIE-GRAPHS);
- in more sophisticated forms, the use of measures of the ‘association’ between variables (e.g. CORRELATION and REGRESSION).

*Inferential statistics* uses PROBABILITY theory and random sampling (see RANDOM SAMPLE) to permit inferences to be made from a sample to a larger population (see SIGNIFICANCE TESTS). The fundamental idea involved in statistical analysis of this type is that repeatable phenomena (e.g. tossing a coin) can be assumed to conform to an underlying probabilistic model. Thus an *estimation* is a ‘best guess’ about the features of a population, derived from a sample, to which is attached a level of probability that it is correct.
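The distinction between description and inference can be illustrated with a minimal Python sketch. It is not drawn from the entry itself: the data values, the function names (`describe`, `pearson_r`, `estimate_proportion`), the simulated fair coin, and the use of a normal approximation with a critical value of 1.96 for an approximate 95% interval are all assumptions made for illustration.

```python
import math
import random
import statistics


def describe(values):
    """Descriptive statistics: central tendency and dispersion."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient, one measure of association."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def estimate_proportion(sample, z=1.96):
    """Estimate a population proportion from a random sample of 0/1 values.

    Returns the point estimate and an approximate confidence interval
    based on the normal approximation (z=1.96 is an assumed critical
    value corresponding to roughly 95% confidence).
    """
    n = len(sample)
    p_hat = sum(sample) / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, (p_hat - z * se, p_hat + z * se)


# Description: hypothetical paired observations (illustrative values only).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]
summary = describe(hours)
r = pearson_r(hours, scores)
print(summary, r)

# Inference: a random sample of tosses from a simulated fair coin,
# used to estimate the underlying probability of heads.
rng = random.Random(0)
tosses = [1 if rng.random() < 0.5 else 0 for _ in range(1000)]
p_hat, (low, high) = estimate_proportion(tosses)
print(p_hat, low, high)
```

The first half only summarizes the data in hand. The second half treats the tosses as a sample from a probabilistic model and attaches an interval to the ‘best guess’, which is the sense of *estimation* described above.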

Modern statistical analysis has its roots in the work of 18th- and early 19th-century theorists such as Laplace, Poisson, and Gauss, and in the work of early 19th-century social statisticians such as QUETELET. However, the modern discipline stems especially from the work of Francis Galton (1822-1911), who applied the NORMAL DISTRIBUTION to the study of human characteristics and also popularized the ‘correlation coefficient’. Karl Pearson (1857-1936), a follower of Galton, added notions of ‘goodness of fit’ (see CHI SQUARE), and W. S. Gosset (1876-1937) developed the t-test for use with small samples (see also NONPARAMETRIC STATISTICS for situations where ratio or interval levels of measurement cannot be assumed, and CRITERIA AND LEVELS OF MEASUREMENT). Significance tests were added to the armoury of techniques by Ronald Fisher (1890-1962).

An important advance in recent decades has been the advent of high-speed and now widely available computer technology, which has removed much of the hard grind previously associated with the use of statistics (see STATISTICAL PACKAGE FOR THE SOCIAL SCIENCES (SPSS) and MINITAB). However, while this development has many advantages, one disadvantage is that it has sometimes encouraged the use of statistical techniques that are only half understood, leading to unwarranted inferences.

While statistical analysis is well established and has become an important adjunct to many disciplines, including most of the social sciences, it has been subjected to a number of criticisms, especially that of Selvin (1958), who argued that the requirements for satisfactory use of significance tests are rarely met in the social sciences. There also exist notable divisions of view within the discipline of statistics, e.g. the division between orthodox and Bayesian statistics (see BAYES’ THEOREM).

Compare MATHEMATICAL SOCIOLOGY.