Analysis of Variance

Analysis of variance refers to a family of statistical methods for comparing group means. The hypothesis to be tested is that there is no difference between the groups.

The methods discussed above for testing statistical hypotheses about the significance of the difference between two means are of limited practical use. This is because, in order to identify the effect of all possible conditions and factors on the resulting trait, field and laboratory experiments are, as a rule, carried out using not two but a larger number of samples (12-20 or more).

Often, researchers compare the means of several samples combined into a single complex. For example, when studying the effect of various types and doses of fertilizers on crop yields, experiments are repeated in several variants. In these cases, pairwise comparisons become cumbersome, and statistical analysis of the entire complex requires a special method. This method, developed in mathematical statistics, is called analysis of variance. It was first used by the English statistician R. Fisher when processing the results of agronomic experiments (in the 1920s).

Analysis of variance is a method of statistical evaluation of the reliability of the dependence of the resulting trait on one or more factors. Using analysis of variance, statistical hypotheses are tested about the means of several populations that have a normal distribution.

Analysis of variance is one of the main methods of statistical evaluation of the results of an experiment. It is also increasingly used in the analysis of economic information. Analysis of variance makes it possible to establish whether the sample indicators of the relationship between the resulting and factor traits are sufficient to extend the conclusions drawn from the sample to the general population. The advantage of this method is that it gives fairly reliable conclusions from small samples.

By examining the variation of the resulting trait under the influence of one or more factors, analysis of variance yields, in addition to overall estimates of the significance of the dependences, estimates of the differences between the mean values formed at different levels of the factors and of the significance of the interaction of the factors. Analysis of variance is used to study the dependences of both quantitative and qualitative traits, as well as their combination.

The essence of this method is the statistical study of the probability of the influence of one or more factors, as well as their interaction, on the resulting trait. Accordingly, analysis of variance solves three main tasks: 1) an overall assessment of the significance of differences between group means; 2) an assessment of the probability of interaction between factors; 3) an assessment of the significance of differences between pairs of means. Researchers most often face these tasks in field and zootechnical experiments, where the influence of several factors on the resulting trait is studied.

The principal scheme of analysis of variance includes establishing the main sources of variation of the resulting trait and determining the volume of variation (the sums of squared deviations) according to the sources of its formation; determining the numbers of degrees of freedom corresponding to the components of the total variation; calculating the variances as the ratios of the corresponding volumes of variation to their numbers of degrees of freedom; analyzing the ratios between the variances; and assessing the reliability of the difference between the means and formulating conclusions.

This scheme is preserved both in simple ANOVA models, when data are grouped according to one attribute, and in complex models, when data are grouped according to two or more attributes. However, with an increase in the number of group characteristics, the process of decomposition of the general variation according to the sources of its formation becomes more complicated.

Following this general scheme, analysis of variance can be presented as five successive stages:

1) definition and decomposition of variation;

2) determination of the number of degrees of freedom of variation;

3) calculation of the variances;

4) analysis of the ratios of the variances;

5) assessment of the reliability of the difference between the means and the formulation of conclusions on testing the null hypothesis.

The most time-consuming part of the analysis of variance is the first stage - the definition and decomposition of the variation by the sources of its formation. The order of expansion of the total volume of variation was discussed in detail in Chapter 5.

The basis for solving the problems of variance analysis is the law of decomposition (addition) of variation, according to which the total variation (fluctuation) of the resulting trait is divided into two parts: the variation due to the action of the studied factor (or factors) and the variation caused by the action of random causes, that is

SS_total = SS_factor + SS_residual.

Let us assume that the population under study is divided according to a factor attribute into several groups, each characterized by its own mean of the resulting trait. The variation of these values can be explained by two kinds of causes: those that act systematically on the resulting trait and can be controlled in the course of the experiment, and those that cannot be controlled. It is obvious that intergroup (factorial, or systematic) variation depends mainly on the action of the studied factor, while intragroup (residual, or random) variation depends on the action of random factors.

To assess the significance of differences between group means, it is necessary to determine the intergroup and intragroup variation. If the intergroup (factorial) variation significantly exceeds the intragroup (residual) variation, then the factor influenced the resulting trait, significantly changing the values of the group means. But the question arises: what ratio between the intergroup and intragroup variation can be considered sufficient to conclude that the differences between the group means are reliable (significant)?

To assess the significance of differences between the means and to formulate conclusions on testing the null hypothesis (H0: μ1 = μ2 = ... = μn), analysis of variance uses a kind of standard: the F-criterion, whose distribution law was established by R. Fisher. This criterion is the ratio of two variances, the factorial variance, generated by the action of the studied factor, and the residual variance, due to the action of random causes:

F = s²_factor / s²_residual.

The variance ratio F = s²_factor / s²_residual was proposed by the American statistician Snedecor to be denoted by the letter F in honor of R. Fisher, the inventor of analysis of variance.

The variances s²_factor and s²_residual are estimates of the population variance. If samples with variances s²_factor and s²_residual are drawn from the same general population, where the variation of values was random, then the discrepancy between the values of s²_factor and s²_residual is also random.

If the experiment simultaneously tests the influence of several factors (A, B, C, etc.) on the resulting trait, then the variance due to the action of each of them should be compared with s²_residual, i.e.

F_A = s²_A / s²_residual,  F_B = s²_B / s²_residual,  F_C = s²_C / s²_residual.

If the factor variance is significantly greater than the residual variance, then the factor significantly influenced the resulting trait, and vice versa.

In multifactor experiments, in addition to the variation due to the action of each factor, there is almost always variation due to the interaction of the factors (s²_AB, s²_AC, s²_BC, s²_ABC). The essence of interaction is that the effect of one factor changes significantly at different levels of the second (for example, the effect of fertilizer doses differs on soils of different quality).

The interaction of factors is also assessed by comparing the corresponding variances with the residual variance s²_residual:

F_AB = s²_AB / s²_residual, and similarly for the other interactions.

When calculating the actual value of the F-criterion, the larger of the variances is taken as the numerator, so F ≥ 1. Obviously, the larger the F-criterion, the greater the differences between the variances. If F = 1, the question of assessing the significance of the differences in variances does not arise.

To determine the limits of random fluctuation of the ratio of variances, R. Fisher developed special tables of the F-distribution (Appendices 4 and 5). The F-criterion is functionally related to probability and depends on the numbers of degrees of freedom k1 and k2 of the two compared variances. Two tables are usually used to draw conclusions about the upper limiting value of the criterion for significance levels of 0.05 and 0.01. A significance level of 0.05 (or 5%) means that only in 5 cases out of 100 can the F-criterion take a value equal to or higher than that indicated in the table. A decrease in the significance level from 0.05 to 0.01 leads to an increase in the tabular value of the F-criterion for two variances that differ due to the action of random causes alone.

The value of the criterion also depends on the numbers of degrees of freedom of the two compared variances. If the number of degrees of freedom tends to infinity (k → ∞), the ratio of the two variances tends to unity.

The tabular value of the F-criterion shows the possible random values of the ratio of two variances at a given significance level and the corresponding numbers of degrees of freedom for each of the compared variances. These tables give the value of F for samples drawn from the same general population, where the causes of variation in values are purely random.

The value of F is found from the tables (Appendices 4 and 5) at the intersection of the corresponding column (the number of degrees of freedom of the larger variance, k1) and row (the number of degrees of freedom of the smaller variance, k2). Thus, if the larger variance (the numerator of F) has k1 = 4 and the smaller one (the denominator of F) has k2 = 9, then Fα at a significance level of α = 0.05 is 3.63 (Appendix 4). So, as a result of the action of random causes alone, and because the samples are small, the variance of one sample can exceed the variance of the second sample by a factor of 3.63 at the 5% significance level. When the significance level is lowered from 0.05 to 0.01, the tabular value of the criterion, as noted above, increases: with the same degrees of freedom k1 = 4 and k2 = 9 and α = 0.01, the tabular value of the F-criterion is 6.99 (Appendix 5).

Consider the procedure for determining the numbers of degrees of freedom in analysis of variance. Just as the total sum of squared deviations is decomposed into its components, the total number of degrees of freedom (k0) is decomposed into the numbers of degrees of freedom for the intergroup (k1) and intragroup (k2) variation.

Thus, if a sample consisting of N observations is divided into m groups (the number of experiment variants) and n subgroups (the number of repetitions), then the numbers of degrees of freedom k are, respectively:

a) for the total sum of squared deviations (SS_total): k0 = N - 1;

b) for the intergroup sum of squared deviations (SS_between): k1 = m - 1;

c) for the intragroup sum of squared deviations (SS_within): k2 = N - m.

According to the addition rule of variation: k0 = k1 + k2.

For example, if the experiment comprised four variants (m = 4) in five repetitions each (n = 5), for a total of N = m · n = 4 × 5 = 20 observations, then the numbers of degrees of freedom are, respectively: k0 = 20 - 1 = 19; k1 = 4 - 1 = 3; k2 = 20 - 4 = 16.
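The degrees-of-freedom bookkeeping for this example (m = 4 variants, n = 5 repetitions) can be sketched in a few lines:

```python
# Degrees-of-freedom decomposition for a one-way layout:
# m experiment variants, n repetitions, N = m * n observations.
m, n = 4, 5
N = m * n

k_total = N - 1      # for the total sum of squared deviations
k_between = m - 1    # for the intergroup (factor) sum of squares
k_within = N - m     # for the intragroup (residual) sum of squares

# Addition rule for degrees of freedom.
assert k_total == k_between + k_within

print(k_total, k_between, k_within)  # 19 3 16
```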

Knowing the sums of squared deviations and the numbers of degrees of freedom, one can determine unbiased (adjusted) estimates of the three variances:

s²_total = SS_total / k0;  s²_between = SS_between / k1;  s²_within = SS_within / k2.

The null hypothesis H0 is tested by the F-criterion in the same way as by Student's t-test. To decide on H0, it is necessary to calculate the actual value of the criterion and compare it with the tabular value Fα for the accepted significance level α and the numbers of degrees of freedom k1 and k2 of the two variances.

If F_fact > Fα, then, in accordance with the accepted significance level, we can conclude that the differences in the sample variances are determined not only by random factors; they are significant. In this case, the null hypothesis is rejected and there is reason to believe that the factor significantly affects the resulting trait. If F_fact < Fα, then the null hypothesis is accepted and there is reason to assert that the differences between the compared variances lie within the limits of possible random fluctuation: the effect of the factor on the resulting trait is not significant.

The use of one or another ANOVA model depends both on the number of factors studied and on the method of sampling.

Depending on the number of factors that determine the variation of the resulting trait, samples can be formed by one, two or more factors. Accordingly, analysis of variance is divided into one-factor and multifactor. It is also called a one-factor or multifactor dispersion complex.

The scheme of decomposition of the total variation depends on how the groups are formed. Grouping can be random (the observations of one group are not related to the observations of the second group) or non-random (the observations of two samples are interconnected by the common conditions of the experiment). Accordingly, independent and dependent samples are obtained. Independent samples can be formed with both equal and unequal sizes. Dependent samples must be of equal size.

If the groups are formed in a non-random order, the total variation of the resulting trait includes, along with the factorial (intergroup) and residual variation, the variation of repetitions, that is

SS_total = SS_factor + SS_repetitions + SS_residual.

In practice, it is most often necessary to deal with dependent samples, when the conditions for groups and subgroups are equalized. Thus, in a field experiment, the entire plot is divided into blocks with conditions equalized as far as possible. Each variant of the experiment then has an equal opportunity to be represented in all blocks, which equalizes the conditions for all tested variants of the experiment. This method of constructing an experiment is called the method of randomized blocks. Experiments with animals are carried out similarly.

When processing socio-economic data by analysis of variance, it must be borne in mind that, owing to the large number of factors and their interrelation, it is difficult, even with the most careful equalization of conditions, to establish the degree of objective influence of each individual factor on the resulting trait. Therefore, the level of residual variation is determined not only by random causes, but also by substantial factors that were not taken into account when building the ANOVA model. As a result, the residual variance as a basis for comparison sometimes becomes inadequate for its purpose: it is clearly overestimated in magnitude and cannot serve as a criterion of the significance of the influence of the factors. In this regard, when building models of variance analysis, the problem of selecting the critical factors and equalizing the conditions for the manifestation of the action of each of them becomes relevant. Besides, the use of analysis of variance assumes a normal or close-to-normal distribution of the statistical populations under study. If this condition is not met, the estimates obtained in the analysis of variance will be exaggerated.

Analysis of variance (from the Latin dispersio, "dispersion"; in English, Analysis of Variance, or ANOVA) is used to study the influence of one or more qualitative variables (factors) on one dependent quantitative variable (the response).

Analysis of variance is based on the assumption that some variables can be considered as causes (factors, independent variables) and others as consequences (dependent variables). Independent variables are sometimes called adjustable factors precisely because the researcher can vary them in the experiment and analyze the resulting outcome.

The main goal of analysis of variance (ANOVA) is to study the significance of differences between means by comparing (analyzing) variances. Dividing the total variance into multiple sources allows one to compare the variance caused by the differences between groups with the variance caused by within-group variability. If the null hypothesis is true (the means of several groups of observations selected from the general population are equal), the estimate of variance associated with intragroup variability should be close to the estimate of intergroup variance. If you are simply comparing the means of two samples, analysis of variance gives the same result as an ordinary independent-samples t-test (if two independent groups of objects or observations are compared) or a dependent-samples t-test (if two variables are compared on the same set of objects or observations).
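The claimed equivalence between a two-group ANOVA and the independent-samples t-test can be checked numerically: the ANOVA F statistic equals the square of the t statistic, and the p-values coincide. A sketch with SciPy, using arbitrary illustrative data:

```python
import numpy as np
from scipy.stats import f_oneway, ttest_ind

# Two independent groups; the numbers are invented for illustration.
a = np.array([2.0, 3.0, 1.0, 4.0])
b = np.array([6.0, 7.0, 5.0, 8.0])

F, p_anova = f_oneway(a, b)       # one-way ANOVA
t, p_ttest = ttest_ind(a, b)      # equal-variance two-sample t-test

print(np.isclose(F, t**2))           # True: F = t^2 for two groups
print(np.isclose(p_anova, p_ttest))  # True: identical p-values
```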

The essence of analysis of variance lies in the division of the total variance of the studied trait into separate components, due to the influence of specific factors, and testing hypotheses about the significance of the influence of these factors on the studied trait. By comparing the components of the dispersion with each other using the Fisher F-test, it is possible to determine what proportion of the total variability of the resulting trait is due to the action of adjustable factors.

The source material for analysis of variance is data from the study of three or more samples, which may be equal or unequal in size, connected or disconnected. According to the number of adjustable factors identified, analysis of variance can be one-factor (the influence of one factor on the results of the experiment is studied), two-factor (the influence of two factors is studied), or multifactor (allowing one to evaluate not only the influence of each factor separately but also their interaction).

Analysis of variance belongs to the group of parametric methods, and therefore it should be used only when it has been shown that the distribution is normal.

Analysis of variance is used if the dependent variable is measured on a scale of ratios, intervals, or order, and the influencing variables are non-numeric (name scale).

Task examples

In problems solved by analysis of variance, there is a response of a numerical nature, which is affected by several variables of a nominal nature: for example, several types of livestock fattening rations, or two ways of keeping the animals, etc.

Example 1: During the week, several pharmacy kiosks operated in three different locations. In the future, we can leave only one. It is necessary to determine whether there is a statistically significant difference between the sales volumes of drugs in kiosks. If yes, we will select the kiosk with the highest average daily sales volume. If the difference in sales volume turns out to be statistically insignificant, then other indicators should be the basis for choosing a kiosk.
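Example 1 maps directly onto a one-way ANOVA. A sketch with SciPy; the daily sales figures below are invented purely for illustration:

```python
from scipy.stats import f_oneway

# Hypothetical daily sales (packages) for three kiosks over one week.
kiosk_1 = [38, 42, 40, 37, 41, 39, 43]
kiosk_2 = [45, 48, 44, 47, 46, 49, 45]
kiosk_3 = [39, 40, 38, 41, 37, 42, 40]

F, p = f_oneway(kiosk_1, kiosk_2, kiosk_3)

if p < 0.05:
    print("Sales differ significantly; keep the kiosk with the highest mean.")
else:
    print("No significant difference; choose the kiosk on other criteria.")
```

With these made-up numbers the second kiosk's mean is clearly higher, so the test comes out significant; with real data either branch could apply.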

Example 2: Comparison of contrasts of group means. The seven political affiliations are ordered from extremely liberal to extremely conservative, and linear contrast is used to test whether there is a non-zero upward trend in group means—i.e., whether there is a significant linear increase in mean age when considering groups ordered in the direction from liberal to conservative.

Example 3: Two-way analysis of variance. The number of product sales, in addition to the size of the store, is often affected by the location of the shelves with the product. This example contains weekly sales figures characterized by four shelf layouts and three store sizes. The results of the analysis show that both factors - the location of the shelves with the goods and the size of the store - affect the number of sales, but their interaction is not significant.

Example 4: One-way ANOVA: a randomized complete block design with two treatments. The influence on bread baking of all possible combinations of three fats and three dough leavening agents is studied. Four flour samples, taken from four different sources, served as the blocking factor. The significance of the fat-leavener interaction needs to be identified, after which various choices of contrasts make it possible to find out which combinations of factor levels differ.

Example 5: Model of a hierarchical (nested) plan with mixed effects. The influence of four randomly selected heads mounted in a machine tool on the deformation of manufactured glass cathode holders is studied. (The heads are built into the machine, so the same head cannot be used on different machines.) The head effect is treated as a random factor. ANOVA statistics show that there are no significant differences between machines, but there are indications that heads may differ. The difference between all the machines is not significant, but for two of them the difference between the types of heads is significant.

Example 6: Univariate repeated-measures analysis using a split-plot design. This experiment was conducted to determine the effect of an individual's anxiety rating on exam performance over four consecutive attempts. The data are organized so that they can be considered as groups of subsets of the entire data set ("the whole plot"). The effect of anxiety was not significant, while the effect of the attempt was significant.

List of methods

  • Models of factorial experiment. Examples: factors affecting the success of solving mathematical problems; factors influencing sales volumes.

The data consist of several series of observations (treatments), which are considered as realizations of independent samples. The initial hypothesis is that there is no difference between treatments, i.e., it is assumed that all observations can be considered as one sample from the total population:

  • One-factor parametric model: Scheffé's method.
  • One-factor non-parametric model [Lagutin M.B., 237]: Kruskal-Wallis criterion [Hollender M., Wolf D.A., 131], Jonkheer's criterion [Lagutin M.B., 245].
  • General case of a model with constant factors, Cochran's theorem [Afifi A., Eisen S., 234].
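The one-factor non-parametric model listed above is available directly in SciPy. A sketch of the Kruskal-Wallis test on illustrative data (the numbers are invented):

```python
from scipy.stats import kruskal

# Three independent samples of unequal size; no normality assumption
# is needed, the test works on ranks.
g1 = [2.9, 3.0, 2.5, 2.6, 3.2]
g2 = [3.8, 2.7, 4.0, 2.4]
g3 = [2.8, 3.4, 3.7, 2.2, 2.0]

H, p = kruskal(g1, g2, g3)
print(H, p)  # reject the no-difference hypothesis when p < alpha
```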

The data are two-fold repeated observations:

  • Two-factor non-parametric model: Friedman's criterion [Lapach, 203], Page's criterion [Lagutin M.B., 263]. Examples: comparison of the effectiveness of production methods, agricultural practices.
  • Two-factor nonparametric model for incomplete data
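The two-factor non-parametric model with complete blocks corresponds to SciPy's Friedman test. A sketch on invented data: three treatments, each measured on the same seven blocks, with the second treatment uniformly better:

```python
from scipy.stats import friedmanchisquare

# Each list holds one treatment's measurements across the same 7 blocks.
treatment_1 = [10, 12, 14, 11, 13, 12, 10]
treatment_2 = [14, 15, 17, 13, 16, 15, 14]
treatment_3 = [11, 13, 14, 12, 13, 13, 11]

chi2, p = friedmanchisquare(treatment_1, treatment_2, treatment_3)
print(chi2, p < 0.05)  # here treatment_2 dominates, so p < 0.05
```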

History

Where did the name analysis of variance come from? It may seem strange that a procedure for comparing means is called analysis of variance. In fact, this is because, when examining the statistical significance of the difference between the means of two (or several) groups, we are actually comparing (analyzing) the sample variances. The fundamental concept of analysis of variance was proposed by Fisher in the 1920s. Perhaps a more natural term would be sum-of-squares analysis or analysis of variation, but by tradition the term analysis of variance is used. Initially, analysis of variance was developed to process data obtained in specially designed experiments and was considered the only method that correctly explores causal relationships. The method was used to evaluate experiments in crop production. Later, the general scientific significance of analysis of variance for experiments in psychology, pedagogy, medicine, etc. became clear.

Literature

  1. Scheffé H. The Analysis of Variance. - M., 1980.
  2. Ahrens H., Läuter J. Multivariate Analysis of Variance.
  3. Kobzar A.I. Applied mathematical statistics. - M.: Fizmatlit, 2006.
  4. Lapach S. N., Chubenko A. V., Babich P. N. Statistics in science and business. - Kyiv: Morion, 2002.
  5. Lagutin M. B. Visual mathematical statistics. In two volumes. - M.: P-center, 2003.
  6. Afifi A., Eisen S. Statistical analysis: The computer-assisted approach.
  7. Hollender M., Wolf D.A. Nonparametric methods of statistics.


Analysis of variance

Introductory overview

In this section, we will review the basic methods, assumptions, and terminology of ANOVA.

Note that in the English-language literature this method is called Analysis of Variance. Therefore, for brevity, below we will sometimes use the term ANOVA (ANalysis Of VAriance) for ordinary analysis of variance and the term MANOVA for multivariate analysis of variance. In this section we will sequentially consider the main ideas of analysis of variance (ANOVA), analysis of covariance (ANCOVA), multivariate analysis of variance (MANOVA), and multivariate analysis of covariance (MANCOVA). After a brief discussion of the merits of contrast analysis and post hoc tests, we consider the assumptions on which the methods of analysis of variance are based. Towards the end of the section, the advantages of the multivariate approach to repeated-measures analysis over the traditional univariate approach are explained.

Key Ideas

The purpose of analysis of variance. The main purpose of analysis of variance is to study the significance of differences between means. The chapter Elementary Concepts of Statistics (Chapter 8) provides a brief introduction to statistical significance testing. If you are just comparing the means of two samples, analysis of variance gives the same result as the ordinary t-test for independent samples (if two independent groups of objects or observations are compared) or the t-test for dependent samples (if two variables are compared on the same set of objects or observations). If you are not familiar with these tests, we recommend the introductory overview of Chapter 9.

Where did the name analysis of variance come from? It may seem strange that a procedure for comparing means is called analysis of variance. In fact, this is because, when we examine the statistical significance of differences between means, we are actually analyzing variances.

Splitting the sum of squares

For a sample of size n, the sample variance is calculated as the sum of squared deviations from the sample mean divided by n - 1 (the sample size minus one). Thus, for a fixed sample size n, the variance is a function of the sum of squares (of deviations), denoted for brevity SS (from the English Sum of Squares). Analysis of variance is based on dividing (or splitting) this variance into parts. Consider the following dataset:

Group 1: 2, 3, 1 (mean 2)
Group 2: 6, 7, 5 (mean 6)

The means of the two groups are significantly different (2 and 6, respectively). The sum of squared deviations within each group is 2; adding them gives 4. If we now repeat the calculation ignoring group membership, that is, calculate SS from the combined mean of the two samples, we get 28. In other words, the variance (sum of squares) based on within-group variability yields a much smaller value than one calculated from the total variability (relative to the overall mean). The reason is obviously the significant difference between the means, and this difference between the means explains the difference between the sums of squares. Indeed, if we use the Analysis of Variance module, the following results are obtained:

As can be seen from the table, the total sum of squares SS = 28 is divided into the sum of squares due to intragroup variability (2 + 2 = 4; see the second row of the table) and the sum of squares due to the difference in mean values (28 - (2 + 2) = 24; see the first row of the table).

SS error and SS effect. Intragroup variability (SS error) is usually called the error variance. This means that it usually cannot be predicted or explained when an experiment is carried out. On the other hand, SS effect (intergroup variability) can be explained by the difference between the means of the studied groups. In other words, belonging to a certain group explains the intergroup variability, because we know that these groups have different means.

Significance testing. The main ideas of testing for statistical significance are discussed in the chapter Elementary Concepts of Statistics (Chapter 8). The same chapter explains why many tests use the ratio of explained to unexplained variance; analysis of variance itself is an example of this use. Significance testing in ANOVA is based on comparing the variance due to intergroup variability (called the mean square effect, or MS_effect) and the variance due to intragroup spread (called the mean square error, or MS_error). If the null hypothesis is true (the means of the two populations are equal), we can expect only a relatively small difference in the sample means, due to random variability. Therefore, under the null hypothesis, the intragroup variance practically coincides with the total variance calculated without regard to group membership. The two resulting variance estimates are compared using the F-test, which checks whether their ratio is significantly greater than 1. In the example above, the F-test shows that the difference between the means is statistically significant.
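The whole decomposition above can be reproduced numerically. The observations below are one set consistent with the figures quoted in the text (group means 2 and 6, within-group SS of 2 each, total SS of 28); NumPy and SciPy are assumed:

```python
import numpy as np
from scipy.stats import f_oneway

g1 = np.array([2.0, 3.0, 1.0])  # mean 2
g2 = np.array([6.0, 7.0, 5.0])  # mean 6
both = np.concatenate([g1, g2])

# Split SS_total into within-group and between-group parts.
ss_within = ((g1 - g1.mean())**2).sum() + ((g2 - g2.mean())**2).sum()
ss_total = ((both - both.mean())**2).sum()
ss_effect = ss_total - ss_within
print(ss_within, ss_total, ss_effect)  # 4.0 28.0 24.0

# F = MS_effect / MS_error = (24 / 1) / (4 / 4) = 24
F, p = f_oneway(g1, g2)
print(F)  # 24.0
```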

Basic logic of ANOVA. Summing up, the purpose of analysis of variance is to test the statistical significance of differences between means (for groups or variables). This is done by analyzing the variance, i.e., by splitting the total variance (variation) into parts, one of which is due to random error (intragroup variability) and the other to the differences between the mean values. The latter variance component is then used to assess the statistical significance of the difference between the means. If this difference is significant, the null hypothesis is rejected and the alternative hypothesis, that a difference between the means exists, is accepted.

Dependent and independent variables. Variables whose values are determined by measurements during an experiment (for example, a score on a test) are called dependent variables. Variables that can be manipulated in an experiment (for example, training methods or other criteria that allow observations to be divided into groups) are called factors or independent variables. These concepts are described in more detail in the chapter Elementary Concepts of Statistics (Chapter 8).

Multivariate analysis of variance

In the simple example above, you could immediately calculate an independent-samples t-test using the appropriate option of the Basic Statistics and Tables module. The results would, of course, coincide with the results of the analysis of variance. However, analysis of variance offers flexible and powerful technical means that can be used for much more complex studies.

Many factors. The world is inherently complex and multidimensional. Situations where some phenomenon is completely described by one variable are extremely rare. For example, if we are trying to learn how to grow large tomatoes, we should consider factors related to the genetic structure of the plants, the soil type, light, temperature, etc. Thus, a typical experiment deals with a large number of factors. The main reason why ANOVA is preferable to repeated comparison of two samples at different factor levels using the t-test is that analysis of variance is more efficient and, for small samples, more informative.

Factor management. Suppose that in the two-sample example discussed above we add one more factor, Gender. Let each group consist of 3 men and 3 women. The design of this experiment can be presented as a 2 by 2 table:

         Exp. Group 1   Exp. Group 2
Men           2              6
              3              7
              1              5
Mean          2              6
Women         4              8
              5              9
              3              7
Mean          4              8

Before doing the calculations, you can see that in this example the total variance has at least three sources:

(1) random error (within group variance),

(2) variability associated with membership in the experimental group, and

(3) variability due to the gender of the observed objects.

(Note that there is another possible source of variability, the interaction of factors, which we will discuss later.) What happens if we do not include gender as a factor in the analysis and compute the usual t-test? If we calculate the sums of squares ignoring gender (i.e., combining subjects of both sexes into one group when computing the within-group variance, which gives a sum of squares of SS = 10 for each group and a total sum of squares of SS = 10 + 10 = 20), we obtain a larger within-group variance than in the more accurate analysis with additional subgrouping by gender (where the within-group sums of squares each equal 2, and the total within-group sum of squares equals SS = 2 + 2 + 2 + 2 = 8). The difference arises because the mean for men is less than the mean for women, and this difference in means inflates the total within-group variability when gender is ignored. Controlling the error variance increases the sensitivity (power) of the test.
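The sums of squares quoted above (SS = 20 when gender is ignored versus SS = 8 with subgrouping) can be checked directly. A small NumPy sketch using the cell data from the 2 x 2 table:

```python
import numpy as np

# Cell data from the 2 x 2 table above: (gender, experimental group) -> scores.
cells = {
    ("men", 1): np.array([2, 3, 1]),   ("men", 2): np.array([6, 7, 5]),
    ("women", 1): np.array([4, 5, 3]), ("women", 2): np.array([8, 9, 7]),
}

def within_ss(groups):
    """Sum of squared deviations of each group around its own mean."""
    return sum(((g - g.mean()) ** 2).sum() for g in groups)

# Ignoring gender: pool both sexes within each experimental group.
pooled = [np.concatenate([cells[("men", k)], cells[("women", k)]]) for k in (1, 2)]
ss_pooled = within_ss(pooled)              # 10 + 10 = 20

# Subgrouping by gender: four cells, each contributing SS = 2.
ss_by_gender = within_ss(cells.values())   # 2 + 2 + 2 + 2 = 8

print(ss_pooled, ss_by_gender)  # 20.0 8.0
```

The smaller error sum of squares in the second analysis is precisely the gain in sensitivity described in the text.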

This example shows another advantage of analysis of variance over the conventional two-sample t-test. Analysis of variance allows you to study each factor while controlling for the values of the other factors. This is, in fact, the main reason for its greater statistical power (smaller sample sizes are required to obtain meaningful results). For this reason, analysis of variance, even on small samples, gives statistically more significant results than the simple t-test.

Interaction effects

There is another advantage of ANOVA over the conventional t-test: analysis of variance can detect interactions between factors and therefore allows more complex models to be studied. To illustrate, consider another example.

Main effects, pairwise (two-factor) interactions. Suppose there are two groups of students: students in the first group are motivated to complete the assigned tasks and are more purposeful, while the second group consists of lazier students. Let's divide each group randomly in half and give one half of each group a difficult task and the other half an easy one. We then measure how hard the students work on these tasks. The means for this (fictitious) study are shown in the table:

What conclusion can be drawn from these results? Can we conclude that (1) students work harder on a difficult task, and (2) motivated students work harder than lazy ones? Neither of these statements captures the systematic pattern of the means in the table. It would be more accurate to say that only motivated students work harder on complex tasks, while only lazy students work harder on easy tasks. In other words, the character of the students and the complexity of the task interact in their effect on the amount of effort expended. This is an example of a pairwise interaction between the students' character and the complexity of the task. Note that statements 1 and 2 describe main effects.

Interactions of higher orders. While pairwise interactions are relatively easy to explain, higher-order interactions are much harder. Let us imagine that in the example considered above one more factor, Gender, is introduced, and we obtain the following table of means:

What conclusions can now be drawn from the results obtained? Mean plots make it easy to interpret complex effects. The analysis of variance module allows you to build these graphs with almost one click.

The image in the graphs below represents the three-way interaction under study.

Looking at the graphs, we can tell that there is an interaction between the nature and difficulty of the test for women: motivated women work harder on a difficult task than on an easy one. In men, the same interaction is reversed. It can be seen that the description of the interaction between factors becomes more confusing.

A general way of describing interactions. In the general case, an interaction between factors is described as a change in one effect under the influence of another. In the example discussed above, the two-factor interaction can be described as a change in the main effect of the task-complexity factor under the influence of the student-character factor. For the interaction of the three factors from the previous paragraph, we can say that the interaction of two factors (task complexity and student character) changes under the influence of Gender. If an interaction of four factors is studied, we can say that the interaction of three factors changes under the influence of the fourth factor, i.e. there are different types of interactions at different levels of the fourth factor. In many fields, interactions of five or even more factors are not unusual.

Complex plans

Intergroup and intragroup plans (repeated measures plans)

When comparing two different groups, the t-test for independent samples (from the Basic statistics and tables module) is commonly used. When two variables are compared on the same set of objects (observations), the t-test for dependent samples is used. For analysis of variance it also matters whether the samples are dependent or not. If the same variables are measured repeatedly (under different conditions or at different times) for the same objects, one speaks of a repeated measures factor (also called a within-group factor, since the within-group sum of squares is calculated to evaluate its significance). If different groups of objects are compared (for example, men and women, or three strains of bacteria), the difference between the groups is described by a between-group factor. The methods for calculating the significance criteria for these two types of factors differ, but their general logic and interpretation are the same.

Inter- and intra-group plans. In many cases, the experiment requires including both a between-group factor and a repeated measures factor in the design. For example, the math skills of female and male students are measured (where Gender is a between-group factor) at the beginning and at the end of the semester. The two measurements of each student's skills form the within-group (repeated measures) factor. The interpretation of main effects and interactions is the same for between-group and repeated measures factors, and the two types of factors can obviously interact with each other (for example, women gain skills over the semester while men lose them).

Incomplete (nested) plans

In many cases, the interaction effect can be neglected. This occurs either when it is known that there is no interaction effect in the population, or when carrying out the full factorial design is impossible. For example, suppose the effect of four fuel additives on fuel consumption is being studied, with four cars and four drivers selected. A full factorial experiment requires that every combination of additive, driver, and car appear at least once. This requires at least 4 x 4 x 4 = 64 test groups, which is too time-consuming. Moreover, there is hardly any interaction between driver and fuel additive. With this in mind, one can use a Latin square design, which contains only 16 test groups (the four additives are designated by the letters A, B, C, and D):

Latin squares are described in most books on experimental design (e.g., Hays, 1988; Lindman, 1974; Milliken and Johnson, 1984; Winer, 1962) and will not be discussed in detail here. Note that Latin squares are incomplete designs: they do not include all combinations of factor levels. For example, driver 1 drives car 1 with additive A only, and driver 3 drives car 1 with additive C only. The levels of the additive factor (A, B, C, and D) are nested in the cells of the car x driver table, like eggs in a nest. This mnemonic is useful for understanding the nature of nested designs. The Analysis of Variance module provides simple ways to analyze designs of this type.
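As a sketch of the idea (the actual 16-cell assignment from the source table is not reproduced here), a 4 x 4 Latin square can be built cyclically and its defining property verified, namely that each additive appears exactly once in every row (driver) and every column (car):

```python
# Cyclic construction of a 4 x 4 Latin square for the four additives A-D.
letters = "ABCD"
square = [[letters[(i + j) % 4] for j in range(4)] for i in range(4)]

for row in square:
    print(" ".join(row))
# A B C D
# B C D A
# C D A B
# D A B C

# Check the Latin property: no repeats within any row or any column.
assert all(len(set(row)) == 4 for row in square)
assert all(len({square[i][j] for i in range(4)}) == 4 for j in range(4))
```

Because every additive meets every driver and every car exactly once, the main effect of each factor can still be estimated from only 16 of the 64 possible cells.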

Covariance analysis

Main idea

The chapter Key Ideas briefly discussed the idea of controlling for factors and how including additive factors can reduce the sum of squared errors and increase the statistical power of a design. All of this extends to variables with a continuous set of values. When such continuous variables are included as factors in a design, they are called covariates.

Fixed covariates

Suppose we are comparing the mathematical skills of two groups of students taught from two different textbooks. Suppose also that we have intelligence quotient (IQ) data for each student. We may assume that IQ is related to math skills and use this information. For each of the two groups of students, the correlation coefficient between IQ and math skills can be calculated. Using this correlation coefficient, the variance within the groups can be split into the share explained by the influence of IQ and the unexplained share (see also Elementary concepts of statistics (Chapter 8) and Basic statistics and tables (Chapter 9)). The remaining share of the variance is used in the analysis as the error variance. If there is a correlation between IQ and math skills, the error variance, SS/(n-1), can be substantially reduced.

Effect of covariates on the F-test. The F-test evaluates the statistical significance of the difference between the group means by computing the ratio of the between-group variance (MSeffect) to the error variance (MSerror). If MSerror decreases, for example when the IQ factor is taken into account, the value of F increases.
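A minimal numeric sketch of this effect, with invented skill and IQ values (the data are hypothetical, not from the text): fitting the group model with and without the covariate shows the error sum of squares shrinking, which is what drives MSerror down and F up.

```python
import numpy as np

# Hypothetical data: two textbook groups, with IQ as a fixed covariate.
skill = np.array([10, 12, 13, 13, 15, 17], dtype=float)
iq    = np.array([100, 110, 120, 98, 112, 121], dtype=float)
group = np.array([-1, -1, -1, 1, 1, 1], dtype=float)   # effect-coded group

def resid_ss(X, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

ones = np.ones_like(skill)
ss_anova  = resid_ss(np.column_stack([ones, group]), skill)      # error SS without covariate
ss_ancova = resid_ss(np.column_stack([ones, group, iq]), skill)  # error SS with IQ included

# Because IQ correlates with skill within the groups, the error SS shrinks,
# so MSerror drops and the F ratio for the group effect grows.
assert ss_ancova < ss_anova
print(round(float(ss_anova), 2), round(float(ss_ancova), 2))
```

The exact magnitude of the reduction depends on how strongly the covariate correlates with the dependent variable within the groups.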

Multiple covariates. The reasoning above for a single covariate (IQ) extends easily to several covariates. For example, in addition to IQ, one can include measures of motivation, spatial reasoning, etc. In place of the ordinary correlation coefficient, the multiple correlation coefficient is used.

When the value of the F-test decreases. Sometimes introducing covariates into the experimental design reduces the value of the F-test. This usually indicates that the covariates are correlated not only with the dependent variable (such as math skills) but also with the factors (such as the different textbooks). Suppose IQ is measured at the end of the semester, after the two groups of students have spent nearly a year studying from two different textbooks. Although the students were assigned to groups randomly, it may turn out that the difference between textbooks is so large that both IQ and math skills differ greatly between the groups. In this case, the covariates reduce not only the error variance but also the between-group variance. In other words, after controlling for the difference in IQ between groups, the difference in math skills will no longer be significant. Put differently, after "eliminating" the influence of IQ, the influence of the textbook on the development of mathematical skills is inadvertently excluded as well.

Adjusted means. When a covariate affects the between-group factor, one should calculate adjusted means, i.e., group means obtained after removing all covariate effects.

Interactions between covariates and factors. Just as interactions between factors are examined, interactions between covariates and factors can be examined. Suppose one of the textbooks is especially suitable for smart students, while the second textbook is boring for smart students and too difficult for less smart ones. The result is a positive correlation between IQ and learning outcome in the first group (the smarter the student, the better the result) and a zero or slightly negative correlation in the second group (the smarter the student, the less likely he or she is to acquire mathematical skills from the second textbook). Some studies discuss this situation as an example of a violation of the assumptions of analysis of covariance. However, since the Analysis of Variance module uses the most general methods of analysis of covariance, it is possible, in particular, to assess the statistical significance of the interaction between factors and covariates.

Variable covariates

While fixed covariates are discussed quite often in textbooks, variable covariates are much less frequently mentioned. Usually, when conducting experiments with repeated measurements, we are interested in differences in the measurements of the same quantities at different points in time. Namely, we are interested in the significance of these differences. If a covariate measurement is carried out at the same time as the dependent variable measurements, the correlation between the covariate and the dependent variable can be calculated.

For example, you can study interest in mathematics and math skills at the beginning and at the end of the semester. It would be interesting to check whether changes in interest in mathematics are correlated with changes in mathematical skills.

The Analysis of Variance module in STATISTICA automatically assesses the statistical significance of changes in covariates in those designs where this is possible.

Multivariate Designs: Multivariate ANOVA and Covariance Analysis

Intergroup plans

All examples considered earlier included only one dependent variable. When there are several dependent variables at the same time, only the complexity of the calculations increases, and the content and basic principles do not change.

For example, suppose a study of two different textbooks also examines students' success in both physics and mathematics. There are then two dependent variables, and we need to find out how the two different textbooks affect them simultaneously. For this, multivariate analysis of variance (MANOVA) can be used. Instead of the univariate F-test, a multivariate F-test (Wilks' lambda test) is used, based on comparing the error covariance matrix with the between-group covariance matrix.
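A sketch of how Wilks' lambda is formed from the two matrices, using invented (math, physics) scores for two groups: the statistic is det(E)/det(E + H), where E is the error (within-group) SSCP matrix and H the hypothesis (between-group) SSCP matrix.

```python
import numpy as np

# Hypothetical (math, physics) scores for two textbook groups.
g1 = np.array([[3, 4], [4, 6], [5, 5]], dtype=float)
g2 = np.array([[6, 5], [7, 7], [8, 6]], dtype=float)

def sscp(x):
    """Sum-of-squares-and-cross-products matrix around the group mean."""
    d = x - x.mean(axis=0)
    return d.T @ d

E = sscp(g1) + sscp(g2)                      # error (within-group) SSCP matrix
grand = np.vstack([g1, g2]).mean(axis=0)
H = sum(len(g) * np.outer(g.mean(axis=0) - grand, g.mean(axis=0) - grand)
        for g in (g1, g2))                   # hypothesis (between-group) SSCP matrix

# Wilks' lambda: values near 0 indicate a large between-group effect,
# values near 1 indicate no effect.
wilks = np.linalg.det(E) / np.linalg.det(E + H)
print(round(float(wilks), 4))  # 0.2222
```

The determinants play the role that scalar variances play in the univariate F-test, so the correlation between the two dependent variables enters the statistic through the off-diagonal elements.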

If the dependent variables are correlated with each other, this correlation should be taken into account when calculating the significance test. Obviously, if the same measurement is repeated twice, nothing new is obtained. If a measurement correlated with an existing one is added, some new information is gained, but the new variable also carries redundant information, which is reflected in the covariance between the variables.

Interpretation of results. If the overall multivariate criterion is significant, we can conclude that the corresponding effect (e.g., textbook type) is significant. However, further questions arise: does the type of textbook affect only math skills, only physics skills, or both? In fact, after obtaining a significant multivariate criterion, the univariate F-test is examined for each main effect or interaction. In other words, the dependent variables that contribute to the significance of the multivariate test are examined separately.

Plans with repeated measurements

If the mathematical and physical skills of students are measured at the beginning and at the end of the semester, these are repeated measurements. The study of significance tests in such designs is a logical extension of the univariate case. Note that multivariate ANOVA methods are also commonly used to examine the significance of univariate repeated measures factors with more than two levels. The corresponding applications will be discussed later in this part.

Summation of variable values ​​and multivariate analysis of variance

Even experienced users of univariate and multivariate ANOVA often get confused when they get different results when applying multivariate ANOVA to, say, three variables, and when applying univariate ANOVA to the sum of these three variables as a single variable.

The idea behind summing variables is that each variable contains a true component, which is what is being studied, plus random measurement error. When the values of the variables are averaged, the measurement error is therefore closer to 0 on average and the averaged values are more reliable. In this case, applying ANOVA to the sum of the variables is a reasonable and powerful technique. However, if the dependent variables are multivariate in nature, summing their values is inappropriate.
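The error-averaging argument can be illustrated by simulation (hypothetical normally distributed true scores and measurement errors): the variance of a single noisy measurement is roughly var(true) + var(error), while the variance of the mean of three measurements is roughly var(true) + var(error)/3.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Each observed score = a true value plus independent measurement error.
true_score = rng.normal(0.0, 1.0, size=n)
measurements = true_score[:, None] + rng.normal(0.0, 1.0, size=(n, 3))

single = measurements[:, 0]        # one noisy measurement per subject
averaged = measurements.mean(axis=1)  # mean of three noisy measurements

# Averaging shrinks the error component of the variance by a factor of 3.
assert averaged.var() < single.var()
print(round(float(single.var()), 2), round(float(averaged.var()), 2))
```

This is exactly why a summed (or averaged) score is more reliable when the components measure the same underlying quantity, and why the argument fails when they do not.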

For example, suppose the dependent variables consist of four measures of success in society, each characterizing a completely independent aspect of human activity (for example, professional success, business success, family well-being, etc.). Adding these variables together is like adding apples and oranges: their sum would not be a suitable univariate measure. Therefore, such data should be treated as multidimensional indicators in a multivariate analysis of variance.

Contrast analysis and post hoc tests

Why are individual sets of means compared?

Usually, hypotheses about experimental data are formulated not simply in terms of main effects or interactions. An example is the following hypothesis: a certain textbook improves mathematical skills only in male students, while another textbook is about equally effective for both sexes, but still less effective for men. One could predict that textbook effectiveness interacts with student gender. However, this prediction also concerns the nature of the interaction: a large difference between the sexes is expected for students using one textbook, and practically gender-independent results for students using the other. Hypotheses of this type are usually examined using contrast analysis.

Contrast Analysis

In short, contrast analysis allows us to evaluate the statistical significance of particular linear combinations of complex effects. Contrast analysis is the main and indispensable element of any complex ANOVA design. The Analysis of Variance module offers a variety of contrast-analysis options that allow any type of comparison of means to be specified and analyzed.

A posteriori comparisons

Sometimes processing an experiment reveals an unexpected effect. Although in most cases a creative researcher can explain any result, this provides no basis for further analysis or for obtaining predictive estimates. This is one of the problems addressed by post hoc criteria, that is, criteria that do not rely on a priori hypotheses. To illustrate, consider the following experiment. Suppose 100 cards contain the numbers from 1 to 10. Putting all the cards into a hat, we draw 5 cards at random 20 times and calculate the mean for each sample (the mean of the numbers written on the cards). Can we expect to find two samples whose means differ significantly? This is very plausible! By choosing the two samples with the largest and smallest means, one can obtain a difference in means very different from the difference between, say, the first two samples. This difference could be examined, for example, using contrast analysis. Without going into details, there are several so-called a posteriori criteria that are based on exactly the first scenario (taking the extreme means from 20 samples), i.e., these criteria are based on choosing the most different means in order to compare all the means in the design. These criteria are applied so as not to obtain an artifact purely by chance, for example, a significant difference between means where there is none. The Analysis of Variance module offers a wide range of such criteria. When unexpected results are encountered in an experiment involving several groups, a posteriori procedures are used to examine the statistical significance of the results obtained.

Sum of squares type I, II, III and IV

Multivariate regression and analysis of variance

There is a strong relationship between multivariate regression and analysis of variance. Both methods investigate a linear model. In short, almost all experimental designs can be studied using multivariate regression. Consider the following simple between-group 2 x 2 design.

DV    A    B   AxB
 3    1    1    1
 4    1    1    1
 4    1   -1   -1
 5    1   -1   -1
 6   -1    1   -1
 6   -1    1   -1
 3   -1   -1    1
 2   -1   -1    1

Columns A and B contain codes characterizing the levels of factors A and B, and column AxB contains the product of columns A and B. We can analyze these data using multivariate regression. The variable DV is defined as the dependent variable, and the variables A through AxB as independent variables. The significance tests for the regression coefficients will coincide with the analysis-of-variance tests for the significance of the main effects of factors A and B and the interaction effect AxB.
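This equivalence can be verified directly for the coded table above: an ordinary least-squares fit recovers the grand mean and the half-differences between factor-level means that ANOVA tests as main effects and interaction.

```python
import numpy as np

# The coded 2 x 2 design from the table above.
DV = np.array([3, 4, 4, 5, 6, 6, 3, 2], dtype=float)
A  = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
B  = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
AB = A * B                                   # interaction column

X = np.column_stack([np.ones_like(DV), A, B, AB])
beta, *_ = np.linalg.lstsq(X, DV, rcond=None)

# Intercept = grand mean; each other coefficient = half the difference
# between the means at the +1 and -1 levels of its column.
print(beta)  # coefficients: 4.125, -0.125, 0.625, -1.125
```

Testing whether each coefficient differs from zero is the regression counterpart of testing the corresponding main effect or interaction in ANOVA.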

Unbalanced and Balanced Plans

When the correlation matrix for all variables is calculated (for example, for the data shown above), it can be seen that the main effects of factors A and B and the interaction effect AxB are uncorrelated. This property of effects is called orthogonality. The effects A and B are said to be orthogonal, or independent of each other. If all effects in a design are mutually orthogonal, as in the example above, the design is said to be balanced.

Balanced designs have a "nice property": the calculations involved in analyzing them are very simple. All calculations reduce to computing correlations between the effects and the dependent variables. Since the effects are orthogonal, partial correlations (as in full multivariate regression) do not need to be calculated. However, in real life, designs are not always balanced.

Consider real data with an unequal number of observations in cells.

            Factor B
Factor A    B1         B2
A1          3          4, 5
A2          6, 6, 7    2

If we encode these data as above and calculate the correlation matrix for all variables, it turns out that the design factors are correlated with each other. The factors in the design are no longer orthogonal, and such designs are called unbalanced. Note that in this example, the correlation between the factors is entirely due to the difference in the frequencies of 1 and -1 in the columns of the data matrix. In other words, experimental designs with unequal cell sizes (more precisely, disproportionate sizes) will be unbalanced, which means that the main effects and interactions will be confounded. In this case, the statistical significance of the effects must be computed via the full multivariate regression. There are several strategies here.
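This can be seen by computing the correlation between the coded factor columns for the seven observations in the table above (cell counts 1, 2, 3, and 1):

```python
import numpy as np

# Effect codes for the unbalanced 2 x 2 table above (7 observations):
# 1 obs in A1B1, 2 in A1B2, 3 in A2B1, 1 in A2B2.
A = np.array([1, 1, 1, -1, -1, -1, -1], dtype=float)
B = np.array([1, -1, -1, 1, 1, 1, -1], dtype=float)

r = np.corrcoef(A, B)[0, 1]
print(round(float(r), 4))  # -0.4167

# Nonzero correlation between the factor columns: the design is unbalanced,
# so the main effects are no longer orthogonal and are partly confounded.
assert abs(r) > 0
```

With a balanced design (equal or proportional cell counts) the same computation would give exactly zero.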

Sum of squares type I, II, III and IV

Type I and Type III sums of squares. To study the significance of each factor in a multivariate model, one can calculate the partial correlation of each factor, given that all other factors are already included in the model. One can also enter the factors into the model stepwise, controlling for all factors already entered and ignoring all the rest. In general, this is the difference between Type III and Type I sums of squares (this terminology was introduced in SAS; see, for example, SAS, 1982; a detailed discussion can also be found in Searle, 1987, p. 461; Woodward, Bonett, and Brecht, 1990, p. 216; or Milliken and Johnson, 1984, p. 138).
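A sketch of the two strategies on the unbalanced 2 x 2 data above: Type I (sequential) sums of squares are obtained by comparing residual sums of squares of nested regressions in entry order, while the Type III value for a factor is obtained by entering that factor last.

```python
import numpy as np

# Unbalanced 2 x 2 data from the earlier table.
y  = np.array([3, 4, 5, 6, 6, 7, 2], dtype=float)
A  = np.array([1, 1, 1, -1, -1, -1, -1], dtype=float)
B  = np.array([1, -1, -1, 1, 1, 1, -1], dtype=float)
AB = A * B
ones = np.ones_like(y)

def rss(*cols):
    """Residual sum of squares after fitting the given columns by OLS."""
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

# Type I (sequential) SS: the RSS reduction when each effect is added
# to the model after the previously entered effects.
ss_A  = rss(ones) - rss(ones, A)
ss_B  = rss(ones, A) - rss(ones, A, B)
ss_AB = rss(ones, A, B) - rss(ones, A, B, AB)

# Type III SS for A: the RSS reduction when A enters the model last.
ss_A_type3 = rss(ones, B, AB) - rss(ones, A, B, AB)

# Sequential SS telescope to the total model sum of squares.
assert np.isclose(ss_A + ss_B + ss_AB, rss(ones) - rss(ones, A, B, AB))
print(float(ss_A), float(ss_A_type3))  # in an unbalanced design these generally differ
```

In a balanced design the two values for each factor would coincide, which is why the distinction only matters for unbalanced or incomplete designs.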

Type II sums of squares. The next, "intermediate" model-building strategy consists of: controlling for all main effects when examining the significance of a single main effect; controlling for all main effects and all pairwise interactions when examining the significance of a single pairwise interaction; controlling for all main effects, all pairwise interactions, and all three-factor interactions when examining a single three-factor interaction; and so on. The sums of squares for effects calculated in this way are called Type II sums of squares. Thus, Type II sums of squares control for all effects of the same order and below, ignoring all higher-order effects.

Type IV sums of squares. Finally, for some special designs with missing cells (incomplete designs), the so-called Type IV sums of squares can be calculated. This method will be discussed later in connection with incomplete designs (designs with missing cells).

Interpretation of the Type I, II, and III sums-of-squares hypotheses

Type III sums of squares are the easiest to interpret. Recall that Type III sums of squares examine each effect after controlling for all other effects. For example, after finding a statistically significant Type III effect for factor A in the Analysis of Variance module, we can say that there is a significant single effect of factor A after all other effects (factors) are introduced, and interpret this effect accordingly. Probably in 99% of all applications of analysis of variance, this type of test is of interest to the researcher. This type of sum of squares is computed in the Analysis of Variance module by default, regardless of whether the Regression Approach option is selected (the standard approaches adopted in the Analysis of Variance module are discussed below).

Significant effects obtained using Type I or Type II sums of squares are not as easy to interpret. They are best interpreted in the context of stepwise multivariate regression. If, using Type I sums of squares, the main effect of factor B was significant (after including factor A in the model, but before adding the A x B interaction), one can conclude that there is a significant main effect of factor B, provided there is no interaction between factors A and B. (If, using the Type III criterion, factor B also turned out to be significant, one can conclude that there is a significant main effect of factor B after all other factors and their interactions are introduced into the model.)

In terms of marginal means, the Type I and Type II hypotheses usually do not have a simple interpretation. In these cases, one cannot interpret the significance of the effects by considering only the marginal means; rather, the mean values involved relate to a complex hypothesis that combines means and sample sizes. For example, the Type II hypotheses for factor A in the simple 2 x 2 design example discussed earlier would be (see Woodward, Bonett, and Brecht, 1990, p. 219):

nij - the number of observations in cell ij

uij - the mean value in cell ij

n.j - the marginal number of observations

Without going into details (for more details see Milliken and Johnson, 1984, chapter 10), it is clear that these are not simple hypotheses, and in most cases none of them is of particular interest to the researcher. However, there are cases where Type I hypotheses may be of interest.

The default computational approach in the Analysis of Variance module

By default, if the Regression Approach option is not checked, the Analysis of Variance module uses the cell means model. It is characteristic of this model that the sums of squares for different effects are calculated from linear combinations of cell means. In a full factorial experiment, this gives sums of squares identical to the Type III sums of squares discussed earlier. However, using the Scheduled Comparisons option (in the Analysis of Variance Results window), the user can test hypotheses about any linear combination of weighted or unweighted cell means. Thus the user can test not only Type III hypotheses, but hypotheses of any type (including Type IV). This general approach is especially useful when examining designs with missing cells (so-called incomplete designs).

For full factorial designs, this approach is also useful when one wants to analyze weighted marginal means. For example, suppose that in the simple 2 x 2 design considered earlier we want to compare the marginal means for factor A weighted by the levels of factor B. This is useful when the distribution of observations across cells was not arranged by the experimenter but arose randomly, and this randomness is reflected in the distribution of the number of observations across the levels of factor B in the population.

For example, consider the age of widows as a factor. The sample of respondents is divided into two groups: younger than 40 and older than 40 (factor B). The second factor (factor A) in the design is whether or not the widows received social support from some agency (some widows were randomly selected to receive it, while the others served as controls). In this case, the age distribution of widows in the sample reflects the actual age distribution of widows in the population. Assessing the effectiveness of social support for widows of all ages therefore corresponds to the weighted mean over the two age groups (with weights proportional to the number of observations in each group).

Scheduled Comparisons

Note that the sum of the entered contrast coefficients is not necessarily equal to 0 (zero). If it is not, the program automatically adjusts the coefficients so that the corresponding hypotheses are not confounded with the overall mean.

To illustrate this, let's go back to the simple 2 x 2 design discussed earlier. Recall that the cell counts of this unbalanced design are 1, 2, 3, and 1. Suppose we want to compare the weighted marginal means for factor A (weighted by the frequencies of the factor B levels). We can enter the contrast coefficients:

Note that these coefficients do not sum to 0. The program will rescale the coefficients so that they sum to 0 while preserving their relative magnitudes, i.e.:

1/3 2/3 -3/4 -1/4

These contrasts will compare the weighted marginal means for factor A.
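Using the cell means and counts of the unbalanced example, one can verify that the adjusted coefficients 1/3, 2/3, -3/4, -1/4 indeed reproduce the difference between the frequency-weighted marginal means of factor A:

```python
import numpy as np

# Cell means and cell counts of the unbalanced 2 x 2 design above.
means  = np.array([3.0, 4.5, 6 + 1/3, 2.0])   # A1B1, A1B2, A2B1, A2B2
counts = np.array([1, 2, 3, 1])

# Adjusted contrast coefficients from the text.
c = np.array([1/3, 2/3, -3/4, -1/4])
contrast = c @ means

# The same number as the difference of frequency-weighted marginal means of A.
a1 = (counts[:2] * means[:2]).sum() / counts[:2].sum()   # weighted A1 mean = 4.0
a2 = (counts[2:] * means[2:]).sum() / counts[2:].sum()   # weighted A2 mean = 5.25
assert np.isclose(contrast, a1 - a2)
print(round(float(contrast), 4))  # -1.25
```

Each coefficient is a cell's share of its marginal count (1/3 and 2/3 for A1, 3/4 and 1/4 for A2), signed by the level of factor A, which is exactly what frequency weighting means.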

Hypotheses about the grand mean. The hypothesis that the unweighted grand mean is 0 can be tested using the coefficients:

The hypothesis that the weighted grand mean is 0 is tested with:

In these cases, the program does not adjust the contrast coefficients.

Analysis of plans with missing cells (incomplete plans)

Factorial designs containing empty cells (combinations of factor levels for which there are no observations) are called incomplete. In such designs, some factors are usually not orthogonal and some interactions cannot be calculated. There is no single best method for analyzing such designs.

Regression Approach

In some older programs that analyze ANOVA designs using multivariate regression, the factors in incomplete designs are by default dummy-coded in the usual way (as if the design were complete). A multivariate regression analysis is then performed on these dummy-coded factors. Unfortunately, this method leads to results that are very difficult, if not impossible, to interpret, because it is not clear how each effect contributes to the linear combination of means. Consider the following simple example.

            Factor B
Factor A    B1         B2
A1          3          4, 5
A2          6, 6, 7    Missing

If we fit a multivariate regression of the form Dependent variable = Constant + Factor A + Factor B, then the hypotheses about the significance of factors A and B, expressed as linear combinations of means, look like this:

Factor A: Cell A1,B1 = Cell A2,B1

Factor B: Cell A1,B1 = Cell A1,B2

This case is simple. In more complex designs, it is impossible to determine exactly what is actually being tested.

Cell means, the analysis of variance approach, and Type IV hypotheses

An approach that is recommended in the literature and seems preferable is to study meaningful (in terms of the research objectives) a priori hypotheses about the means observed in the cells of the design. A detailed discussion of this approach can be found in Dodge (1985), Heiberger (1989), Milliken and Johnson (1984), Searle (1987), or Woodward, Bonett, and Brecht (1990). Sums of squares associated with hypotheses about linear combinations of means in incomplete designs, which examine estimates of parts of the effects, are also called Type IV sums of squares.

Automatic generation of Type IV hypotheses. When multivariate designs have a complex pattern of missing cells, it is desirable to define orthogonal (independent) hypotheses whose investigation is equivalent to investigating main effects or interactions. Algorithmic (computational) strategies (based on the pseudo-inverse of the design matrix) have been developed to generate appropriate weights for such comparisons. Unfortunately, the resulting hypotheses are not uniquely determined: they depend on the order in which the effects were defined and are rarely easy to interpret. It is therefore recommended to study the pattern of missing cells carefully, then formulate the Type IV hypotheses that are most relevant to the objectives of the study, and then explore these hypotheses using the Scheduled Comparisons option in the results window. The easiest way to specify comparisons in this case is to enter a vector of contrasts for all factors together in the Scheduled Comparisons window. After calling the Scheduled Comparisons dialog box, all groups of the current design will be shown, and the omitted ones will be marked.

Missing Cells and Checking Specific Effects

There are several types of designs in which the location of the missing cells is not random but carefully planned, which allows the main effects to be analyzed simply, without affecting other effects. For example, when the required number of cells is not available, Latin square designs are often used to estimate the main effects of several factors with many levels. For example, a 4 x 4 x 4 x 4 factorial design requires 256 cells, whereas a Greco-Latin square allows the main effects to be estimated with only 16 cells in the design (the chapter Experiment Planning, Volume IV, contains a detailed description of such designs). Incomplete designs in which the main effects (and some interactions) can be estimated using simple linear combinations of means are called balanced incomplete designs.
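A Greco-Latin square is built by superimposing two orthogonal Latin squares. As a minimal sketch (my own illustration, not the program's algorithm): for a prime order n, the squares L1[i][j] = (i + j) mod n and L2[i][j] = (i + 2j) mod n are orthogonal. The text's 4 x 4 case exists but requires a finite-field construction, so the sketch uses n = 5:

```python
def graeco_latin(n, a=1, b=2):
    """Two Latin squares L1[i][j]=(i+a*j)%n, L2[i][j]=(i+b*j)%n.
    For prime n and distinct nonzero multipliers a, b they are orthogonal."""
    L1 = [[(i + a * j) % n for j in range(n)] for i in range(n)]
    L2 = [[(i + b * j) % n for j in range(n)] for i in range(n)]
    return L1, L2

n = 5
L1, L2 = graeco_latin(n)

# Orthogonality check: every (symbol1, symbol2) pair occurs exactly once,
# so n*n cells suffice to cross two n-level factors with the row and
# column factors.
pairs = {(L1[i][j], L2[i][j]) for i in range(n) for j in range(n)}
print(len(pairs) == n * n)   # True
```

Each of the 25 cells then carries a unique combination of four factor levels (row, column, Latin symbol, Greek symbol), which is what lets a Greco-Latin square estimate four main effects from so few cells.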

In balanced designs, the standard (default) method of generating contrasts (weights) for main effects and interactions will produce an analysis of variance table in which the sums of squares for the respective effects are not mixed with each other. The Specific Effects option of the results window will generate the missing contrasts by writing zeros into the missing cells of the design. Immediately after the Specific Effects option is requested for a hypothesis under study, a results table with the actual weights appears. Note that in a balanced design the sums of squares of the respective effects are computed only if those effects are orthogonal (independent) to all other main effects and interactions. Otherwise, use the Scheduled Comparisons option to explore meaningful comparisons between means.

Missing Cells and Combined Error Effects/Members

If the Regression approach option in the launch panel of the Analysis of Variance module is not selected, the cell means model is used when computing the sums of squares for the effects (the default setting). If the design is not balanced, then when non-orthogonal effects are combined (see the discussion of the Missing cells and specific effects option above), one can obtain sums of squares consisting of non-orthogonal (overlapping) components. Results obtained in this way are usually not interpretable. Therefore, one must be very careful when choosing and implementing complex incomplete experimental designs.

There are many books that discuss designs of different types in detail (Dodge, 1985; Heiberger, 1989; Lindman, 1974; Milliken and Johnson, 1984; Searle, 1987; Woodward and Bonett, 1990), but this kind of information is outside the scope of this textbook. However, later in this section we will show the analysis of various types of designs.

Assumptions and Assumption Violation Effects

Deviation from the assumption of normal distributions

Assumptions. It is assumed that the dependent variable is measured on a numerical scale and that it is normally distributed within each group. The Analysis of Variance module contains a wide range of graphs and statistics to check this assumption.

Violation effects. In general, the F test is very robust to deviations from normality (see Lindman, 1974, for detailed results). If the kurtosis is greater than 0, the value of the F statistic may become too small, and the null hypothesis is then accepted even though it may be false. The situation is reversed when the kurtosis is less than 0. The skewness of the distribution usually has little effect on the F statistic. If the number of observations in a cell is large enough, deviation from normality is not particularly important by virtue of the central limit theorem, according to which the distribution of the mean is close to normal regardless of the initial distribution. A detailed discussion of the robustness of the F statistic can be found in Box and Anderson (1955) or Lindman (1974).

Homogeneity of variance

Assumptions. It is assumed that the variances of the different groups of the design are equal. This assumption is called the assumption of homogeneity of variance. Recall that at the beginning of this section, when describing the calculation of the sum of squared errors, we summed within each group. If the variances of two groups differ from each other, then adding them is not very natural and does not give an estimate of the total within-group variance (since in this case no common variance exists at all). The ANOVA/MANOVA module contains a large set of statistical criteria for detecting deviations from the assumption of homogeneity of variance.
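One widely available criterion of this kind is Levene's test. A minimal sketch (the data are hypothetical, constructed so that the third group has a visibly larger spread; scipy is assumed):

```python
from scipy import stats

# Hypothetical groups: the third has a much larger spread than the others
g1 = [4.1, 4.5, 3.9, 4.2, 4.4]
g2 = [5.0, 4.8, 5.2, 4.9, 5.1]
g3 = [3.0, 7.5, 1.2, 8.8, 2.4]

# Levene's test: H0 = the group variances are equal
stat, p = stats.levene(g1, g2, g3)
print(p < 0.05)   # True here: a small p-value signals heterogeneity
```

A significant result means the pooled within-group variance is a questionable summary, exactly the situation the paragraph above warns about.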

Violation effects. Lindman (1974, p. 33) shows that the F test is quite robust to violations of the assumption of homogeneity of variance (heterogeneity of variance; see also Box, 1954a, 1954b; Hsu, 1938).

Special case: correlation of means and variances. There are cases in which the F statistic can mislead. This happens when the means in the cells of the design are correlated with the variances. The Analysis of Variance module allows you to plot scatterplots of the variance or standard deviation against the means to detect such a correlation. The reason this correlation is dangerous is as follows. Imagine that the design has 8 cells, 7 of which have almost the same mean, while in one cell the mean is much larger than the rest. The F test may then detect a statistically significant effect. But suppose that in the cell with the large mean the variance is also much larger than in the others, i.e. the means and variances in the cells are dependent (the larger the mean, the larger the variance). In this case the large mean is unreliable, since it may be caused by large variance in the data. However, the F statistic, based on the pooled within-cell variance, will capture the large mean, although criteria based on the variance in each cell would not consider all the differences between the means significant.
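The scatterplot check described above amounts to computing the correlation between cell means and cell standard deviations. A sketch with constructed data (all values hypothetical, built to mimic the 8-cell example in the text):

```python
import numpy as np

# Hypothetical design with 8 cells; one cell has both a much larger
# mean and a much larger variance, as in the example in the text.
cells = [
    [5.1, 4.9, 5.0, 5.2], [4.8, 5.1, 5.0, 4.9], [5.2, 5.0, 4.8, 5.1],
    [4.9, 5.0, 5.1, 5.0], [5.0, 4.8, 5.2, 5.0], [5.1, 5.0, 4.9, 5.0],
    [4.9, 5.2, 5.0, 4.9], [2.0, 14.0, 1.0, 15.0],   # outlier-driven cell
]
means = np.array([np.mean(c) for c in cells])
sds   = np.array([np.std(c, ddof=1) for c in cells])

# A strong mean-SD correlation warns that the large cell mean may be
# an artifact of large within-cell variance (e.g. outliers).
r = np.corrcoef(means, sds)[0, 1]
print(r > 0.9)   # True for this constructed example
```

In practice one would inspect the scatterplot rather than rely on a single correlation coefficient, but a value near 1 driven by one cell is precisely the warning sign discussed above.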

Data of this nature (a large mean together with a large variance) are often encountered when there are outliers. One or two outlying observations can strongly shift the mean and greatly increase the variance.

Homogeneity of variance and covariance

Assumptions. In multivariate designs with multivariate dependent measures, the homogeneity of variance assumption described earlier also applies. However, since there are several dependent variables, it is also required that their cross-correlations (covariances) be homogeneous across all cells of the design. The Analysis of Variance module offers various ways of testing these assumptions.

Violation effects. The multivariate analog of the F test is Wilks' lambda. Not much is known about the robustness of Wilks' lambda with respect to violations of the above assumptions. However, since the interpretation of the module's results is usually based on the significance of univariate effects (after the significance of the overall criterion has been established), the discussion of robustness concerns mainly univariate analysis of variance. Therefore, the significance of univariate effects should be examined carefully.

Special case: analysis of covariance. Particularly severe violations of the homogeneity of variance/covariance assumption can occur when covariates are included in the design. In particular, if the correlation between the covariates and the dependent measures differs across the cells of the design, misinterpretation of the results may follow. It should be remembered that analysis of covariance essentially performs a regression analysis within each cell in order to isolate the part of the variance that corresponds to the covariate. The homogeneity of variance/covariance assumption requires that this regression analysis be performed under the following constraint: all regression equations (slopes) are the same for all cells. If this does not hold, large errors may occur. The Analysis of Variance module has several special criteria for testing this assumption. It may be advisable to use these criteria to make sure that the regression equations for different cells are approximately the same.
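The homogeneity-of-slopes check can be sketched as a comparison of two nested regression models: one with a common slope for all cells and one with a separate slope per cell. This is my own illustration on simulated data (all names, sample sizes, and parameter values hypothetical), not the module's internal criterion:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical 2-cell design with covariate x; the slopes deliberately differ
n = 30
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y1 = 1.0 + 0.2 * x1 + rng.normal(scale=0.5, size=n)   # slope 0.2 in cell 1
y2 = 2.0 + 1.5 * x2 + rng.normal(scale=0.5, size=n)   # slope 1.5 in cell 2

def rss(X, y):
    """Residual sum of squares of a least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

g = np.r_[np.zeros(n), np.ones(n)]          # cell indicator
x, y = np.r_[x1, x2], np.r_[y1, y2]

# Restricted model: common slope; full model: a separate slope per cell
X_common = np.column_stack([np.ones(2 * n), g, x])
X_full   = np.column_stack([np.ones(2 * n), g, x, g * x])
rss0, rss1 = rss(X_common, y), rss(X_full, y)

# F test for the extra cell-by-covariate interaction term (df1=1, df2=2n-4)
F = (rss0 - rss1) / (rss1 / (2 * n - 4))
p = stats.f.sf(F, 1, 2 * n - 4)
print(p < 0.05)   # True: the slopes differ, so homogeneity is violated
```

A significant interaction term means the within-cell regressions are not parallel, which is exactly the constraint violation warned about above.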

Sphericity and compound symmetry: reasons for using a multivariate repeated measures approach in analysis of variance

In designs containing repeated measures factors with more than two levels, the application of univariate analysis of variance requires additional assumptions: the compound symmetry assumption and the sphericity assumption. These assumptions are rarely met (see below). Therefore, in recent years multivariate analysis of variance has gained popularity for such designs (both approaches are combined in the Analysis of Variance module).

Compound symmetry assumption. The compound symmetry assumption states that the variances (pooled within-group) and covariances (across groups) of the different repeated measures are homogeneous (the same). This is a sufficient condition for the univariate repeated measures F test to be valid (i.e., for the reported F values to be, on average, consistent with the F distribution). However, it is not a necessary condition.

Sphericity assumption. The sphericity assumption is a necessary and sufficient condition for the F test to be justified. It states that the variances of the differences between all pairs of repeated measures are equal. The nature of these assumptions, as well as the impact of their violations, is usually not well described in books on analysis of variance; it is described in the following paragraphs, which also show that the results of the univariate approach may differ from those of the multivariate approach and explain what this means.
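An informal check of sphericity follows directly from this definition: compute the variance of each pairwise difference between levels and see whether they are roughly equal. A sketch with made-up repeated measures data (10 subjects, 3 time points, values hypothetical):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)

# Hypothetical repeated measures: 10 subjects x 3 time points
# (independent columns plus a shift, so sphericity should roughly hold)
data = rng.normal(size=(10, 3)) + np.array([0.0, 0.5, 1.0])

# Sphericity requires equal variances of all pairwise level differences
diff_vars = [np.var(data[:, i] - data[:, j], ddof=1)
             for i, j in combinations(range(3), 2)]
print([round(v, 2) for v in diff_vars])
```

With only 10 subjects the three variances will fluctuate; a formal assessment would use a test such as Mauchly's, but strongly unequal difference variances are already a warning sign that the univariate F test may be invalid.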

The need for independent hypotheses. The general way of analyzing data in analysis of variance is model fitting. If, with respect to the model fitted to the data, there are some a priori hypotheses, then the variance is partitioned to test these hypotheses (criteria for main effects and interactions). Computationally, this approach generates a set of contrasts (a set of comparisons of means in the design). However, if the contrasts are not independent of each other, the partitioning of the variance becomes meaningless. For example, if two contrasts A and B are identical and the corresponding part of the variance is extracted, then the same part is extracted twice. It would be silly and pointless, say, to state two hypotheses: "the mean in cell 1 is higher than the mean in cell 2" and "the mean in cell 1 is higher than the mean in cell 2". So the hypotheses must be independent (orthogonal).

Independent hypotheses in repeated measurements. The general algorithm implemented in the Analysis of Variance module tries to generate independent (orthogonal) contrasts for each effect. For a repeated measures factor, these contrasts give rise to many hypotheses about differences between the levels of the factor. However, if these differences are correlated within groups, the resulting contrasts are no longer independent. For example, in a course in which learners are measured three times in one semester, it may happen that the change between the 1st and 2nd measurements is negatively correlated with the change between the 2nd and 3rd measurements: those who mastered most of the material between the 1st and 2nd measurements master a smaller part in the time between the 2nd and 3rd measurements. In fact, in most cases where analysis of variance is applied to repeated measurements, it can be assumed that the changes between levels are correlated across subjects. When this happens, the compound symmetry and sphericity assumptions do not hold and independent contrasts cannot be computed.

The impact of violations and ways to correct them. When the compound symmetry or sphericity assumptions are not met, analysis of variance can produce erroneous results. Before multivariate procedures were sufficiently developed, several corrections were proposed to compensate for violations of these assumptions (see, for example, Greenhouse and Geisser, 1959, and Huynh and Feldt, 1970). These methods are still widely used today (which is why they are presented in the Analysis of Variance module).
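The Greenhouse-Geisser correction works by shrinking the degrees of freedom of the F test by a factor epsilon computed from the covariance matrix of the repeated measures. A minimal sketch of the epsilon computation (my own implementation of the standard formula via the double-centered covariance matrix; the data are hypothetical):

```python
import numpy as np

def gg_epsilon(data):
    """Greenhouse-Geisser epsilon for an (n subjects x k levels) array.

    epsilon = 1 under perfect sphericity; its lower bound is 1/(k-1).
    The F-test degrees of freedom are multiplied by epsilon.
    """
    k = data.shape[1]
    S = np.cov(data, rowvar=False)          # k x k sample covariance
    C = np.eye(k) - np.ones((k, k)) / k     # centering matrix
    Sc = C @ S @ C                          # double-centered covariance
    return np.trace(Sc) ** 2 / ((k - 1) * np.sum(Sc ** 2))

rng = np.random.default_rng(3)
data = rng.normal(size=(20, 4))             # i.i.d. columns: sphericity holds
eps = gg_epsilon(data)
print(1 / 3 <= eps <= 1.0)                  # True: epsilon lies in [1/(k-1), 1]
```

Values of epsilon well below 1 indicate a sphericity violation; correlated difference scores push epsilon toward its lower bound, and the corrected test becomes correspondingly more conservative.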

Multivariate analysis of variance approach to repeated measurements. In general, the problems of compound symmetry and sphericity stem from the fact that the sets of contrasts involved in studying the effects of repeated measures factors (with more than 2 levels) are not independent of each other. However, they do not have to be independent if a multivariate criterion is used to simultaneously test the statistical significance of two or more repeated measures factor contrasts. This is the reason why multivariate analysis of variance methods are increasingly used to test the significance of univariate repeated measures factors with more than 2 levels. This approach is widely used because it generally requires neither the compound symmetry assumption nor the sphericity assumption.

Cases in which the multivariate analysis of variance approach cannot be used. There are designs in which the multivariate analysis of variance approach cannot be applied. These are usually cases with a small number of subjects and many levels in the repeated measures factors, so that there are too few observations to perform a multivariate analysis. For example, suppose there are 12 subjects and p = 4 repeated measures factors, each with k = 3 levels. Then the interaction of the 4 factors will "expend" (k - 1)^p = 2^4 = 16 degrees of freedom. However, there are only 12 subjects, so a multivariate test cannot be performed in this example. The Analysis of Variance module detects such situations on its own and computes only the univariate criteria.

Differences between univariate and multivariate results. If the study includes a large number of repeated measures, there may be cases where the univariate repeated measures ANOVA approach yields results very different from those obtained with the multivariate approach. This means that the differences between the levels of the corresponding repeated measures are correlated across subjects. Sometimes this fact is of independent interest.

Multivariate analysis of variance and structural equation modeling

In recent years, structural equation modeling has become popular as an alternative to multivariate analysis of variance (see, for example, Bagozzi and Yi, 1989; Bagozzi, Yi, and Singh, 1991; Cole, Maxwell, Arvey, and Salas, 1993). This approach allows one to test hypotheses not only about the means in different groups, but also about the correlation matrices of the dependent variables. For example, one can relax the assumptions about homogeneity of variance and covariance and explicitly include variance and covariance errors in the model for each group. The STATISTICA Structural Equation Modeling (SEPATH) module (see Volume III) allows such an analysis.

Analysis of variance

1. The concept of analysis of variance

Analysis of variance is the analysis of the variability of a trait under the influence of controlled variable factors. In the foreign literature, this method is often referred to as ANOVA (Analysis of Variance).

The task of analysis of variance is to separate variability of different kinds from the overall variability of the trait:

a) variability due to the action of each of the studied independent variables;

b) variability due to the interaction of the studied independent variables;

c) random variation due to all other unknown variables.

The variability due to the action of the studied variables and their interaction is compared with the random variability. The indicator of this ratio is Fisher's F test.

The formula for the F criterion includes estimates of variances, that is, parameters of the distribution of the trait; therefore the F criterion is a parametric criterion.

The more the variability of the trait is due to the studied variables (factors) or their interaction, the higher the empirical values of the F criterion.

The null hypothesis in analysis of variance states that the mean values of the studied effective trait are the same in all gradations.

The alternative hypothesis states that the mean values of the effective trait differ across the gradations of the studied factor.

Analysis of variance allows us to establish that the trait changes, but does not indicate the direction of these changes.

Let us begin the analysis of variance with the simplest case, when the action of only one variable (a single factor) is studied.

2. One-way analysis of variance for unrelated samples

2.1. Purpose of the method

The method of one-way analysis of variance is used when changes in the effective trait are studied under the influence of changing conditions or gradations of a factor. In this version of the method, each gradation of the factor is represented by a different sample of subjects. There must be at least three gradations of the factor. (There may be two gradations, but in that case nonlinear dependencies cannot be established, and it seems more reasonable to use simpler methods.)

A non-parametric variant of this type of analysis is the Kruskal-Wallis H test.
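The Kruskal-Wallis H test is available in scipy. A minimal sketch on made-up scores (data and group labels are hypothetical, chosen only to show the call):

```python
from scipy import stats

# Hypothetical scores under three gradations of a factor
low    = [4, 5, 3, 4, 6, 5]
medium = [6, 7, 6, 8, 7, 6]
high   = [9, 8, 10, 9, 11, 10]

# Kruskal-Wallis H test: a rank-based (non-parametric) analog of
# one-way ANOVA; it does not assume normally distributed traits
H, p = stats.kruskal(low, medium, high)
print(p < 0.05)   # True here: the three groups clearly differ
```

Because the test works on ranks, it is the natural fallback when the normality limitation discussed below cannot be met.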

Hypotheses

H 0: Differences between factor gradations (different conditions) are no more pronounced than random differences within each group.

H 1: Differences between factor gradations (different conditions) are more pronounced than random differences within each group.

2.2. Limitations of one-way analysis of variance for unrelated samples

1. One-way analysis of variance requires at least three gradations of the factor and at least two subjects in each gradation.

2. The resultant trait must be normally distributed in the study sample.

True, it is usually not specified whether this refers to the distribution of the trait in the entire surveyed sample or in the part of it that makes up the dispersion complex.

3. An example of solving a problem by the method of one-way analysis of variance for unrelated samples:

Three different groups of six subjects each received lists of ten words. Words were presented to the first group at a low rate (1 word per 5 seconds), to the second group at an average rate (1 word per 2 seconds), and to the third group at a high rate (1 word per second). Recall performance was predicted to depend on the speed of word presentation. The results are presented in Table 1.

Table 1. Number of words reproduced

(Columns: subject number, low speed, average speed, high speed; the final row gives the totals. The data values themselves were not preserved in this copy of the table.)

H 0: Differences in the volume of word reproduction between groups are no more pronounced than random differences within each group.

H 1: Differences in the volume of word reproduction between groups are more pronounced than random differences within each group. Using the experimental values presented in Table 1, we calculate some quantities that will be needed to compute the F criterion.

The calculation of the main quantities for one-way analysis of variance is presented in the table:

Table 2

Table 3

Sequence of Operations in One-Way ANOVA for Unrelated Samples

The designation SS, used frequently in this and subsequent tables, is an abbreviation for "sum of squares". This abbreviation is the one most often used in translated sources.

SS fact denotes the variability of the trait due to the action of the studied factor;

SS total denotes the overall variability of the trait;

SS random denotes the variability due to unaccounted factors, the "random" or "residual" variability.

MS stands for "mean square", the sum of squares divided by its degrees of freedom, i.e. the average value of the corresponding SS.

df is the number of degrees of freedom, which we denoted by the Greek letter v when considering nonparametric criteria.
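The whole sequence of operations can be sketched numerically. Since the original data in Table 1 were not preserved, the sketch uses hypothetical recall scores for 3 groups of 6 subjects; the decomposition SS total = SS fact + SS random and the resulting F ratio are what the tables above tabulate:

```python
import numpy as np

# Hypothetical data: 3 groups x 6 subjects, scores on a 10-word list
groups = np.array([
    [8, 7, 9, 5, 6, 8],    # low presentation speed
    [7, 6, 5, 6, 7, 5],    # average speed
    [4, 5, 3, 6, 2, 4],    # high speed
], dtype=float)

n_groups, n_per = groups.shape
grand_mean = groups.mean()

# SS_total = SS_fact (between groups) + SS_random (within groups)
ss_total  = ((groups - grand_mean) ** 2).sum()
ss_fact   = n_per * ((groups.mean(axis=1) - grand_mean) ** 2).sum()
ss_random = ((groups - groups.mean(axis=1, keepdims=True)) ** 2).sum()

# Degrees of freedom, mean squares (MS), and the F ratio
df_fact, df_random = n_groups - 1, n_groups * (n_per - 1)
ms_fact, ms_random = ss_fact / df_fact, ss_random / df_random
F = ms_fact / ms_random

print(round(ss_fact + ss_random, 6) == round(ss_total, 6))   # True
```

The empirical F is then compared with the critical value of the F distribution with (df_fact, df_random) degrees of freedom to decide between H 0 and H 1.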

Conclusion: H 0 is rejected, H 1 is accepted. Differences in the volume of word reproduction between the groups are more pronounced than random differences within each group (α = 0.05). Thus, the speed of word presentation affects the volume of their reproduction.

An example of solving the problem in Excel is presented below:

Initial data:

Using the menu command Tools -> Data Analysis -> Anova: Single Factor, we obtain the following results:
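The same analysis can be run outside Excel. A minimal Python equivalent (the scores are hypothetical stand-ins, since the original table data were not preserved in this copy; scipy is assumed):

```python
from scipy import stats

# Hypothetical recall scores for three unrelated groups of six subjects
low    = [8, 7, 9, 5, 6, 8]
medium = [7, 6, 5, 6, 7, 5]
high   = [4, 5, 3, 6, 2, 4]

# One-way ANOVA, the analog of Excel's "Anova: Single Factor" tool:
# returns the F statistic and the p-value for H0 (equal group means)
F, p = stats.f_oneway(low, medium, high)
print(p < 0.05)   # True for these data: H0 is rejected at alpha = 0.05
```

The returned F and p play the same roles as the "F" and "P-value" columns of the Excel output table.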