Diagnostics and Remedial Measures

The interpretation of data based on analysis of variance (ANOVA) is valid only when the following assumptions are satisfied:
1. Additive Effects: Treatment effects and block (environmental) effects are additive.
2. Independence of errors: Experimental errors are independent.
3. Homogeneity of Variances: Errors have common variance.
4. Normal Distribution: Errors follow a normal distribution.
The statistical tests t, F, z, etc. are likewise valid only under the assumptions of independence and normality of errors. Departures from these assumptions invalidate interpretations based on these techniques. It is therefore necessary to detect such departures and apply appropriate remedial measures.
• Independence of errors means that the error of one observation is not related to, and does not depend upon, that of another. This assumption is usually assured by the use of a proper randomization procedure. However, if there is any systematic pattern in the arrangement of treatments from one replication to another, the errors may not be independent. This may be handled by using nearest neighbour methods in the analysis of experimental data.
• The assumption of additive effects can be defined and detected in the following manner:

Additive Effects: The effects of two factors, say, treatment and replication, are said to be additive if the effect of one factor remains constant over all the levels of the other factor. A hypothetical set of data from a randomized complete block (RCB) design with 2 treatments and 2 replications, showing additive effects, is given below:
Treatment                   Replication I    Replication II    Replication Effect (I - II)
A                           190              125               65
B                           170              105               65
Treatment Effect (A - B)    20               20
Here, the treatment effect is equal to 20 in both replications and the replication effect is 65 for both treatments.
When the effect of one factor is not constant over all the levels of the other factor, the effects are said to be non-additive.
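The additivity check described above can be sketched in a few lines of Python. The first table holds the values given in the text; the second table, `multiplicative`, is a hypothetical example added here for contrast, and the function names are illustrative only. Real data contain experimental error, so exact equality of effects is not expected in practice; this sketch only illustrates the definition.

```python
# Cell values from the table in the text: outer keys are treatments, inner keys are replications.
data = {"A": {"I": 190, "II": 125}, "B": {"I": 170, "II": 105}}

# Hypothetical non-additive (multiplicative) table, not from the text:
# treatment A is 1.2 times treatment B in each replication.
multiplicative = {"A": {"I": 180, "II": 120}, "B": {"I": 150, "II": 100}}


def treatment_effects(table, t1, t2):
    """Difference t1 - t2 within each replication."""
    return {rep: table[t1][rep] - table[t2][rep] for rep in table[t1]}


def replication_effects(table, r1, r2):
    """Difference r1 - r2 within each treatment."""
    return {trt: reps[r1] - reps[r2] for trt, reps in table.items()}


def is_additive(table, t1, t2):
    """Additive if the treatment difference is identical in every replication."""
    return len(set(treatment_effects(table, t1, t2).values())) == 1


print(treatment_effects(data, "A", "B"))     # {'I': 20, 'II': 20}
print(replication_effects(data, "I", "II"))  # {'A': 65, 'B': 65}
print(is_additive(data, "A", "B"))           # True
print(treatment_effects(multiplicative, "A", "B"))  # {'I': 30, 'II': 20}
print(is_additive(multiplicative, "A", "B"))        # False
```

In the hypothetical multiplicative table the treatment effect changes from 30 to 20 between replications, which is what non-additivity looks like on the original scale.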

Normality of Errors: The assumptions of homogeneity of variances and normality are generally violated together. To test the validity of normality of errors for the character under study, one can use the Normal Probability Plot, the Anderson-Darling test, D'Agostino's test, the Shapiro-Wilk test, the Ryan-Joiner test, the Kolmogorov-Smirnov test, etc. In general, moderate departures from normality are of little concern in fixed effects ANOVA, since the F-test is only slightly affected; in random effects models, however, inference is more severely affected by non-normality. Significant deviations of the errors from normality make the inferences invalid. So, before analyzing the data, it is necessary to transform the data to a scale on which they follow a normal distribution.
In data from designed field experiments, we do not use the original observations directly for testing normality or homogeneity, because they contain the treatment effects and other effects such as block, row, column, etc. These effects must be eliminated from the data before testing the assumptions of normality and homogeneity of variances. To eliminate them, we fit the model corresponding to the design adopted and estimate the residuals. These residuals are then used for testing normality. In other words, we test the null hypothesis H0: errors are normally distributed against the alternative hypothesis H1: errors are not normally distributed.
In SAS and SPSS, the commonly used tests are the Shapiro-Wilk test and the Kolmogorov-Smirnov test. MINITAB uses three tests, viz. Anderson-Darling, Ryan-Joiner and Kolmogorov-Smirnov, for testing the normality of data.
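The residual step can be illustrated with a minimal Python sketch using only the standard library. The yields below are hypothetical, and the function names are illustrative. For an RCB design the residual of each cell is the observation minus its treatment mean, minus its block mean, plus the grand mean. As a simple numerical companion to the normal probability plot, the sketch also computes the correlation between the ordered residuals and approximate normal scores, which is the kind of quantity on which the Ryan-Joiner test is based. The plotting positions (i - 0.375)/(n + 0.25) are an assumption of this sketch, and no critical values are supplied; a formal test should be taken from a statistical package.

```python
from statistics import NormalDist, mean

# Hypothetical yields (not from the text): rows = treatments, columns = blocks.
yields = [
    [42.1, 39.8, 44.0, 40.5],
    [47.3, 44.9, 48.8, 46.2],
    [51.0, 47.5, 52.9, 49.1],
]


def rcb_residuals(y):
    """Residual e_ij = y_ij - treatment mean_i - block mean_j + grand mean."""
    t, b = len(y), len(y[0])
    grand = mean([v for row in y for v in row])
    trt_means = [mean(row) for row in y]
    blk_means = [mean([y[i][j] for i in range(t)]) for j in range(b)]
    return [
        [y[i][j] - trt_means[i] - blk_means[j] + grand for j in range(b)]
        for i in range(t)
    ]


def probability_plot_correlation(values):
    """Correlation between ordered values and approximate normal scores."""
    x = sorted(values)
    n = len(x)
    # Assumed plotting positions: (i - 0.375) / (n + 0.25).
    q = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mx, mq = mean(x), mean(q)
    sxq = sum((a - mx) * (c - mq) for a, c in zip(x, q))
    sxx = sum((a - mx) ** 2 for a in x)
    sqq = sum((c - mq) ** 2 for c in q)
    return sxq / (sxx * sqq) ** 0.5


residuals = rcb_residuals(yields)
flat = [e for row in residuals for e in row]
r = probability_plot_correlation(flat)
print("probability plot correlation:", round(r, 3))
```

A correlation close to 1 indicates that the ordered residuals fall near a straight line on the normal probability plot; markedly smaller values point towards non-normality.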

Homogeneity of Error Variances: A crude method for detecting heterogeneity of variances is based on scatter plots, such as treatment means against variances or ranges of observations or errors, or residuals against fitted values.
Based on these scatter plots, the heterogeneity of variances can be classified into two types:
1. Where the variance is functionally related to mean.
2. Where there is no functional relationship between the variance and the mean.
The scatter diagram of means and variances of observations for each treatment across the replications gives only a preliminary idea about homogeneity of error variances. Statistically, the homogeneity of error variances is tested using Bartlett's test for normally distributed errors and Levene's test for non-normal errors.
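The two test statistics named above can be sketched in plain Python as follows. The three groups are hypothetical data in which the spread grows with the mean, and the function names are illustrative. The sketch computes the statistics only: Bartlett's statistic is referred to a chi-square distribution with k - 1 degrees of freedom and Levene's statistic to an F distribution with (k - 1, N - k) degrees of freedom, so p-values would normally be obtained from a statistical package. The Levene version shown here uses absolute deviations from the group means, which is one of several variants.

```python
from math import log
from statistics import mean, variance


def bartlett_statistic(groups):
    """Bartlett's statistic for equality of k group variances."""
    k = len(groups)
    n = [len(g) for g in groups]
    total = sum(n)
    s2 = [variance(g) for g in groups]  # sample variances (divisor n_i - 1)
    pooled = sum((ni - 1) * v for ni, v in zip(n, s2)) / (total - k)
    numerator = (total - k) * log(pooled) - sum(
        (ni - 1) * log(v) for ni, v in zip(n, s2)
    )
    correction = 1 + (sum(1 / (ni - 1) for ni in n) - 1 / (total - k)) / (3 * (k - 1))
    return numerator / correction


def levene_statistic(groups):
    """Levene's statistic: one-way ANOVA F on absolute deviations from group means."""
    z = [[abs(v - mean(g)) for v in g] for g in groups]
    k = len(z)
    total = sum(len(g) for g in z)
    z_means = [mean(g) for g in z]
    z_grand = mean([v for g in z for v in g])
    between = sum(len(g) * (m - z_grand) ** 2 for g, m in zip(z, z_means))
    within = sum((v - m) ** 2 for g, m in zip(z, z_means) for v in g)
    return (total - k) / (k - 1) * between / within


# Hypothetical observations for three treatments (not from the text).
low = [20.1, 19.8, 20.3, 20.0, 19.9]
mid = [35.2, 33.9, 36.4, 34.1, 35.8]
high = [60.5, 55.2, 66.1, 58.0, 63.7]

print("Bartlett statistic:", round(bartlett_statistic([low, mid, high]), 2))
print("Levene statistic:", round(levene_statistic([low, mid, high]), 2))
```

With three groups, Bartlett's statistic would be compared with 5.991, the 5% critical value of the chi-square distribution with 2 degrees of freedom.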

Remedial Measures: Data transformation is the most appropriate remedial measure when the variances are heterogeneous and are some function of the means. With this technique, the original data are converted to a new scale, resulting in a new data set that is expected to satisfy the homogeneity of variances. Because a common transformation is applied to all observations, the comparative values between treatments are not altered and comparisons between them remain valid.
Error partitioning is the remedial measure for heterogeneity that usually occurs in experiments where, due to the nature of the treatments tested, some treatments have errors that are substantially higher (or lower) than others.
Here, we shall concentrate on those situations where the character under study is non-normal and the variances are heterogeneous. Depending upon the functional relationship between variances and means, a suitable transformation is adopted. The transformed variate should satisfy the following:
1. The variances of the transformed variate should be unaffected by changes in the means. Such a transformation is called a variance stabilizing transformation.
2. It should be normally distributed.
3. It should be one for which effects are linear and additive.
4. The transformed scale should be one on which the arithmetic average from the sample is an efficient estimate of the true mean.
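The first requirement can be made concrete with a first-order (delta method) approximation. The sketch below is a standard textbook argument rather than something stated in this text; the symbols g (the transformation), \mu (the mean), \phi (the variance as a function of the mean) and c (a constant) are notation introduced here for illustration.

```latex
\[
\operatorname{Var}(Y) = \phi(\mu)
\quad\Longrightarrow\quad
\operatorname{Var}\{g(Y)\} \approx \{g'(\mu)\}^{2}\,\phi(\mu).
\]
\[
\{g'(\mu)\}^{2}\,\phi(\mu) = c^{2}
\quad\Longrightarrow\quad
g(\mu) = c \int \frac{d\mu}{\sqrt{\phi(\mu)}}.
\]
```

Under this approximation, a variance proportional to the mean leads to a transformation proportional to the square root of Y, and a standard deviation proportional to the mean leads to a transformation proportional to log Y.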
The following three transformations are the most commonly used.