Results of Statistical Analysis

Results of statistical analyses can indicate the precision of the results, give further description of the data or demonstrate the statistical significance of comparisons. Where statistical significance is referred to in text, the reference should be included in such a way as to minimize disruption to the flow of the text. Significance probabilities can either be presented by reference to conventional levels, e.g. (P < 0.05) or, more informatively, by stating the exact probability, e.g. (P = 0.023). An alternative to including a large number of statements about significance is to include an overall covering sentence at the beginning of the results section, or some other suitable position. An example of such a sentence is:- 'All treatment differences referred to in the results are statistically significant at least at the 5% level unless otherwise stated.'
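The two ways of quoting a significance probability can be sketched as a small formatting helper. This is only an illustration: the function name `format_p`, the choice of thresholds (0.05, 0.01, 0.001) and the decision to report very small values as 'P < 0.001' are assumptions, not part of the guideline.

```python
def format_p(p, exact=True):
    """Format a significance probability for inclusion in text.

    exact=True  -> the exact probability, e.g. 'P = 0.023'
    exact=False -> the nearest conventional level, e.g. 'P < 0.05'
    """
    if exact:
        # Assumption: values below 0.001 are reported as a bound,
        # because three decimal places would otherwise print 0.000.
        return "P < 0.001" if p < 0.001 else f"P = {p:.3f}"
    # Assumed conventional levels, smallest first.
    for level in (0.001, 0.01, 0.05):
        if p < level:
            return f"P < {level}"
    return "P > 0.05"


print(format_p(0.023))               # exact form
print(format_p(0.023, exact=False))  # conventional level
```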

Descriptive Statistics

When simply describing a set of data with summary statistics, useful statistics to present are the mean, the number of observations and a measure of the variation or "scatter" of the observations, as well as the units of measurement. The range and the standard deviation (SD) are useful measures of the variation in the data. The standard error (SE) is not relevant in this context, since it measures the precision with which the mean of the data estimates the mean of a larger population. If there are a large number of variables to be described, the means, SDs etc. should be presented in a table. However, if there are only one or two variables, these results can be included in the text.

For example:- 'The initial weights of 48 ewes in the study had a mean of 34.7 kg and ranged from 29.2 to 38.6 kg.' or

'The mean initial weight of ewes in the study was 34.7 kg (n = 48, SD = 2.61).'
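A minimal sketch of how such a summary might be produced, using only the Python standard library. The weights below are invented for illustration and are not the data from the ewe study; the function names are likewise hypothetical.

```python
import statistics


def describe(values, unit):
    """Return the summary statistics recommended for descriptive text."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (divisor n - 1)
        "min": min(values),
        "max": max(values),
        "unit": unit,
    }


def summary_sentence(s):
    """Build a one-line description: mean, n, SD and range, with units."""
    return (
        f"mean {s['mean']:.1f} {s['unit']} "
        f"(n = {s['n']}, SD = {s['sd']:.2f}, "
        f"range {s['min']:.1f} to {s['max']:.1f} {s['unit']})"
    )


# Hypothetical initial weights (kg), made up for this example only.
weights = [29.2, 33.5, 34.7, 36.1, 38.6]
print(summary_sentence(describe(weights, "kg")))
```

Note that the SD is reported here, not the SE, because the aim is to describe the scatter of the observations themselves.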

Analyses of Variance

In most situations, the only candidates from the analysis of variance table for presentation are the significance probabilities of the various factors and interactions and sometimes the residual variance or standard deviation. When included, they should be within the corresponding table of means, rather than in a separate table.

In general, authors should present relevant treatment means, a measure of their precision, and maybe significance probabilities. The treatment means are the primary information to be presented, with measures of precision being of secondary importance. The layout of the table should reflect these priorities; the secondary information should not obscure the main message of the data. The layout of tables of means depends, as is shown below, on the design of the experiment, in particular on:-

* whether the design is balanced i.e. equal numbers of observations per treatment;

* whether the treatments have a factorial structure;

* for factorial designs, whether or not there are interactions.

Measures of Precision

The measure of precision should be either a standard error of a mean (SE), a standard error of the difference between two means (SED), or a least significant difference (LSD). In the last case, the significance level used should be stated, e.g. 5% LSD. The SED and LSD are usually only suitable for balanced designs. For balanced designs, 5% LSD ≈ 2 × SED and SED = √2 × SE ≈ 1½ × SE.

Only one of these three statistics is necessary and it is important to make it clear which is being used. Preference is for the standard error (SE), because it is the simplest. We can always multiply it by 1½ or 3 to give the SED or 5% LSD respectively, and we can use the SE for both balanced and unbalanced situations. Measures of precision are usually presented with one more decimal place than the means. This is not a strict rule. For example, a mean of 74 with a standard error of 32 is fine, but a mean of 7.4, with a standard error of 0.3, should have the extra decimal place and be given as 0.32.
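The relationships between the three measures for a balanced design can be sketched as follows. The function name, the residual standard deviation of 2.4 and the 4 observations per treatment are hypothetical. The multiplier of 2 for the 5% LSD is the rule of thumb used in this section; an exact value would be the two-sided 5% point of the t distribution on the residual degrees of freedom.

```python
import math


def precision_measures(residual_sd, n_per_treatment, t_multiplier=2.0):
    """Return (SE, SED, LSD) for treatment means in a balanced design.

    t_multiplier=2.0 is an approximation to the 5% t value; replace it
    with the exact t value for the residual degrees of freedom if needed.
    """
    se = residual_sd / math.sqrt(n_per_treatment)  # SE of one mean
    sed = math.sqrt(2) * se                        # SE of a difference
    lsd = t_multiplier * sed                       # approximate 5% LSD
    return se, sed, lsd


# Hypothetical values: residual SD 2.4, 4 observations per treatment.
se, sed, lsd = precision_measures(2.4, 4)
print(f"SE = {se:.2f}, SED = {sed:.2f}, 5% LSD = {lsd:.2f}")
```

With these inputs the LSD is a little under three times the SE, which agrees with the rough multipliers of one and a half and three.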

Some researchers like to include the results of a multiple comparison procedure such as Fisher's LSD. These are added as a column with a series of letters, (a, b, c, etc) where treatments with the same letter are not significantly different. Often, though, these methods are abused. The common multiple comparison procedures are only valid when there is no "structure" in the set of treatments, e.g. when a number of different plant accessions or sources of protein are being compared.

In addition, a single standard error or LSD should be given in the balanced case, and individual standard errors in the unbalanced case. The results from a multiple comparison procedure are additional to, and not a substitute for, the reporting of the standard errors.

Single Factor Experiments

The most straightforward case is a balanced design with simple treatments. Here each treatment has the same precision, so only one SE (or SED or LSD) per variable is needed. In the table of results, each row should present means for one treatment; results for different variables are presented in columns. The statistical analysis results are presented as one or two additional rows: one giving SEs (or SEDs or LSDs) and the other possibly giving significance probabilities. If the F-probabilities are given, we suggest that the actual probabilities be reported, rather than just the levels of significance, e.g. 5% (or 0.05), 0.01 or 0.001. In particular, reporting that a result was "not significant", often written as "ns", is not helpful. In interpreting the results, it is sometimes useful to know whether the level of significance was 6% or 60%. If specified contrasts are of interest, e.g. polynomial contrasts for quantitative treatments, their individual F-test probabilities should be presented with or instead of the overall F-test.

For unbalanced experiments each treatment mean has a different precision. Then the best way of presenting results is to include a separate column of standard errors next to each column of means and also a column containing the number of observations. If the number of observations per group is the same for the different variables, then only one column of numbers should be presented (usually the first column). Otherwise a separate column will be needed for each variable. If there is little variation in the number of observations per treatment, and therefore little variation in the standard errors, it is sometimes possible to use an "average" SE or SED and present results as if the experiment was balanced. This should only be done if it does not distort the results, and it should be clearly stated that this procedure has been used. Alternatively, if the numbers per group are not too unequal, another method is to present the residual standard deviation for each variable instead of a column of individual standard errors. If the groups are of very unequal size, the above compromises should not be used and the number of variables presented in any one table should be reduced.
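As an illustration of the unbalanced layout, the sketch below builds one row per treatment with the number of observations, the mean and its own standard error. It assumes a one-way analysis with a pooled residual standard deviation, so that each SE is the residual SD divided by the square root of that treatment's number of observations. All names and numbers are hypothetical.

```python
import math


def unbalanced_rows(means, counts, residual_sd):
    """Return (treatment, n, mean, SE) rows for an unbalanced design.

    Assumes a pooled residual SD, so SE_i = residual_sd / sqrt(n_i).
    """
    rows = []
    for treatment, mean in means.items():
        n = counts[treatment]
        rows.append((treatment, n, mean, residual_sd / math.sqrt(n)))
    return rows


# Hypothetical treatment means, group sizes and residual SD.
means = {"Control": 34.2, "Low": 36.8, "High": 39.1}
counts = {"Control": 9, "Low": 16, "High": 4}
rows = unbalanced_rows(means, counts, residual_sd=2.4)

print(f"{'Treatment':<10}{'n':>4}{'Mean':>8}{'SE':>8}")
for treatment, n, mean, se in rows:
    # SE shown with one more decimal place than the mean.
    print(f"{treatment:<10}{n:>4}{mean:>8.1f}{se:>8.2f}")
```

The spread of the SEs in such a table also shows whether a single "average" SE would distort the results.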

Factorial Experiments

For factorial experiments there is usually more statistical information to present. Also the means to be presented will depend on whether or not there are interactions which are both statistically significant and of practical importance.

This section discusses two-factor experiments, but the recommendations can be easily extended to more complex cases. It also assumes a balanced experiment with equal numbers of observations for each treatment. However, the recommendations can be combined in a fairly straightforward manner with those above for the unbalanced case.

If there is no interaction then the "main effect" means should be presented. For example, a 3 × 2 factorial experiment on sheep nutrition might have three "levels" of supplementation (None, Medium and High) and two levels of parasite control (None and Drenched), giving six treatments in total. There are five main effect means: three means for the levels of supplementation, averaged over the two levels of parasite control, and two means for the levels of parasite control. In this example there would also be two SEs and two significance probabilities for each variable, corresponding to the two factors.
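The five main effect means and their two SEs can be sketched as below for a balanced design. The cell means, the residual SD of 2.4 and the 8 replicates per treatment are invented for illustration. Averaging cell means in this simple way is only appropriate when every treatment has the same number of observations.

```python
import math
from statistics import mean

# Hypothetical cell means, keyed by (supplementation, parasite control).
cell_means = {
    ("None", "None"): 20.0, ("None", "Drenched"): 24.0,
    ("Medium", "None"): 23.0, ("Medium", "Drenched"): 27.0,
    ("High", "None"): 26.0, ("High", "Drenched"): 30.0,
}


def main_effect_means(cells, factor_index):
    """Average the cell means over the levels of the other factor."""
    groups = {}
    for levels, value in cells.items():
        groups.setdefault(levels[factor_index], []).append(value)
    return {level: mean(values) for level, values in groups.items()}


def main_effect_se(residual_sd, reps_per_cell, levels_of_other_factor):
    """SE of a main effect mean: each mean is based on
    reps_per_cell * levels_of_other_factor observations."""
    return residual_sd / math.sqrt(reps_per_cell * levels_of_other_factor)


supplement = main_effect_means(cell_means, 0)  # three means
parasite = main_effect_means(cell_means, 1)    # two means
se_supplement = main_effect_se(2.4, 8, levels_of_other_factor=2)
se_parasite = main_effect_se(2.4, 8, levels_of_other_factor=3)

print(supplement, f"SE = {se_supplement:.2f}")
print(parasite, f"SE = {se_parasite:.2f}")
```

The two factors have different SEs because their means are averaged over different numbers of observations.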

If there are interactions which are statistically significant and of practical importance, then main effect means alone are of limited use. In this case, the individual treatment means should be presented. For a balanced design, there is now only one SE per variable (except for split-plot designs), but three rows giving F-test probabilities for the two main effects and the interaction. Additional rows for F-test probabilities can be used for results of polynomial contrasts for quantitative factors or other pre-planned contrasts.

Regression Analysis

The key results of a linear regression analysis are usually the regression coefficient (b), its standard error, the intercept or constant term, the correlation coefficient (r) and the residual standard deviation. For multiple regression there will be a number of coefficients and SEs, and the coefficient of determination (R²) will replace r. If a number of similar regression analyses have been done the relevant results can be presented in a table, with one column for each parameter.

If results of just one or two regression analyses are presented, they can be incorporated in the text. This can either be done by presenting the regression equation as in :-

'The regression equation relating dry matter yield (DM, kg/ha) to amount of phosphorus applied (P, kg/ha) is DM = 1815 + 32.1P (r = 0.74, SE of regression coefficient = 8.9).'

or presenting individual parameters as in:-

'Linear regression analysis showed that increasing the amount of phosphorus applied by 1 kg/ha increased dry matter yield by 32.1 kg/ha (SE = 8.9). The correlation coefficient was 0.74.'
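The quantities quoted in such a sentence can be computed for a simple linear regression as sketched below. The function name and the data are hypothetical; the data are made up and do not reproduce the equation quoted above.

```python
import math


def simple_linear_regression(x, y):
    """Least-squares fit of y = a + b*x, with the key reporting statistics."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    b = sxy / sxx                        # regression coefficient (slope)
    a = mean_y - b * mean_x              # intercept
    r = sxy / math.sqrt(sxx * syy)       # correlation coefficient
    residual_ss = syy - b * sxy          # residual sum of squares
    residual_sd = math.sqrt(residual_ss / (n - 2))
    se_b = residual_sd / math.sqrt(sxx)  # SE of the slope
    return {"a": a, "b": b, "se_b": se_b, "r": r, "residual_sd": residual_sd}


# Hypothetical data: phosphorus applied (kg/ha) and dry matter yield (kg/ha).
p_applied = [0, 10, 20, 30, 40]
dm_yield = [1800, 2200, 2400, 2900, 3100]

fit = simple_linear_regression(p_applied, dm_yield)
print(
    f"DM = {fit['a']:.0f} + {fit['b']:.1f}P "
    f"(r = {fit['r']:.2f}, SE of regression coefficient = {fit['se_b']:.1f})"
)
```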

It is often revealing to present a graph of the regression line. If there is only one line to present on a graph, the individual points should also be included. With more than one line, including the points is not always necessary and tends to be confusing. Details of the regression equation(s) and correlation coefficient(s) can be included with the graph if there is sufficient space. If this information would obscure the message of the graph, then it should be presented elsewhere.

Error Bars on Graphs and Charts

Error bars displayed on graphs or charts are sometimes very informative, while in other cases they obscure the trends which the picture is meant to demonstrate. The decision on whether or not to include error bars within the chart, or give the information as part of the caption, should depend on whether they make the graph clearer or not.

If error bars are displayed, it must be clear whether the bars refer to standard deviations, standard errors, least significant differences, confidence intervals or ranges. Where error bars representing, say, standard errors are presented, then one of the two methods below should be used.

(a) the bar is centred on the mean, with one SE above the mean and one SE below the mean, i.e. the bar has a total length of twice the SE.

(b) the bar appears either completely above or completely below the mean, and represents one SE.

If the error bar has the same length for all points in the graph, then it should be drawn only once and placed to one side of the graph, rather than on the points. This occurs with the results of balanced experiments.
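The two ways of drawing an SE bar, and the check for a single shared bar, can be sketched as follows. The function names are hypothetical and the code only computes the end points of each bar; the drawing itself is left to whatever plotting tool is in use.

```python
def centred_bar(mean, se):
    """Method (a): bar centred on the mean, total length twice the SE."""
    return (mean - se, mean + se)


def one_sided_bar(mean, se, above=True):
    """Method (b): bar wholly above or wholly below the mean, length one SE."""
    return (mean, mean + se) if above else (mean - se, mean)


def common_bar_length(ses, tol=1e-9):
    """Return the shared SE if every point has the same SE, else None.

    A shared value means the bar can be drawn once, beside the graph.
    """
    first = ses[0]
    return first if all(abs(s - first) <= tol for s in ses) else None


# Hypothetical mean and SE.
print(centred_bar(10.0, 0.5))
print(one_sided_bar(10.0, 0.5))
print(common_bar_length([0.5, 0.5, 0.5]))
```

Whichever method is chosen, the caption should still state that the bars represent standard errors.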