May 2010

One sample t-test

A one sample t-test tests whether a sample mean differs significantly from a hypothesized value.
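As a minimal sketch of the calculation, the t statistic is the difference between the sample mean and the hypothesized value, divided by the standard error of the mean. The data and the function name below are hypothetical and chosen only for illustration; the p-value would then be read from a t distribution with n - 1 degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for a one sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample SD, n - 1 denominator
    se = sd / math.sqrt(n)             # standard error of the mean
    return (mean - hypothesized_mean) / se, n - 1

# Hypothetical scores, tested against a hypothesized mean of 50
scores = [52, 48, 55, 60, 45, 50, 58, 53]
t_stat, df = one_sample_t(scores, 50)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```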

One sample median test

A one sample median test is used to test whether a sample median differs significantly from a hypothesized value.

Binomial test

A binomial test is used to test whether the proportion of successes on a two-level categorical dependent variable differs significantly from a hypothesized value.
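One way to sketch an exact binomial test is to sum the probabilities of every outcome that is no more likely than the observed one under the hypothesized proportion. This is only one common definition of a two-sided p-value, and the counts used here are hypothetical.

```python
from math import comb

def binomial_test_two_sided(successes, n, p0):
    """Exact two-sided p-value under the hypothesized proportion p0."""
    def pmf(k):
        return comb(n, k) * p0**k * (1 - p0)**(n - k)

    observed = pmf(successes)
    # The small tolerance guards against floating point ties
    return sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))

# Hypothetical example: 8 successes in 10 trials, hypothesized proportion 0.5
p_value = binomial_test_two_sided(8, 10, 0.5)
print(f"p = {p_value:.4f}")
```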

Chi-square goodness of fit

A chi-square goodness of fit test is used to test whether the observed proportions for a categorical variable differ from hypothesized proportions. For example, let's suppose that we believe that the general population consists of x% Hispanic, y% Asian, z% African American and a% White folks. We want to test whether the observed proportions from our sample differ significantly from these hypothesized proportions.
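The statistic behind this test can be sketched as follows: expected counts are the hypothesized proportions times the sample size, and the statistic sums (observed - expected)² / expected over the categories. The counts and proportions below are invented for illustration and are not taken from any real population.

```python
def chi_square_gof(observed, hypothesized_props):
    """Return (chi-square statistic, degrees of freedom, expected counts)."""
    total = sum(observed)
    expected = [total * p for p in hypothesized_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1, expected

# Hypothetical sample counts for four groups and hypothesized proportions
observed = [24, 11, 20, 145]
props = [0.10, 0.10, 0.10, 0.70]
stat, df, expected = chi_square_gof(observed, props)
print(f"chi-square = {stat:.3f} on {df} degrees of freedom")
```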

Two independent samples t-test

An independent samples t-test is used when you want to compare the means of a normally distributed interval dependent variable for two independent groups.

Wilcoxon-Mann-Whitney test

The Wilcoxon-Mann-Whitney test is a non-parametric analog to the independent samples t-test and can be used when you do not assume that the dependent variable is a normally distributed interval variable (you only assume that the variable is at least ordinal).
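Because the test only assumes an ordinal variable, it works on ranks rather than raw values. The sketch below, with hypothetical data and helper names, ranks the pooled observations (ties get the average rank) and derives the U statistic for each group from the rank sum of the first group.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return the U statistics (U_a, U_b) for two independent groups."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a

# Hypothetical ordinal scores for two independent groups
u_a, u_b = mann_whitney_u([3, 5, 8, 9], [1, 2, 4, 6, 7])
print(u_a, u_b)
```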

Chi-square test

A chi-square test is used when you want to see if there is a relationship between two categorical variables. Remember that the chi-square test assumes that the expected value for each cell is five or higher.

Fisher's exact test

The Fisher's exact test is used when you want to conduct a chi-square test but one or more of your cells has an expected frequency of less than five. Remember that the chi-square test assumes that each cell has an expected frequency of five or more, but the Fisher's exact test has no such assumption and can be used regardless of how small the expected frequency is.
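The following sketch, for a 2 x 2 table only and with made-up counts, first computes the expected cell frequencies used to check the chi-square assumption and then an exact two-sided p-value from the hypergeometric distribution. The function names are hypothetical, and the two-sided rule (summing tables no more likely than the observed one) is one common convention.

```python
from math import comb

def expected_counts(table):
    """Expected cell frequencies under independence of rows and columns."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def fisher_exact_2x2(table):
    """Exact two-sided p-value for a 2 x 2 table with fixed margins."""
    (a, b), (c, d) = table
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    observed = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= observed * (1 + 1e-9))

# Hypothetical table: every expected count is below five
table = [[3, 1], [1, 3]]
expected = expected_counts(table)
p_value = fisher_exact_2x2(table)
print(expected, f"p = {p_value:.4f}")
```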

One-way ANOVA

A one-way analysis of variance (ANOVA) is used when you have a categorical independent variable (with two or more categories) and a normally distributed interval dependent variable and you wish to test for differences in the means of the dependent variable broken down by the levels of the independent variable.
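A rough sketch of the underlying arithmetic: the F statistic is the ratio of the between-group mean square to the within-group mean square. The three groups below are hypothetical.

```python
import statistics

def one_way_anova(groups):
    """Return (F statistic, between-groups df, within-groups df)."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data: one list of observations per level of the factor
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```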

Kruskal Wallis test

The Kruskal Wallis test is used when you have one independent variable with two or more levels and an ordinal dependent variable. In other words, it is the non-parametric version of ANOVA and a generalized form of the Mann-Whitney test method since it permits two or more groups.

Paired t-test

A paired (samples) t-test is used when you have two related observations (i.e., two observations per subject) and you want to see if the means on these two normally distributed interval variables differ from one another.
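In practice the paired test reduces to a one sample t-test on the within-subject differences, tested against zero. A minimal sketch with hypothetical before and after measurements:

```python
import math
import statistics

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired observations."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)   # SE of the mean difference
    return statistics.mean(diffs) / se, n - 1

# Hypothetical data: two observations per subject
before = [10, 12, 9, 14, 11]
after = [12, 14, 10, 17, 12]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```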

Wilcoxon signed rank sum test

The Wilcoxon signed rank sum test is the non-parametric version of a paired samples t-test. You use the Wilcoxon signed rank sum test when you do not wish to assume that the difference between the two variables is interval and normally distributed (but you do assume the difference is ordinal).

One-way repeated measures ANOVA

You would perform a one-way repeated measures analysis of variance if you had one categorical independent variable and a normally distributed interval dependent variable that was repeated at least twice for each subject. This is the equivalent of the paired samples t-test, but allows for two or more levels of the categorical variable. This tests whether the mean of the dependent variable differs by the categorical variable.

Factorial ANOVA

A factorial ANOVA has two or more categorical independent variables (either with or without the interactions) and a single normally distributed interval dependent variable.

Friedman test

You perform a Friedman test when you have one within-subjects independent variable with two or more levels and a dependent variable that is not interval and normally distributed (but at least ordinal). For example, with reading, writing and math scores recorded for each subject, this test can determine whether there is a difference between the three types of score. The null hypothesis in this test is that the distribution of the ranks of each type of score (i.e., reading, writing and math) is the same.

Factorial logistic regression

A factorial logistic regression is used when you have two or more categorical independent variables but a dichotomous dependent variable.

Correlation

A correlation is useful when you want to see the relationship between two (or more) normally distributed interval variables. By squaring the correlation and then multiplying by 100, you can determine what percentage of the variability is shared.
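The squaring step can be illustrated with a small sketch. The data are hypothetical; the function computes the Pearson correlation from sums of squares and cross-products, and the shared variability is r² times 100.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical interval data
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
shared_percent = r ** 2 * 100
print(f"r = {r:.2f}, shared variability = {shared_percent:.0f}%")
```

With these numbers r is 0.8, so the two variables share 64% of their variability.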

Simple linear regression

Simple linear regression allows us to look at the linear relationship between one normally distributed interval predictor and one normally distributed interval outcome variable.

Non-parametric correlation

A Spearman correlation is used when one or both of the variables are not assumed to be normally distributed and interval (but are assumed to be ordinal). The values of the variables are converted to ranks and then correlated.
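The rank-then-correlate idea can be sketched directly. The helper names and data are hypothetical, and the simple ranking helper is written for clarity rather than speed; tied values receive the average of their ranks.

```python
import math

def average_ranks(values):
    """1-based ranks; ties get the average of their first and last position."""
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2
            for v in values]

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: a monotonic but non-linear relationship
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
print(f"rho = {rho:.2f}")
```

Because the relationship is perfectly monotonic, the ranks agree exactly and rho is 1 even though the raw values are not linearly related.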

Simple logistic regression

Logistic regression assumes that the outcome variable is binary (i.e., coded as 0 and 1).

Multiple regression

Multiple regression is very similar to simple regression, except that in multiple regression you have more than one predictor variable in the equation.

Analysis of covariance

Analysis of covariance is like ANOVA, except that in addition to the categorical predictors you also have continuous predictors.

Multiple logistic regression

Multiple logistic regression is like simple logistic regression, except that there are two or more predictors. The predictors can be interval variables or dummy variables, but cannot be categorical variables. If you have categorical predictors, they should be coded into one or more dummy variables.
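Dummy coding can be sketched as follows: a categorical predictor with k levels becomes k - 1 indicator (0/1) variables, with one level left out as the reference. The function name, the variable names and the levels are all hypothetical.

```python
def dummy_code(values, reference):
    """Turn a categorical variable into 0/1 indicators, omitting the reference level."""
    levels = sorted(set(values) - {reference})
    return [{f"is_{level}": int(v == level) for level in levels}
            for v in values]

# Hypothetical three-level predictor with "low" as the reference category
rows = dummy_code(["low", "mid", "high", "mid"], reference="low")
for row in rows:
    print(row)
```

The reference level is represented by all indicators being zero, so no information is lost by dropping it.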

Discriminant analysis

Discriminant analysis is used when you have one or more normally distributed interval independent variables and a categorical dependent variable. It is a multivariate technique that considers the latent dimensions in the independent variables for predicting group membership in the categorical dependent variable.

One-way MANOVA

MANOVA (multivariate analysis of variance) is like ANOVA, except that there are two or more dependent variables. In a one-way MANOVA, there is one categorical independent variable and two or more dependent variables.

Multivariate multiple regression

Multivariate multiple regression is used when you have two or more variables that are to be predicted from two or more predictor variables.

Canonical correlation

Canonical correlation is a multivariate technique used to examine the relationship between two groups of variables. For each set of variables, it creates latent variables and looks at the relationships among the latent variables. It assumes that all variables in the model are interval and normally distributed.

Factor analysis

Factor analysis is a form of exploratory multivariate analysis that is used to either reduce the number of variables in a model or to detect relationships among variables. All variables involved in the factor analysis need to be interval and are assumed to be normally distributed. The goal of the analysis is to try to identify factors which underlie the variables. There may be fewer factors than variables, but there may not be more factors than variables.

Results of statistical analyses can indicate the precision of the results, give further description of the data or demonstrate the statistical significance of comparisons. Where statistical significance is referred to in text, the reference should be included in such a way as to minimize disruption to the flow of the text. Significance probabilities can either be presented by reference to conventional levels, e.g. (P < 0.05) or, more informatively, by stating the exact probability, e.g. (P = 0.023). An alternative to including a large number of statements about significance is to include an overall covering sentence at the beginning of the results section, or some other suitable position. An example of such a sentence is:- 'All treatment differences referred to in the results are statistically significant at least at the 5% level unless otherwise stated.'

Descriptive Statistics

When simply describing a set of data with summary statistics, useful statistics to present are the mean, the number of observations and a measure of the variation or "scatter" of the observations, as well as the units of measurement. The range and the standard deviation (SD) are useful measures of the variation in the data. The standard error (SE) is not relevant in this context, since it measures the precision with which the mean of the data estimates the mean of a larger population. If there are a large number of variables to be described, the means, SDs etc. should be presented in a table. However, if there are only one or two variables, these results can be included in the text.

For example:- 'The initial weights of 48 ewes in the study had a mean of 34.7 kg and ranged from 29.2 to 38.6 kg.' or

'The mean initial weight of ewes in the study was 34.7 kg (n = 48, SD = 2.61).'
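A summary of this kind can be produced with a few lines of code. The sketch below uses a small hypothetical set of weights (not the 48 ewes of the example) and reports the mean, the number of observations, the SD and the range, together with the unit.

```python
import statistics

def describe(values, unit):
    """Summary statistics suitable for describing a single variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),   # sample SD, n - 1 denominator
        "min": min(values),
        "max": max(values),
        "unit": unit,
    }

# Hypothetical initial weights in kg
weights = [31.2, 34.7, 36.1, 29.8, 35.4, 33.0]
summary = describe(weights, "kg")
sentence = (f"The mean initial weight was {summary['mean']:.1f} {summary['unit']} "
            f"(n = {summary['n']}, SD = {summary['sd']:.2f}).")
print(sentence)
```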

Analyses of Variance

In most situations, the only candidates from the analysis of variance table for presentation are the significance probabilities of the various factors and interactions and sometimes the residual variance or standard deviation. When included, they should be within the corresponding table of means, rather than in a separate table.

In general, authors should present relevant treatment means, a measure of their precision, and maybe significance probabilities. The treatment means are the primary information to be presented, with measures of precision being of secondary importance. The layout of the table should reflect these priorities; the secondary information should not obscure the main message of the data. The layout of tables of means depends, as is shown below, on the design of the experiment, in particular on:-

* whether the design is balanced i.e. equal numbers of observations per treatment;

* whether the treatments have a factorial structure;

* for factorial designs, whether or not there are interactions.

Measures of Precision

The measure of precision should be either a standard error of a mean (SE), a standard error of the difference between two means (SED), or a least significant difference (LSD). In the last case, the significance level used should be stated, e.g. 5% LSD. The SED and LSD are usually only suitable for balanced designs. For balanced designs, 5% LSD ≈ 2 × SED and SED = √2 × SE ≈ 1½ × SE.
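The relationships between the three measures in a balanced design can be sketched numerically. The residual standard deviation and replication below are hypothetical, SE is taken as the residual SD divided by the square root of the number of observations per treatment, and the multiplier of 2 for the 5% LSD is the approximation used in the text rather than an exact t value.

```python
import math

def precision_measures(residual_sd, n_per_treatment, t_multiplier=2.0):
    """Return (SE, SED, approximate 5% LSD) for a balanced design."""
    se = residual_sd / math.sqrt(n_per_treatment)   # SE of a treatment mean
    sed = math.sqrt(2) * se                         # SE of a difference of two means
    lsd = t_multiplier * sed                        # approximate 5% LSD
    return se, sed, lsd

# Hypothetical values: residual SD of 2.4 with 4 observations per treatment
se, sed, lsd = precision_measures(residual_sd=2.4, n_per_treatment=4)
print(f"SE = {se:.2f}, SED = {sed:.2f}, 5% LSD = {lsd:.2f}")
```

Here the LSD comes out at roughly three times the SE, which is why a reader given the SE can recover the other two measures.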

Only one of these three statistics is necessary and it is important to make it clear which is being used. Preference is for the standard error (SE). It is the simplest. We can always multiply by 1½ or 3 to give the SED or 5% LSD, and we can use the SE for both balanced and unbalanced situations. Measures of precision are usually presented with one more decimal place than the means. This is not a strict rule. For example, a mean of 74 with a standard error of 32 is fine, but for a mean of 7.4 the standard error should have the extra decimal place and be given as 0.32 rather than 0.3.

Some researchers like to include the results of a multiple comparison procedure such as Fisher's LSD. These are added as a column with a series of letters, (a, b, c, etc) where treatments with the same letter are not significantly different. Often, though, these methods are abused. The common multiple comparison procedures are only valid when there is no "structure" in the set of treatments, e.g. when a number of different plant accessions or sources of protein are being compared.

In addition, a single standard error or LSD should be given in the balanced case, and individual standard errors in the unbalanced case. The results from a multiple comparison procedure are additional to, and not a substitute for, the reporting of the standard errors.

Single Factor Experiments

The most straightforward case is a balanced design with simple treatments. Here each treatment has the same precision, so only one SE (or SED or LSD) per variable is needed. In the table of results, each row should present means for one treatment; results for different variables are presented in columns. The statistical analysis results are presented as one or two additional rows: one giving SEs (or SEDs or LSDs) and the other possibly giving significance probabilities. If the F-probabilities are given, we suggest that the actual probabilities be reported, rather than just the levels of significance, e.g. 5% (or 0.05), 0.01 or 0.001. In particular, reporting that a result was "not significant", often written as "ns", is not helpful. In interpreting the results, it is sometimes useful to know if the level of significance was 6% or 60%. If specified contrasts are of interest, e.g. polynomial contrasts for quantitative treatments, their individual F-test probabilities should be presented with or instead of the overall F-test.

For unbalanced experiments each treatment mean has a different precision. Then the best way of presenting results is to include a separate column of standard errors next to each column of means and also a column containing the number of observations. If the number of observations per group is the same for the different variables, then only one column of numbers should be presented (usually the first column). Otherwise a separate column will be needed for each variable. If there is little variation in the number of observations per treatment, and therefore little variation in the standard errors, it is sometimes possible to use an "average" SE or SED and present results as if the experiment was balanced. This should only be done if it does not distort the results, and it should be clearly stated that this procedure has been used. Alternatively, if the numbers per group are not too unequal, another method is to present the residual standard deviation for each variable instead of a column of individual standard errors. If the groups are of very unequal size, the above compromises should not be used and the number of variables presented in any one table should be reduced.

Factorial Experiments

For factorial experiments there is usually more statistical information to present. Also the means to be presented will depend on whether or not there are interactions which are both statistically significant and of practical importance.

This section discusses two-factor experiments, but the recommendations can be easily extended to more complex cases. It also assumes a balanced experiment with equal numbers of observations for each treatment. However, the recommendations can be combined in a fairly straightforward manner with those above for the unbalanced case.

If there is no interaction then the "main effect" means should be presented. For example, a 3 × 2 factorial experiment on sheep nutrition might have three "levels" of supplementation (None, Medium and High) and two levels of parasite control (None and Drenched), giving six treatments in total. There are five main effect means: three means for the levels of supplementation, averaged over the two levels of parasite control, and two means for the levels of parasite control. In this example there would also be two SEs and two significance probabilities for each variable, corresponding to the two factors.
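The five main effect means in the sheep nutrition example can be sketched as averages of the six treatment means. The numbers below are invented, and simple averaging of cell means is only appropriate for the balanced case assumed in this section.

```python
# Hypothetical treatment means, keyed by (supplementation, parasite control)
cell_means = {
    ("None", "None"): 10.0, ("None", "Drenched"): 12.0,
    ("Medium", "None"): 14.0, ("Medium", "Drenched"): 16.0,
    ("High", "None"): 18.0, ("High", "Drenched"): 20.0,
}

def main_effect_means(cell_means, factor_index):
    """Average the treatment means over the levels of the other factor."""
    grouped = {}
    for levels, mean in cell_means.items():
        grouped.setdefault(levels[factor_index], []).append(mean)
    return {level: sum(v) / len(v) for level, v in grouped.items()}

supplement_means = main_effect_means(cell_means, 0)   # three means
parasite_means = main_effect_means(cell_means, 1)     # two means
print(supplement_means)
print(parasite_means)
```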

If there are interactions which are statistically significant and of practical importance, then main effect means alone are of limited use. In this case, the individual treatment means should be presented. For a balanced design, there is now only one SE per variable (except for split-plot designs), but three rows giving F-test probabilities for the two main effects and the interaction. Additional rows for F-test probabilities can be used for results of polynomial contrasts for quantitative factors or other pre-planned contrasts.

Regression Analysis

The key results of a linear regression analysis are usually the regression coefficient (b), its standard error, the intercept or constant term, the correlation coefficient (r) and the residual standard deviation. For multiple regression there will be a number of coefficients and SEs, and the coefficient of determination (R²) will replace r. If a number of similar regression analyses have been done the relevant results can be presented in a table, with one column for each parameter.
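For a single predictor, the quantities listed above can be computed from sums of squares and cross-products. The sketch below uses hypothetical yield data and a hypothetical function name; it returns the regression coefficient, its standard error, the intercept, the correlation coefficient and the residual standard deviation.

```python
import math

def simple_regression(x, y):
    """Least squares fit of y = intercept + b * x with its key summary results."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((c - my) ** 2 for c in y)
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    b = sxy / sxx
    intercept = my - b * mx
    residual_sd = math.sqrt((syy - b * sxy) / (n - 2))
    return {
        "b": b,
        "se_b": residual_sd / math.sqrt(sxx),
        "intercept": intercept,
        "r": sxy / math.sqrt(sxx * syy),
        "residual_sd": residual_sd,
    }

# Hypothetical data: phosphorus applied (kg/ha) and dry matter yield (kg/ha)
phosphorus = [0, 10, 20, 30, 40]
dry_matter = [1800, 2200, 2400, 2900, 3100]
fit = simple_regression(phosphorus, dry_matter)
print(f"DM = {fit['intercept']:.0f} + {fit['b']:.1f}P "
      f"(r = {fit['r']:.2f}, SE of regression coefficient = {fit['se_b']:.1f})")
```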

If results of just one or two regression analyses are presented, they can be incorporated in the text. This can either be done by presenting the regression equation as in:-

'The regression equation relating dry matter yield (DM, kg/ha) to amount of phosphorus applied (P, kg/ha) is DM = 1815 + 32.1P (r = 0.74, SE of regression coefficient = 8.9).'

or presenting individual parameters as in:-

'Linear regression analysis showed that increasing the amount of phosphorus applied by 1 kg/ha increased dry matter yield by 32.1 kg/ha (SE = 8.9). The correlation coefficient was 0.74.'

It is often revealing to present a graph of the regression line. If there is only one line to present on a graph, the individual points should also be included. With more than one line, including the points is not always necessary and tends to be confusing. Details of the regression equation(s) and correlation coefficient(s) can be included with the graph if there is sufficient space. If this information would obscure the message of the graph, then it should be presented elsewhere.

Error Bars on Graphs and Charts

Error bars displayed on graphs or charts are sometimes very informative, while in other cases they obscure the trends which the picture is meant to demonstrate. The decision on whether or not to include error bars within the chart, or give the information as part of the caption, should depend on whether they make the graph clearer or not.

If error bars are displayed, it must be clear whether the bars refer to standard deviations, standard errors, least significant differences, confidence intervals or ranges. Where error bars representing, say, standard errors are presented, then one of the two methods below should be used.

(a) the bar is centred on the mean, with one SE above the mean and one SE below the mean, i.e. the bar has a total length of twice the SE.

(b) the bar appears either completely above or completely below the mean, and represents one SE.

If the error bar has the same length for all points in the graph, then it should be drawn only once and placed to one side of the graph, rather than on the points. This occurs with the results of balanced experiments.
