Statistics
Choosing a statistical test
Rajib Biswas | 25 May 2020

Choosing the right statistical method for your data is a critical decision. Without a clear understanding of your variables and of what you want to find out, it is very difficult to reach a meaningful result. There are several ways to decide which statistical test fits the data you have and the question you are asking. In this post I'm going to give some of the most useful ones.
From Intuitive Biostatistics
Chapter 37 of the book Intuitive Biostatistics by Harvey Motulsky (Copyright © 1995 by Oxford University Press Inc.) contains the following beautiful table, which I think is a very good starting point for deciding on a test.
Rows give the goal of the analysis; columns give the type of data.
Goal | Measurement (from Gaussian Population) | Rank, Score, or Measurement (from Non-Gaussian Population) | Binomial (Two Possible Outcomes) | Survival Time |
Describe one group | Mean, SD | Median, interquartile range | Proportion | Kaplan-Meier survival curve |
Compare one group to a hypothetical value | One-sample t test | Wilcoxon test | Chi-square or Binomial test** | |
Compare two unpaired groups | Unpaired t test | Mann-Whitney test | Fisher’s test (chi-square for large samples) | Log-rank test or Mantel-Haenszel* |
Compare two paired groups | Paired t test | Wilcoxon test | McNemar’s test | Conditional proportional hazards regression* |
Compare three or more unmatched groups | One-way ANOVA | Kruskal-Wallis test | Chi-square test | Cox proportional hazard regression** |
Compare three or more matched groups | Repeated-measures ANOVA | Friedman test | Cochran’s Q** | Conditional proportional hazards regression** |
Quantify association between two variables | Pearson correlation | Spearman correlation | Contingency coefficients** | |
Predict value from another measured variable | Simple linear regression or Nonlinear regression | Nonparametric regression** | Simple logistic regression* | Cox proportional hazard regression* |
Predict value from several measured or binomial variables | Multiple linear regression* or Multiple nonlinear regression** | | Multiple logistic regression* | Cox proportional hazard regression* |
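To make the table easier to use in practice, here is a minimal sketch that encodes it as a Python lookup. The names `DATA_TYPES`, `TABLE` and `suggest_test` are my own and purely illustrative; they do not come from the book. The footnote asterisks are left out, and in the last row I assume the empty cell belongs to the rank/non-Gaussian column.

```python
# Illustrative lookup built from the table above (names are hypothetical).
# Footnote asterisks from the table are omitted.

# Short labels for the four data-type columns, in table order.
DATA_TYPES = ("gaussian", "nongaussian", "binomial", "survival")

# Each goal maps to one entry per data type, in DATA_TYPES order.
# None marks a cell that is empty in the table.
TABLE = {
    "describe one group": (
        "Mean, SD",
        "Median, interquartile range",
        "Proportion",
        "Kaplan-Meier survival curve",
    ),
    "compare one group to a hypothetical value": (
        "One-sample t test",
        "Wilcoxon test",
        "Chi-square or Binomial test",
        None,
    ),
    "compare two unpaired groups": (
        "Unpaired t test",
        "Mann-Whitney test",
        "Fisher's test (chi-square for large samples)",
        "Log-rank test or Mantel-Haenszel",
    ),
    "compare two paired groups": (
        "Paired t test",
        "Wilcoxon test",
        "McNemar's test",
        "Conditional proportional hazards regression",
    ),
    "compare three or more unmatched groups": (
        "One-way ANOVA",
        "Kruskal-Wallis test",
        "Chi-square test",
        "Cox proportional hazard regression",
    ),
    "compare three or more matched groups": (
        "Repeated-measures ANOVA",
        "Friedman test",
        "Cochran's Q",
        "Conditional proportional hazards regression",
    ),
    "quantify association between two variables": (
        "Pearson correlation",
        "Spearman correlation",
        "Contingency coefficients",
        None,
    ),
    "predict value from another measured variable": (
        "Simple linear regression or Nonlinear regression",
        "Nonparametric regression",
        "Simple logistic regression",
        "Cox proportional hazard regression",
    ),
    # Assumption: the empty cell in this row is the non-Gaussian column.
    "predict value from several measured or binomial variables": (
        "Multiple linear regression or Multiple nonlinear regression",
        None,
        "Multiple logistic regression",
        "Cox proportional hazard regression",
    ),
}


def suggest_test(goal, data_type):
    """Return the table entry for a goal and data type (None if the cell is empty)."""
    key = goal.strip().lower()
    if key not in TABLE:
        raise ValueError(f"unknown goal: {goal!r}")
    if data_type not in DATA_TYPES:
        raise ValueError(f"unknown data type: {data_type!r}")
    return TABLE[key][DATA_TYPES.index(data_type)]


print(suggest_test("Compare two unpaired groups", "nongaussian"))
# prints: Mann-Whitney test
```

A lookup like this only tells you the name of a candidate test. You still have to check that your data really fit the column you picked, for example whether the measurements plausibly come from a Gaussian population.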
For further reading, have a look at this post by GraphPad.