Lecture 17
Choosing a test

ABD 3e Interleaf 7 (p. 368)

Chris Merkord

Learning Objectives

Select an appropriate statistical test for a given research question and dataset
Use information about variables, study design, and number of groups to guide test selection
Evaluate whether the assumptions of a chosen method are reasonable
Apply a logical sequence of questions (a mental flow chart) to choose a test

Which test should I use?

A very common situation: you have a question and you have collected data, but you don’t know which test to use to answer the question.
The following tables lay out your options.
In the future, you should be able to apply this thinking to your own research.

For now, you should be able to read an example of a question and data set and decide which test is appropriate.
For examples of questions, see the Practice and Assignment problems at the end of each chapter.
Now imagine you read a question but didn’t know which chapter it came from. Could you choose the appropriate statistical procedure to answer the question?

Goal: match the method to the data and study design

Start with the question
Identify the data:
- What type of variable is the response?
- What type of variable is the explanatory variable?
- Are samples independent or paired?
- How many groups are being compared?
Choose a method that matches the data and question
Check whether the assumptions of that method are reasonable
- If not, choose a better-suited method
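The sequence of questions above can be sketched as a small lookup function. This is only an illustration of the reasoning, not a complete decision tool: the function name, argument names, and returned strings are all hypothetical, and the branches simply mirror the tables in this lecture.

```python
def choose_test(response, explanatory=None, paired=False, n_groups=2, normal=True):
    """Suggest a test from this lecture's tables (illustrative sketch only).

    response, explanatory: "categorical" or "numerical"
                           (explanatory is None when there is one variable)
    paired:   True if the two samples are paired
    n_groups: number of groups defined by a categorical explanatory variable
    normal:   True if the normality assumption looks reasonable
    """
    # One variable only
    if explanatory is None:
        if response == "categorical":
            return "binomial test or chi-square goodness-of-fit test"
        return "one-sample t-test" if normal else "sign test"

    # Two variables, categorical response
    if response == "categorical":
        if explanatory == "categorical":
            return "contingency analysis"
        return "logistic regression"

    # Numerical response, numerical explanatory variable
    if explanatory == "numerical":
        if normal:
            return "linear correlation or linear regression"
        return "Spearman's rank correlation"

    # Numerical response, categorical explanatory variable: compare group means
    if n_groups > 2:
        return "ANOVA" if normal else "Kruskal-Wallis test"
    if paired:
        return "paired t-test" if normal else "sign test"
    return "Welch's two-sample t-test" if normal else "Wilcoxon rank-sum test"


# Example: numerical response, two independent groups, roughly normal data
print(choose_test("numerical", "categorical", paired=False, n_groups=2))
```

Note that the last argument encodes the "check assumptions" step: if normality is not reasonable, the same question leads to a method with fewer assumptions.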

Choosing a test follows a logical sequence of questions (a mental flow chart)

[Figure: flow chart that begins with the type of data (continuous, discrete, proportions) and branches through conditions such as number of groups and pairing to identify appropriate statistical tests such as t-tests, ANOVA, chi-square, and nonparametric tests.]
Figure 1: Flow chart illustrating how data type, number of groups, and study design guide the choice of a statistical test. Source: Osborne Nishimura Lab, Colorado State University (adapted from common statistical decision frameworks).

If you have one categorical variable…

Commonly used statistical tests for data on a single categorical variable. These methods test whether a population parameter equals the value proposed in the null hypothesis or whether a specific probability model fits a frequency distribution. Adapted from Whitlock & Schluter 3e. W.H. Freeman and Company.
Goal	Test
Use frequency data to test whether a population proportion equals a null hypothesized value	Binomial test (Lecture 9); \(\chi^2\) goodness-of-fit test with two categories (Lecture 10) (use if sample size is too large for the binomial test)
Use frequency data to test the fit of a specific population model	\(\chi^2\) goodness-of-fit test (Lecture 10)
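As a rough illustration of the two options for a single proportion, here is a minimal sketch using only the Python standard library. The function names and the example counts are made up. One assumption to flag: the two-sided binomial P-value below sums the probabilities of all outcomes no more likely than the observed one, which may differ slightly from the doubling rule used elsewhere when the null proportion is not 0.5. The \(\chi^2\) P-value shortcut is valid only for two categories (1 degree of freedom).

```python
from math import comb, erfc, sqrt


def binomial_test_two_sided(k, n, p0):
    """Exact two-sided binomial test of H0: p = p0, given k successes in n trials."""
    probs = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Sum every outcome that is no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    p_value = sum(p for p in probs if p <= observed * (1 + 1e-7))
    return min(1.0, p_value)


def chi2_gof_two_categories(observed, expected_props):
    """Chi-square goodness-of-fit statistic and P-value for two categories (df = 1)."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, P(chi2 > x) equals P(|Z| > sqrt(x)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Hypothetical small sample: 9 successes in 10 trials, H0: p = 0.5
print(binomial_test_two_sided(9, 10, 0.5))

# Hypothetical larger sample: 60 vs 40, H0: equal proportions
print(chi2_gof_two_categories([60, 40], [0.5, 0.5]))
```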

If you have one numerical variable…

Commonly used statistical tests for data on a single numerical variable. These methods test whether a population parameter equals the value proposed in the null hypothesis or whether a specific probability model fits a frequency distribution. Adapted from Whitlock & Schluter 3e. W.H. Freeman and Company.
Goal	Test
Test whether the mean equals a null hypothesized value when data are approximately normal (possibly only after a transformation)	One-sample \(t\)-test (Lecture 14)
Test whether the median equals a null hypothesized value when data are not normal (even after transformation)	Sign test (Lecture 16)
Use frequency data to test the fit of a discrete probability distribution	\(\chi^2\) goodness-of-fit test (Lecture 11)
Use data to test the fit of the normal distribution	Shapiro-Wilk test (Lecture 16)
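To make the contrast between the first two rows concrete, the sketch below computes a one-sample \(t\) statistic and a sign test with the Python standard library. The function names and data are hypothetical. The \(t\) function returns only the statistic and degrees of freedom (the P-value would come from a \(t\) distribution table or statistical software), and the sign test here drops values equal to the null median and doubles the smaller tail, which is one common convention.

```python
from math import comb, sqrt
from statistics import mean, stdev


def one_sample_t_statistic(data, mu0):
    """Return (t, df) for H0: population mean = mu0."""
    n = len(data)
    standard_error = stdev(data) / sqrt(n)  # sample SD divided by sqrt(n)
    return (mean(data) - mu0) / standard_error, n - 1


def sign_test(data, median0):
    """Two-sided sign test P-value for H0: population median = median0."""
    above = sum(1 for x in data if x > median0)
    below = sum(1 for x in data if x < median0)
    n = above + below  # values equal to median0 are dropped
    k = min(above, below)
    # Under H0 the count above the median is Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical measurements
data = [3.1, 2.4, 5.6, 4.8, 6.0, 5.2, 4.9, 7.1]
print(one_sample_t_statistic(data, 3.0))
print(sign_test(data, 3.0))
```

The sign test uses only whether each value is above or below the null median, which is why it needs no normality assumption but has less power than the \(t\)-test.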

Tests of association between two variables

Commonly used tests of association between two variables. Adapted from Whitlock & Schluter 3e. W.H. Freeman and Company.
		Type of Explanatory Variable
		Categorical	Numerical
Type of Response Variable	Categorical	Contingency analysis (Lecture 12)	Logistic regression (Lecture 23)
	Numerical	See next slide (Comparing group means)	Linear correlation (Lecture 21); Spearman’s rank correlation (when data are not bivariate normal) (Lecture 21); linear regression (Lecture 22); nonlinear regression (Lecture 23)
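For the categorical-by-categorical cell, the core calculation of a \(\chi^2\) contingency analysis can be sketched in a few lines. This is a minimal illustration with a hypothetical function name and made-up counts. It returns only the test statistic and degrees of freedom, and it does not check the usual requirement on expected frequencies.

```python
def contingency_chi2(table):
    """Return (chi-square statistic, df) for a table of counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables are independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical 2 x 2 table: rows are groups, columns are outcomes
print(contingency_chi2([[10, 20], [30, 40]]))
```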

Comparing group means

A comparison of methods to test differences between group means according to whether the tests assume normal distributions. Adapted from Whitlock & Schluter 3e. W.H. Freeman and Company.
Number of treatments	Tests assuming normal distribution	Fewer assumptions (do not require normality)
Two treatments (independent samples)	Pooled variance two-sample \(t\)-test (Lecture 15) (assumes equal variances; not commonly used); Welch’s two-sample \(t\)-test (Lecture 15) (preferred when variances are unequal; often the default in practice)	Wilcoxon rank-sum test (Mann-Whitney \(U\)-test) (Lecture 16)
Two treatments (paired samples)	Paired \(t\)-test (Lecture 15)	Sign test (Lecture 16)
More than two treatments	ANOVA (Lecture 20)	Kruskal-Wallis test (Lecture 20)
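To show why Welch’s test does not need the equal-variance assumption, here is a minimal sketch of its test statistic and approximate degrees of freedom (the Welch-Satterthwaite formula), using only the Python standard library. The function name and the two samples are hypothetical, and the P-value step is omitted because it requires a \(t\) distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test on independent samples a and b."""
    # Each group contributes its own variance / n, so variances need not be equal.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df


# Hypothetical samples with clearly different spread
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(welch_t(group_a, group_b))
```

The degrees of freedom come out smaller than the pooled-variance value of \(n_1 + n_2 - 2\) when the variances differ, which makes the test more conservative in exactly the situation where pooling would be misleading.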

Summary: choosing a test

Identify response variable type
Identify explanatory variable(s)
Determine study design (paired vs independent)
Choose method that matches data and question
Check assumptions; if they are not reasonable, choose a better-suited method

Further resources

This is what you need to know for this course, but it is not a comprehensive list of hypothesis tests
Some other resources:
- What statistical test should I do? Stats and R
- Quantitative Analysis Guide, NYU Libraries
- Statistical method selection tool, Statkat