Overview & Learning Outcomes
General Reading Materials
EASY - Rowntree, D (2018) Statistics without Tears: An introduction for non-mathematicians. Penguin: London. Read: Chapter 3 & 4 (pp. 33-69)
EASY - Cole, D (2019) Statistical Testing with Jamovi and JASP Open Source Software: Criminology. Vor Press: Norwich. Read: Chapter 3 (pp. 17-23)
MODERATE - Navarro, D & Foxcroft, D (2019) Learning Statistics with Jamovi: A tutorial for Psychology Students and other Beginners. Version 0.70. Available online.
Concepts: Parametric vs non-parametric tests
Before we move on to hypothesis testing and p-values, let's quickly go over the difference between parametric and non-parametric tests. Don't worry about the names of the tests for now, you will be introduced to them and shown how to use them at the appropriate time.
Remember the normal distribution? Well, a lot of tests make certain assumptions about our data. Parametric tests assume that your data is approximately normally distributed. Parametric tests include the t-test, ANOVA and Pearson's correlation.
Most statistics programmes should allow you to check whether your data meets these assumptions. If your data is not approximately normally distributed, then you should be using non-parametric or robust tests.
If you have ordinal or nominal data you will have to use non-parametric tests such as:
Mann Whitney U tests - non-parametric equivalent to a t-test
Wilcoxon signed-rank test - non-parametric equivalent to a paired t-test (for a one-way ANOVA, the non-parametric equivalent is the Kruskal-Wallis test)
Spearman's or Kendall's Tau-B correlation - non-parametric equivalent to Pearson's correlation
In short, if your data is approximately normally distributed use a parametric test. If your data is not approximately normally distributed use non-parametric tests.
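If you are curious what this decision looks like in code, here is a minimal sketch in Python with scipy (this module uses Jamovi/JASP, so this is purely illustrative, and the data are made up): check normality with a Shapiro-Wilk test, then pick the parametric or non-parametric version of the test.

```python
# Illustrative sketch only - invented data, not part of the module software.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(loc=50, scale=10, size=40)   # made-up scores
group_b = rng.normal(loc=55, scale=10, size=40)

# Shapiro-Wilk tests the null hypothesis that the data are normally distributed.
normal_a = stats.shapiro(group_a).pvalue > 0.05
normal_b = stats.shapiro(group_b).pvalue > 0.05

if normal_a and normal_b:
    result = stats.ttest_ind(group_a, group_b)     # parametric
else:
    result = stats.mannwhitneyu(group_a, group_b)  # non-parametric
print(round(result.pvalue, 4))
```

Jamovi and JASP run the same normality checks for you behind a tick-box; the logic is the same.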
Concepts: Hypothesis testing and p-values
Descriptive statistics tell you about your variables, but what we are really interested in is what the data tells us about the world around us. Hypotheses help us make inferences from our data about the world around us.
On a basic level, a hypothesis is essentially a statement that makes a claim, e.g. All cars are red. For hypotheses to be useful, we must be able to test them. In statistics, we test these statements by formulating what we call a Null Hypothesis (H0) and an Alternative Hypothesis (Ha).
The Null Hypothesis is a statement suggesting that there is no statistically significant relationship between the variables being tested. The best way of testing a hypothesis is to try to disprove it. So rather than finding more red cars to prove our hypothesis, we try to disprove it by finding a car that is a different colour. The Alternative Hypothesis, therefore, is the counterargument. This will help us to either accept or reject the Null Hypothesis. Here are some examples:
H0: All cars are red.
Ha: Not all cars are red.
H0: There is no difference in income between men and women.
Ha: There is a difference in income between men and women.
You can also have directional hypotheses. These are statements where the alternative hypothesis suggests a negative or positive relationship between your variables. Let's reword the hypothesis about gender and income above into a directional hypothesis:
H0: There is no difference in income between men and women.
Ha: Women have a lower income than men
Notice that the Null hypothesis remains the same - there is no difference.
Once you have formulated your hypothesis you will select an appropriate statistical test to see whether to reject or accept the Null Hypothesis. You will do this using p-values and confidence intervals.
Note: Your hypothesis test and the resultant p-value only tell you whether or not there is a relationship and whether this is statistically significant. They do not tell you anything about the strength of the relationship.
Inferential tests always provide you with p-values. The p-value is a number that describes how likely you are to get a particular observation if the null hypothesis is true. P-values will help you decide whether you should reject or accept your null hypothesis. The smaller the p-value, the more likely we are to reject the null hypothesis.
A p-value is a number between 0-1.
0 indicates that there is no probability of an event occurring.
1 indicates that the event will occur.
The further your p-value shifts towards 0, the less likely it is that your result occurred by chance. In the Social Sciences, we look for a p-value of 0.05 - or a 5% chance that the result occurred by chance (to convert your p-value into a percentage, just multiply it by 100). In other words, we want to be 95% sure that the result did not occur by chance. To some extent, the 0.05 value is arbitrary. You can set this level to whatever you want; it is about setting it at a level that you are happy with. So if you are happy with a 10% chance of being wrong, you can set it at 0.1. However, as mentioned, in the Social Sciences we use the 0.05 level or lower.
Remember the normal distribution? The p-value essentially tells you where on the distribution your result falls. The further your result is away from the centre the less likely it is to occur. See image below.
Let's look at an example.
H0: There is no difference in extremism scores between men and women.
Ha: There is a difference in extremism scores between men and women.
Assume that we get a p-value of 0.01. We therefore conclude that there is a difference between the extremism scores of men and women. We can be 99% certain that the difference did not occur by chance. We therefore reject H0 and accept Ha.
As a general rule if we get a p-value below 0.05 we reject the Null Hypothesis in favour of the Alternative Hypothesis.
If the p-value is above 0.05 we accept the Null Hypothesis and reject the Alternative Hypothesis.
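The decision rule above can be sketched in a few lines of Python with scipy (again purely illustrative - the extremism scores below are invented, and in this module you would read the p-value straight from Jamovi/JASP output):

```python
# Hedged sketch of the 0.05 decision rule - invented data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
men = rng.normal(30, 5, 100)     # hypothetical extremism scores
women = rng.normal(33, 5, 100)

p = stats.ttest_ind(men, women).pvalue
decision = "reject H0" if p < 0.05 else "accept H0"
print(round(p, 4), decision)
```

With this made-up data the group means differ clearly, so the test returns a small p-value and we reject the Null Hypothesis.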
The video below is a detailed introduction to p-values. We recommend that you watch this. For a brief and simplified overview watch this video.
You will usually report the p-values in your results section. State your p-value in the text and state whether you are accepting/rejecting your hypothesis. Report your p-value alongside other values such as correlations, confidence intervals etc.
To help you understand the concepts of probability even better, watch Part 1 and Part 2 (Crash Course in Statistics with Adrianne Hill) on probability. For a short and more simplified version watch this clip.
Standard Error & Confidence Intervals
The Standard Error (SE):
The Standard Error is used to estimate the accuracy and consistency of your sample data. In other words, it tells you how 'close' your sample mean is likely to be to the real population mean.
If available, report this value alongside the p-value and other results of your test.
What is the Standard Error? Let's assume that you sample your population numerous times. Each time you would get slightly different results - which is what we would expect. For each sample, the mean and the standard deviation would vary slightly. Now imagine collecting the mean of each of these samples. The Standard Error is the standard deviation of these sample means.
A small SE indicates that your sample mean is a more accurate reflection of the actual population mean. Smaller sample sizes will usually result in a larger SE.
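In practice the SE of the mean is estimated from a single sample as the standard deviation divided by the square root of the sample size. The sketch below (Python/numpy, invented data) shows the point made above: a larger sample gives a smaller SE.

```python
# Sketch: SE = sd / sqrt(n); a larger sample gives a smaller SE. Invented data.
import numpy as np

rng = np.random.default_rng(1)
population = rng.normal(100, 15, 100_000)   # made-up population

small = rng.choice(population, size=25)
large = rng.choice(population, size=2500)

se_small = small.std(ddof=1) / np.sqrt(small.size)
se_large = large.std(ddof=1) / np.sqrt(large.size)
print(round(se_small, 2), round(se_large, 2))
```

Statistics programmes report this value for you; you rarely need to calculate it by hand.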
Confidence Intervals (CI):
Here is a slightly simpler video on how confidence intervals work.
Confidence intervals provide additional information about statistical significance, as well as the direction and strength of the effect. If you can get them, it is always worth reporting CIs. Remember that we are working with sample data, so we would expect it to vary slightly from the population data. Confidence intervals, like the SE, help us deal with this variation. A confidence interval is the mean of your estimate plus/minus a margin of variation. If your test were run again and again, you would expect the results to fall within the confidence interval range.
Usually, you would use 95% confidence intervals, although you can use other levels, e.g. 99%. Let's look at an example.
In the above example Party A is polling at 45% and Party B at 35%. The 95% confidence interval suggests that there is a 5% leeway either way. This means you would expect Party A to get somewhere between 40 and 50% in an election, while Party B would receive somewhere between 30 and 40% of the vote. A p-value might indicate that there is a significant difference between votes for each party. However, confidence intervals add rich additional information. As you can see from the image above, there is still some overlap between the potential outcomes.
Wherever possible, also report the confidence intervals. Sometimes, even if you get a significant p-value, where the confidence intervals overlap substantially you may still want to be cautious about rejecting the Null Hypothesis.
Select your tests
One sample proportion tests:
Variables used: One categorical variable
These tests allow you to compare one ordinal/nominal variable with the known proportions in the real population. For example, the census provides us with detailed information about the distribution of ethnicity. ONS figures, for example, indicate that 13% of the UK population are non-white. If we have a variable measuring ethnicity, we can compare the proportion in our sample to that figure. The p-value would then give us an indication of whether or not our sample is representative of the UK population in terms of ethnicity.
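The comparison above can be sketched with scipy's exact binomial test (the counts below are invented for illustration; in this module you would run the equivalent test in Jamovi/JASP):

```python
# Hedged sketch of a one-sample proportion test: is our sample's
# non-white proportion consistent with the ONS figure of 13%? Invented counts.
from scipy.stats import binomtest

non_white = 19   # made-up count in a made-up sample
n = 200
result = binomtest(non_white, n, p=0.13)
print(round(result.pvalue, 3))
```

A large p-value here would suggest our sample proportion is consistent with the population figure; a small one would suggest it is not.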
Contingency tables:
Variables used: two or more categorical variables.
Contingency tables present two or more categorical variables in a tabular form (avoid using more than three variables as the table will become very messy). This table usually shows you the frequencies of particular combinations of values. Each cell in the table represents a mutually exclusive combination of two variables.
Whenever you generate a contingency table you can also get the chi-square statistic - this provides you with a p-value and tells you whether a relationship is significant or not.
Rather than presenting these tables in counts, you should use either row or column percentages.
Row percentages: All the values in your row will add up to 100%. Have a look at the below example.
What you should say: Of those who selected Strongly Agree, 83.3% were male compared to 16.7% female.
What you should not say: 83.3% of males and 16.7% of females strongly agree that violence is effective in gaining respect.
Column percentages: All the values in the column will add up to 100%. In the table below you can see what percentage of females/males selected Strongly Agree to Strongly Disagree. In this example you see the results broken down by the Violence _effective_respect variable.
What you should say: Only 0.1% of the females strongly agreed that violence is an effective way of gaining respect compared to 0.3% of all the males.
What you should not say: Out of those who Strongly agree with this statement 0.3% are Male and 0.1% females.
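The difference between row and column percentages is easy to see in code. Here is a small sketch with pandas and scipy (the gender/agreement data below are invented; Jamovi/JASP produce the same table through the Frequencies/Contingency Tables menu):

```python
# Sketch: contingency table with column percentages and a chi-square test.
# The data are invented for illustration.
import pandas as pd
from scipy.stats import chi2_contingency

df = pd.DataFrame({
    "gender": ["Male"] * 6 + ["Female"] * 6,
    "agree":  ["Agree", "Agree", "Agree", "Agree", "Disagree", "Disagree",
               "Agree", "Disagree", "Disagree", "Disagree", "Disagree", "Agree"],
})

counts = pd.crosstab(df["agree"], df["gender"])
col_pct = pd.crosstab(df["agree"], df["gender"], normalize="columns") * 100

chi2, p, dof, expected = chi2_contingency(counts)
print(col_pct.round(1))
print(round(p, 3))
```

Swapping `normalize="columns"` for `normalize="index"` gives row percentages instead.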
Correlation matrix:
Variables used: two or more continuous/ordinal variables
A correlation matrix shows you the correlation coefficient between two continuous/ordinal variables and whether the correlation is statistically significant or not (p-value). This table summarises your data and provides insight into whether or not correlations exist and whether they are significant.
In the image below you can see, for example, that the correlation between Self Confidence and Social Media use is significant (p = 0.039).
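Here is a hedged sketch of the same idea in Python (the Self Confidence/Social Media data below are invented, with a negative relationship built in; the variable names are illustrative only). pandas builds the matrix of coefficients, while scipy supplies the p-value for a given pair:

```python
# Sketch: correlation matrix (pandas) plus a p-value for one pair (scipy).
# Invented data with a built-in negative relationship.
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
confidence = rng.normal(50, 10, 80)
social_media = confidence * -0.4 + rng.normal(0, 10, 80)

df = pd.DataFrame({"self_confidence": confidence,
                   "social_media_use": social_media})

print(df.corr().round(2))   # Pearson by default; method="spearman" for ordinal data
r, p = pearsonr(df["self_confidence"], df["social_media_use"])
print(round(r, 2), round(p, 3))
```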
Measures of Association & Correlation Coefficients
A correlation coefficient is a measure that shows you how strongly two variables are correlated. Below is a list of the ones you will use most often.
Pearson's correlation coefficient - use this if your data is approximately normally distributed (parametric).
Spearman's rank-order correlation - use this if your data is ordinal and/or not approximately normally distributed (non-parametric).
Kendall's Tau-B - use this if your data is ordinal and/or not approximately normally distributed. Kendall's Tau is slightly more reliable than Spearman's, but they both serve the same purpose (non-parametric).
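All three coefficients are available in scipy, as the sketch below shows (invented data with a built-in positive relationship; purely illustrative, since Jamovi/JASP offer the same three options as tick-boxes in the Correlation Matrix analysis):

```python
# Sketch: the three correlation coefficients side by side. Invented data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(0, 1, 50)
y = x + rng.normal(0, 1, 50)    # positively related by construction

r_p, p_p = stats.pearsonr(x, y)     # parametric
r_s, p_s = stats.spearmanr(x, y)    # non-parametric, rank-based
r_k, p_k = stats.kendalltau(x, y)   # non-parametric, rank-based
print(round(r_p, 2), round(r_s, 2), round(r_k, 2))
```

Note that Kendall's Tau typically comes out smaller than Spearman's for the same data; that does not mean the relationship is weaker.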
The video below provides you with a slightly more in-depth understanding of how correlation coefficients work.
For nominal data we use Phi and Cramer's V. These are measures of association. Phi is reported if you have a 2x2 table; otherwise stats programmes will give you a value for Cramer's V. Here you will receive a number between 0-1. You interpret it as follows:
0: no correlation
> 0.25: Very strong
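Cramer's V is derived from the chi-square statistic: V = sqrt(chi2 / (n x (min(rows, cols) - 1))). Here is a small sketch (invented 2x2 table, so the value equals Phi; stats programmes report this for you alongside the chi-square test):

```python
# Sketch: Cramer's V from the chi-square statistic. Invented 2x2 table,
# so this value also equals Phi.
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[30, 10],
                  [10, 30]])

chi2, p, dof, expected = chi2_contingency(table, correction=False)
n = table.sum()
v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
print(round(v, 2))  # → 0.5
```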