Here you will learn the following:
How to generate a contingency table; and
how to interpret the results and related measures of association.
Contingency tables allow you to compare categorical data and find out whether there is an association between the variables. This is a very basic test that can help you understand your data better.
Dataset used in Examples
Skoczylis, Joshua, 2021, "Extremism, Life Experiences and the Internet", https://doi.org/10.7910/DVN/ICTI8T, Harvard Dataverse, Version 3.
Contingency Table: Independent Sample Chi-Square of Association
Independent Sample Chi-Square of Association Hypothesis
H0: There is no association between gender, ethnicity and extremism
Ha: There is an association between gender, ethnicity and extremism
Independent Sample Chi-Square of Association variables required
Note: You will need to create this variable using the Transform function. We transformed the Extremism_scoreScaled variable into a new categorical variable called Extremism_cat, which separates respondents who are Extreme (all values more than 2 standard deviations above the mean, i.e. above 0.807 + 2 x 1.4 = 3.607) from those who are Not Extreme (all values below 3.607).
You can use one or more of the following types of variables. The more layers you add, the more complicated your tables will be to interpret.
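The recoding rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the jamovi Transform itself: the mean and standard deviation come from the note above, the example scores are made up, and the handling of a score exactly at the cut-off is an assumption.

```python
# Illustrative sketch of the Extremism_cat recoding (not jamovi code).
MEAN = 0.807               # mean of Extremism_scoreScaled, from the note above
SD = 1.4                   # standard deviation, from the note above
THRESHOLD = MEAN + 2 * SD  # 3.607, up to floating-point rounding


def categorise(score: float) -> str:
    """Return 'Extreme' for scores above the threshold, else 'Not Extreme'.

    Assumption: a score exactly at the threshold counts as 'Not Extreme'.
    """
    return "Extreme" if score > THRESHOLD else "Not Extreme"


# Made-up scores, not values from the dataset.
example_scores = [0.2, 1.5, 3.9, 5.0]
example_categories = [categorise(s) for s in example_scores]
print(round(THRESHOLD, 3), example_categories)
```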
Independent Sample Chi-Square of Association: Step-by-Step Guide
Generate the contingency table
Before you run a contingency table, consider generating a Frequency table in the Explore section. These can give you a good idea of what might be going on in your data and help you decide whether a contingency table is necessary or not.
Once you have done this, navigate to Analyses > Frequencies > Contingency Tables - Independent Sample.
In the variables field you will see the following four options:
Rows: The levels of any variable you drag into this field will be displayed in the rows of the table
Columns: The levels of any variable you drag into this field will be displayed across the columns of the table
Count (optional): If you want to weight your data, drag the relevant variable into this box.
Layers: This box allows you to group your rows by one or more additional categorical variables. If you drag Gender here, it will break down your data into the following three groups: Male, Female and Total. Do not add too many layers; otherwise, the table will become very confusing.
Drag your variable into the relevant boxes.
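To make the roles of Rows, Columns and Layers concrete, here is a minimal Python sketch of the counting that sits behind such a table. It is not how jamovi is implemented; the records are made up, and the variable names are hypothetical.

```python
from collections import Counter

# Made-up records in the form (layer, row, column); not taken from the dataset.
records = [
    ("Male", "White", "Extreme"),
    ("Male", "White", "Not Extreme"),
    ("Male", "Asian/Asian British", "Not Extreme"),
    ("Female", "White", "Not Extreme"),
    ("Female", "White", "Not Extreme"),
    ("Female", "Asian/Asian British", "Extreme"),
]

# One table of cell counts per layer level, plus a "Total" table
# that ignores the layer variable altogether.
tables = {}
for layer, row, col in records:
    for key in (layer, "Total"):
        tables.setdefault(key, Counter())[(row, col)] += 1

for key, counts in tables.items():
    print(key, dict(counts))
```

Each extra layer variable multiplies the number of tables you have to read, which is why adding many layers quickly becomes hard to interpret.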
Select your Statistics & Modify the table
Now you can select your Statistics.
Usually, you should continue with the default, which is Chi-Square (χ²). If you have a small sample, you might consider Fisher's Exact Test instead.
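If you want to see what the Chi-Square statistic measures, the sketch below computes it by hand for a made-up 2x2 table, without a continuity correction. It also flags expected counts below 5, a common rule of thumb (not something taken from this guide) for when Fisher's Exact Test may be the safer choice. The function name is hypothetical, and the p-value is left to your statistics software.

```python
def chi_square(observed):
    """Return (statistic, degrees of freedom, expected counts) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total x column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


# Made-up counts, not values from the dataset.
observed = [[10, 20], [20, 10]]
stat, df, expected = chi_square(observed)

# Rule-of-thumb check for small expected counts.
small_cells = any(e < 5 for row in expected for e in row)
print(round(stat, 3), df, small_cells)
```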
If you have a 2x2 table, you may consider adding the odds ratio. Here is a short introduction on how to interpret them. You should also consider selecting Confidence intervals.
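As a rough illustration of what the odds ratio and its confidence interval represent, the following sketch computes both for a made-up 2x2 table. The interval uses the standard log-based (Woolf) approximation, which is an assumption on our part and may differ from the method your software uses; the function name is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for the 2x2 table [[a, b], [c, d]].

    Uses the log (Woolf) method, so all four counts must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Made-up counts, not values from the dataset.
odds_ratio, ci_low, ci_high = odds_ratio_ci(10, 20, 20, 10)
print(round(odds_ratio, 3), round(ci_low, 3), round(ci_high, 3))
```

An odds ratio of 1 means the odds are the same in both groups, so an interval that excludes 1 points to a difference between them.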
Here you can also specify the type of hypothesis you are testing.
In the final two sections, you will need to select your measures of association. Beware: a significant p-value only tells you that a relationship exists; it does not tell you anything about its strength.
If your data is nominal (or a mix of nominal and ordinal), you can select either of the options; the most common choice is Phi and Cramer's V. Note that you will only get a result for Phi if your table is a 2x2 table. All other tables will return a Cramer's V score.
Phi and Cramer's V are interpreted as follows:
>.5 high association
.3 to .5 moderate association
.1 to .3 low association
0 to .1 little if any association
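The sketch below shows how Cramer's V is usually derived from the Chi-Square statistic, and applies the bands listed above. The function names are hypothetical, and because the bands share their boundary values (.1, .3 and .5), assigning each boundary to the higher band is an assumption.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi2 / (n * min(rows - 1, columns - 1)))."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


def interpret(v):
    """Map a Phi or Cramer's V value onto the bands listed above."""
    if v > 0.5:
        return "high association"
    if v >= 0.3:
        return "moderate association"
    if v >= 0.1:
        return "low association"
    return "little if any association"


# Made-up example: chi-square of 20/3 from a 2x2 table with 60 cases.
v = cramers_v(20 / 3, 60, 2, 2)
print(round(v, 3), interpret(v))
```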
If both your variables are ordinal, select Kendall's Tau b. Kendall's Tau b is scored on a scale of -1 to +1, with negative scores suggesting a negative association and positive scores suggesting a positive association. A score of 0 indicates no association, and the closer the score is to 0, the weaker the association.
Finally, you can modify your table. You should usually display percentages rather than counts, as this gives the reader a better idea of how a variable is distributed. If you need or want to display your counts as well, you can include them by selecting Observed/Expected Counts.
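For readers who want to see where the -1 to +1 scale comes from, here is a minimal sketch of Kendall's Tau b computed from pairs of observations. It assumes the ordinal levels have been coded as numbers, the data are made up, and the function name is hypothetical.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's Tau b for two equally long sequences of ordinal codes."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue            # tied on both variables: counted nowhere
        elif dx == 0:
            ties_x += 1         # tied on x only
        elif dy == 0:
            ties_y += 1         # tied on y only
        elif dx * dy > 0:
            concordant += 1     # both variables move in the same direction
        else:
            discordant += 1     # the variables move in opposite directions
    denom = math.sqrt(
        (concordant + discordant + ties_x) * (concordant + discordant + ties_y)
    )
    if denom == 0:
        return float("nan")     # undefined when one variable never varies
    return (concordant - discordant) / denom


# Made-up ordinal codes, not values from the dataset.
tau = kendall_tau_b([1, 1, 2, 3], [1, 2, 2, 3])
print(tau)
```

Pairs that move in the same direction push the score towards +1, pairs that move in opposite directions push it towards -1, and ties shrink the denominator rather than counting for either side.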
Row: Each row adds up to 100%
Column: Each Column adds up to 100% (note if you use layers it will show you the column percentages for each group)
Total: You will see the total percentages for your rows and columns.
Be careful when you interpret the percentages, and be aware of which option you have selected.
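The difference between the three options is easiest to see side by side. The sketch below computes row, column and total percentages from the same made-up table of counts; the function name is hypothetical.

```python
def percentages(counts):
    """Return (row %, column %, total %) tables for a table of counts."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)
    # Row percentages: each row adds up to 100.
    row_pct = [[100 * v / rt for v in row] for row, rt in zip(counts, row_totals)]
    # Column percentages: each column adds up to 100.
    col_pct = [[100 * v / ct for v, ct in zip(row, col_totals)] for row in counts]
    # Total percentages: all cells together add up to 100.
    total_pct = [[100 * v / n for v in row] for row in counts]
    return row_pct, col_pct, total_pct


# Made-up counts, not values from the dataset.
counts = [[10, 30], [20, 40]]
row_pct, col_pct, total_pct = percentages(counts)
print(row_pct)
print(col_pct)
print(total_pct)
```

The same cell (10 cases here) is 25% of its row, about 33% of its column and 10% of the whole table, so always state which percentage you are reporting.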
You can generate Plots easily from the Plot section - but these are not great. If you want to visualise your contingency tables use either Survey Plot or JJStatsPlot.
Independent Sample Chi-Square of Association Results: Reject/Accept Hypothesis
Results: Reject H0 - p-value & Measures of Association
Reject the H0 and accept the Ha (p = 0.007 for the total)
The small table below provides the Chi-Square output. Based on these statistics, there is a statistically significant relationship between Gender, Ethnicity and Extremism.
Note: The p-value only tells you that the relationship is statistically significant. It does not tell you about the strength of the association.
As we have used Gender as a Layer, we can also break down the results. The table below clearly indicates that the relationship between Ethnicity and Extremism is not significant for Females (p = 0.992), but it is significant for Males (p = 0.008).
Measures of Association:
We now know that there is a significant relationship between Gender, Ethnicity and Extremism overall, and between Ethnicity and Extremism for Males.
The table below provides us with some idea of how strong the relationship is. As the result for Females is not significant, we can ignore that relationship.
In this case, the relationship between Gender, Ethnicity and Extremism is very weak (Cramer's V for all groups = 0.078). For Males, the relationship is also very weak (Cramer's V = 0.095).
For both, we can therefore conclude that although there is a statistically significant relationship, its strength is almost negligible.
While you can combine Row, Column and Total Percentages, your table will look messy and will be more difficult to interpret. We suggest you generate separate tables if needed. The outcomes for all three are below.
Results: Contingency Table - Row Percentages
We can now explore the outcomes in more detail. Below is a contingency table showing Row Percentages, where the values in each row add up to 100%. Here you are comparing how many people are extreme within one ethnic group.
The table below tells us that 9.0% of the White males had views that are deemed extreme and 91.0% did not have extreme views.
Of the individuals with a Black/African/Caribbean/Black British background, none had views that were extreme.
Results: Contingency Table - Column Percentages
With column percentages, each column adds up to 100%. In this case, you are comparing the percentage of those who are extreme between ethnicities.
Among the males who had extreme views, 95% were White, compared with 2.9% who were Asian/Asian British.
Results: Contingency Table - Total Percentages
Total Percentages tell you the percentage of each cell for all the relevant rows and columns.
The table below tells us that out of all the Males, 0.3% were Asian/Asian British and Extreme, compared to 8.7% who were White and Extreme.