McNemar’s Test: This is a non-parametric test for paired nominal data. It is used to check whether the proportion of a dichotomous outcome changes between two paired measurements, such as the same people surveyed before and after a treatment. It is also known as McNemar’s Chi-Square test because, under the null hypothesis, the test statistic approximately follows a chi-square distribution.
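For reference, the test statistic is usually written in terms of the two discordant counts of the paired 2×2 table. The symbols b and c are notation introduced here for illustration: b is the number of pairs that changed in one direction and c the number that changed in the other. This is a sketch of the standard textbook form, not a quotation from any particular library.

```latex
\chi^2 = \frac{(b - c)^2}{b + c}
\qquad \text{and, with the continuity correction,} \qquad
\chi^2_{\text{corr}} = \frac{\left(|b - c| - 1\right)^2}{b + c}
```

Under the null hypothesis of no change, both forms are compared against a chi-square distribution with 1 degree of freedom.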
Assumptions for the McNemar Test:
Below are the main assumptions for the test:
- There must be one nominal dependent variable with two categories (a dichotomous variable) and one independent variable with two related groups, for example the same participants measured before and after.
- The two categories of the dependent variable must be mutually exclusive. In simple words, a participant cannot fall into more than one category at the same measurement.
- The sample must be a random sample.
Conduct McNemar’s Test in Python:
Suppose researchers want to know whether an advertisement video for a certain product can alter people’s opinion of the product. A survey of 50 people was conducted to find out whether they wanted to purchase the product. The advertisement video was then shown to all 50 people, and the survey was repeated after they had watched it. The data is shown in the table below.
| | Support | Do Not Support |
|---|---|---|
| Before advertisement video | 30 | 20 |
| After advertisement video | 10 | 40 |
To find out whether there is a statistically significant difference in the proportion of people who want to purchase the product before and after watching the video, we can conduct McNemar’s test. The steps are explained below.
Step 1: Create the data.
Python3
    # Create a dataset
    data = [[30, 20], [10, 40]]
Step 2: Conduct McNemar’s test.
Now let us conduct McNemar’s test. The statsmodels library provides the mcnemar() function, whose syntax is given below.
Syntax:
mcnemar(table, exact=True, correction=True)
Parameters:
- table: It represents the square contingency table
- exact = True: The exact binomial distribution is used to compute the p-value.
- exact = False: The chi-square distribution is used.
- correction = True: A continuity correction is applied. It only takes effect when exact = False. As a rule of thumb, the correction is used when any cell count in the table is less than 5.
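To make the effect of the correction parameter concrete, here is a minimal standard-library sketch of the chi-square variant (exact = False). The helper names `chi2_sf_1df` and `mcnemar_chi2` are hypothetical and are not part of statsmodels. The sketch assumes the off-diagonal cells of the table are the discordant counts and that the classic "subtract 1" continuity correction is the one meant.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df."""
    # For 1 degree of freedom the tail reduces to the complementary error function.
    return math.erfc(math.sqrt(x / 2.0))


def mcnemar_chi2(table, correction=True):
    """Hypothetical helper: chi-square form of McNemar's test on a 2x2 table."""
    b = table[0][1]  # off-diagonal cell (row 0, column 1)
    c = table[1][0]  # off-diagonal cell (row 1, column 0)
    n = b + c
    if n == 0:
        raise ValueError("no discordant pairs; the statistic is undefined")
    diff = abs(b - c)
    if correction:
        diff -= 1  # continuity correction: shrink the difference by 1
    statistic = diff ** 2 / n
    return statistic, chi2_sf_1df(statistic)


data = [[30, 20], [10, 40]]
stat_corr, p_corr = mcnemar_chi2(data, correction=True)
stat_raw, p_raw = mcnemar_chi2(data, correction=False)
print(f"with correction:    statistic={stat_corr:.3f}, p-value={p_corr:.4f}")
print(f"without correction: statistic={stat_raw:.3f}, p-value={p_raw:.4f}")
```

The correction always makes the statistic smaller, so the corrected p-value is the more conservative of the two.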
Example:
Python3
    # Import library
    from statsmodels.stats.contingency_tables import mcnemar

    # Create a dataset
    data = [[30, 20], [10, 40]]

    # McNemar's Test with the continuity correction (correction=True is the default)
    print(mcnemar(data, exact=False))

    # McNemar's Test without the continuity correction
    print(mcnemar(data, exact=False, correction=False))
Output:
With the continuity correction the test statistic is 2.7 (p ≈ 0.100), and without it the statistic is about 3.33 (p ≈ 0.068). In both cases the p-value is not less than 0.05, so we cannot reject the null hypothesis. We conclude that the proportion of people who supported the product before and after watching the marketing video was not significantly different.
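The example above only runs the chi-square variants. As a cross-check, the exact binomial variant (what exact = True is described as using) can be sketched with the standard library alone. The function name `mcnemar_exact_pvalue` is hypothetical. The sketch assumes a two-sided test in which the smaller off-diagonal count is compared against a Binomial(b + c, 0.5) distribution.

```python
from math import comb


def mcnemar_exact_pvalue(table):
    """Hypothetical helper: two-sided exact binomial p-value for a 2x2 table."""
    b = table[0][1]  # off-diagonal cell (row 0, column 1)
    c = table[1][0]  # off-diagonal cell (row 1, column 0)
    n = b + c
    k = min(b, c)
    # P(X <= k) for X ~ Binomial(n, 0.5), computed with exact integer arithmetic
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the tail for a two-sided test and cap the result at 1
    return min(1.0, 2 * lower_tail)


data = [[30, 20], [10, 40]]
p_exact = mcnemar_exact_pvalue(data)
print(f"exact p-value = {p_exact:.4f}")
print("reject H0 at the 0.05 level:", p_exact < 0.05)
```

For this table the exact p-value also stays above 0.05, which agrees with the conclusion drawn from the chi-square variants.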