Understanding the Two Sample Z-Test
The two-sample z-test is a statistical test used to determine whether the means of two independent groups differ significantly when the population standard deviations are known. Unlike the t-test, which estimates the standard deviations from the sample data, the z-test takes the population variances as given. This makes it suitable when the population characteristics are known in advance, or when the samples are large enough that the sample standard deviations closely approximate the population values.
When to Use the Two Sample Z-Test
The two-sample z-test is appropriate when the two groups are independent, the population variances are known, and each population is approximately normally distributed. With large samples the normality requirement can be relaxed, because the central limit theorem makes the sampling distribution of each mean approximately normal, which justifies using the z-distribution.
Real-Life Example: Comparing Two Customer Satisfaction Scores
Imagine a company wants to compare customer satisfaction scores between two of its stores to see if there's a difference:
- Group 1: Customers from Store A
- Group 2: Customers from Store B
The collected data shows:
- Mean score for Store A = 85, Population Standard Deviation = 10, Sample Size = 100
- Mean score for Store B = 88, Population Standard Deviation = 12, Sample Size = 120
Using the Two Sample Z-Test, the company can determine if there's a significant difference in satisfaction between the stores. Here are the calculation steps:
Step-by-Step Calculation
1. Calculate the z-score:
$$ z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} $$
where:
- $ \bar{X}_1 $ and $ \bar{X}_2 $ are the sample means of the two groups,
- $ \sigma_1 $ and $ \sigma_2 $ are the population standard deviations of the two groups,
- $ n_1 $ and $ n_2 $ are the sample sizes of the two groups.
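Substituting the values from the example into the formula above:
$$ z = \frac{85 - 88}{\sqrt{\frac{10^2}{100} + \frac{12^2}{120}}} = \frac{-3}{\sqrt{1 + 1.2}} = \frac{-3}{\sqrt{2.2}} \approx -2.023 $$
The z-score is approximately -2.023. The negative sign reflects that Store A's mean is lower than Store B's.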
2. Determine the p-value:
Since we are conducting a two-tailed test, we calculate the p-value by finding the probability in both tails of the distribution. The p-value associated with a z-score of -2.023 is approximately 0.0431.
Since the p-value (0.0431) is less than the significance level of 0.05, we reject the null hypothesis. This indicates a statistically significant difference between the average customer satisfaction scores of the two stores.
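The worked example above can be reproduced in a few lines of Python. The following is a minimal sketch under stated assumptions, not the calculator's own implementation: the function name `two_sample_z` is hypothetical, and the normal CDF comes from `NormalDist` in Python's standard `statistics` module so that no third-party package is needed.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z-score for two independent groups with known population SDs.

    Hypothetical helper for illustration only.
    """
    # Standard error of the difference between the two sample means
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


# Store A: mean 85, SD 10, n 100; Store B: mean 88, SD 12, n 120
z = two_sample_z(85, 10, 100, 88, 12, 120)

# Two-tailed p-value: probability mass in both tails beyond |z|
p_two_tailed = 2 * NormalDist().cdf(-abs(z))

print(round(z, 3), round(p_two_tailed, 4))
```

Running it prints `-2.023 0.0431`, matching the values in the worked example.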
Hypothesis Testing
This test can be conducted as a right-tailed, left-tailed, or two-tailed test. The hypotheses are statements about the population means \( \mu_1 \) and \( \mu_2 \) of the two groups, not about the sample means:
- Right-tailed: Tests if Group 1's mean is significantly greater than Group 2's, with hypotheses \( H_0: \mu_1 \leq \mu_2 \) vs. \( H_1: \mu_1 > \mu_2 \). The p-value is calculated as \( P(Z > z) \), where \( Z \) follows the standard normal distribution and \( z \) is the observed z-score.
- Left-tailed: Tests if Group 1's mean is significantly less than Group 2's, with hypotheses \( H_0: \mu_1 \geq \mu_2 \) vs. \( H_1: \mu_1 < \mu_2 \). The p-value is calculated as \( P(Z < z) \).
- Two-tailed: Tests if there is any significant difference between the two means, with hypotheses \( H_0: \mu_1 = \mu_2 \) vs. \( H_1: \mu_1 \neq \mu_2 \). The p-value is calculated as \( 2 \times P(Z > |z|) \).
If the p-value is less than the chosen significance level (e.g., 0.05), we reject the null hypothesis in favor of the alternative hypothesis.
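The three p-value rules above can be sketched as a single function. This is an illustrative sketch only: the name `p_value_from_z` and the `tail` labels are assumptions chosen to mirror the wording of this section, and the standard normal CDF is taken from Python's standard `statistics` module.

```python
from statistics import NormalDist


def p_value_from_z(z, tail="two-tailed"):
    """p-value for an observed z-score (hypothetical helper).

    tail is one of "right-tailed", "left-tailed", "two-tailed".
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right-tailed":
        return 1 - std_normal.cdf(z)          # P(Z > z)
    if tail == "left-tailed":
        return std_normal.cdf(z)              # P(Z < z)
    if tail == "two-tailed":
        return 2 * std_normal.cdf(-abs(z))    # 2 * P(Z > |z|)
    raise ValueError(f"unknown tail: {tail!r}")


alpha = 0.05  # chosen significance level
p = p_value_from_z(-2.023, "two-tailed")
reject_null = p < alpha
```

For a z-score of -2.023 this gives roughly 0.043 (two-tailed), 0.022 (left-tailed), and 0.978 (right-tailed), so at a significance level of 0.05 the null hypothesis is rejected in the two-tailed and left-tailed cases but not in the right-tailed one.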
How to Obtain the P-Value from the Z-Score
Once the z-score is calculated, we need the corresponding p-value to determine the significance of our result. One common way to obtain it is:
- Using Statistical Libraries: Many libraries provide functions to calculate the p-value directly from the z-score. For example, in Python, you can use:
from scipy.stats import norm
z_score = -2.023  # z-score from the store example above
p_value = 2 * norm.sf(abs(z_score))  # two-tailed; norm.sf(x) equals 1 - norm.cdf(x)
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.