Understanding Pooled Variance
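The pooled variance is an estimate of the common variance shared by two populations, computed from two independent samples that may differ in size. It is a weighted average of the two sample variances, with each variance weighted by its degrees of freedom (\( n - 1 \)), and it assumes that the two populations have equal variances. It is commonly used in statistical tests such as the independent two-sample t-test.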
Key Components of Pooled Variance
- Sample Variances (\( s_1^2 \) and \( s_2^2 \)): The variances of the two independent samples.
- Sample Sizes (\( n_1 \) and \( n_2 \)): The number of observations in each sample.
- Degrees of Freedom: The combined degrees of freedom for both samples, calculated as \( n_1 + n_2 - 2 \).
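The components above are all that is needed to pool two variances: each sample variance is multiplied by its own degrees of freedom, and the sum is divided by the combined degrees of freedom. The following is a minimal Python sketch of that weighting, working from summary statistics rather than raw data. The function name `pooled_variance` and the example variances and sample sizes are illustrative assumptions, not values from this article.

```python
def pooled_variance(var1: float, n1: int, var2: float, n2: int) -> float:
    """Pool two sample variances, weighting each by its degrees of freedom (n - 1)."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * var1 + df2 * var2) / (df1 + df2)


# Hypothetical summary statistics, chosen only to illustrate the weighting.
# Equal sample sizes: the result is the plain average of the two variances.
equal_sizes = pooled_variance(4.0, 10, 8.0, 10)

# Unequal sample sizes: the larger sample pulls the result toward its variance.
unequal_sizes = pooled_variance(4.0, 30, 8.0, 5)

print(f"Equal sizes:   {equal_sizes:.4f}")    # 6.0000
print(f"Unequal sizes: {unequal_sizes:.4f}")  # 4.4848
```

With equal sample sizes the pooled value sits exactly halfway between the two variances. With sizes of 30 and 5 it moves toward 4.0, the variance of the larger sample.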
Formula for Pooled Variance
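The formula for pooled variance is:
\[ s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \]
where:
- \( s_p^2 \): Pooled variance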
- \( s_1^2 \): Variance of sample 1
- \( s_2^2 \): Variance of sample 2
- \( n_1 \): Sample size of sample 1
- \( n_2 \): Sample size of sample 2
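As a worked example, take the two five-value samples used in the code examples in this article, (10, 15, 14, 18, 20) and (12, 17, 16, 19, 21). Their sample variances work out to \( s_1^2 = 14.8 \) and \( s_2^2 = 11.5 \), with \( n_1 = n_2 = 5 \). Writing the pooled variance as \( s_p^2 \) and substituting these values gives:
\[ s_p^2 = \frac{(5 - 1)(14.8) + (5 - 1)(11.5)}{5 + 5 - 2} = \frac{59.2 + 46}{8} = \frac{105.2}{8} = 13.15 \]
Because the two sample sizes are equal here, the result matches the simple average of the two variances, \( (14.8 + 11.5) / 2 = 13.15 \).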
Programmatically Calculating Pooled Variance
Below are examples for calculating the pooled variance in JavaScript, Python, and R.
1. Using JavaScript (with jStat)
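In JavaScript, the jStat library can compute each sample variance, and the results are then combined with the pooled variance formula:
// Load jStat (assumes the jstat npm package under Node.js; in a browser, include the jStat script instead)
const { jStat } = require('jstat');
// Define inputs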
const sample1 = [10, 15, 14, 18, 20]; // Sample 1 values
const sample2 = [12, 17, 16, 19, 21]; // Sample 2 values
// Calculate sample variances (passing true selects the n - 1 denominator) and sample sizes
const variance1 = jStat.variance(sample1, true);
const variance2 = jStat.variance(sample2, true);
const n1 = sample1.length;
const n2 = sample2.length;
// Calculate pooled variance
const pooledVariance = ((n1 - 1) * variance1 + (n2 - 1) * variance2) / (n1 + n2 - 2);
console.log(`Pooled Variance: ${pooledVariance.toFixed(5)}`);
2. Using Python
In Python, you can calculate the pooled variance using the standard library's statistics module:
from statistics import variance

# Define inputs
sample1 = [10, 15, 14, 18, 20]
sample2 = [12, 17, 16, 19, 21]
# Calculate sample variances (n - 1 denominator) and sample sizes
variance1 = variance(sample1)
variance2 = variance(sample2)
n1 = len(sample1)
n2 = len(sample2)
# Calculate pooled variance
pooled_variance = ((n1 - 1) * variance1 + (n2 - 1) * variance2) / (n1 + n2 - 2)
print(f"Pooled Variance: {pooled_variance:.5f}")
3. Using R
In R, you can calculate the pooled variance as follows:
# Define inputs
sample1 <- c(10, 15, 14, 18, 20)
sample2 <- c(12, 17, 16, 19, 21)
# Calculate sample variances (var() uses the n - 1 denominator) and sample sizes
variance1 <- var(sample1)
variance2 <- var(sample2)
n1 <- length(sample1)
n2 <- length(sample2)
# Calculate pooled variance
pooled_variance <- ((n1 - 1) * variance1 + (n2 - 1) * variance2) / (n1 + n2 - 2)
cat("Pooled Variance:", round(pooled_variance, 5), "\n")
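Since the pooled variance is mainly used in the independent two-sample t-test, it may help to see how it enters the test statistic. The sketch below uses only the Python standard library and the same two samples as the examples above. The helper name `pooled_t_statistic` is an assumption for illustration, and the code stops at the t statistic and degrees of freedom without computing a p-value.

```python
from math import sqrt
from statistics import mean, variance

sample1 = [10, 15, 14, 18, 20]
sample2 = [12, 17, 16, 19, 21]


def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    # Standard error of the difference between the two sample means.
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, df


t_stat, dof = pooled_t_statistic(sample1, sample2)
print(f"t = {t_stat:.4f} with {dof} degrees of freedom")
```

For these samples the pooled variance is 13.15 and the difference in means is -1.6, so the statistic comes out to roughly -0.70 on 8 degrees of freedom. In practice a library routine such as `scipy.stats.ttest_ind` with `equal_var=True` performs the same test and also reports a p-value.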
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.