Pooled Variance Calculator

Understanding Pooled Variance

The pooled variance is an estimate of the common variance of two populations, computed from two independent samples that may differ in size. It assumes both populations share the same variance. It is calculated as a weighted average of the two sample variances, with each variance weighted by its degrees of freedom, and is commonly used in statistical tests such as the independent two-sample t-test.

Key Components of Pooled Variance

  • Sample Variances (\( s_1^2 \) and \( s_2^2 \)): The variances of the two independent samples.
  • Sample Sizes (\( n_1 \) and \( n_2 \)): The number of observations in each sample.
  • Degrees of Freedom: The combined degrees of freedom for both samples, calculated as \( n_1 + n_2 - 2 \).
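To illustrate how the components above interact, here is a minimal Python sketch of the weight each sample variance receives in the pooled estimate. The function name `pooling_weights` is hypothetical and chosen only for this illustration; the weights follow from the degrees of freedom, \( (n_1 - 1) \) and \( (n_2 - 1) \), divided by their total.

```python
def pooling_weights(n1: int, n2: int) -> tuple[float, float]:
    """Return the weight each sample variance receives in the pooled estimate."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    df_total = n1 + n2 - 2  # combined degrees of freedom
    return (n1 - 1) / df_total, (n2 - 1) / df_total


# Equal sizes give equal weights; a larger sample receives a larger weight.
print(pooling_weights(5, 5))
print(pooling_weights(11, 21))
```

The two weights always sum to 1, which is why the pooled variance always lies between the two sample variances.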

Formula for Pooled Variance

The formula for pooled variance is:

\[ s_p^2 = \frac{(n_1 - 1) \cdot s_1^2 + (n_2 - 1) \cdot s_2^2}{n_1 + n_2 - 2} \]

where:

  • \( s_1^2 \): Variance of sample 1
  • \( s_2^2 \): Variance of sample 2
  • \( n_1 \): Sample size of sample 1
  • \( n_2 \): Sample size of sample 2
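
As a worked check, take the two samples used in the code examples on this page, (10, 15, 14, 18, 20) and (12, 17, 16, 19, 21). Their sample variances are \( s_1^2 = 14.8 \) and \( s_2^2 = 11.5 \), with \( n_1 = n_2 = 5 \), so:

\[ s_p^2 = \frac{(5 - 1) \cdot 14.8 + (5 - 1) \cdot 11.5}{5 + 5 - 2} = \frac{59.2 + 46}{8} = \frac{105.2}{8} = 13.15 \]

Because the two samples are the same size, this equals the simple average of the two variances, \( (14.8 + 11.5) / 2 = 13.15 \).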

Programmatically Calculating Pooled Variance

Below are examples for calculating the pooled variance in JavaScript, Python, and R.

1. Using JavaScript (with jStat)

In JavaScript, the jStat library can compute each sample variance, and the pooled variance is then obtained by applying the formula above:

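// Assumes jStat is already loaded, e.g. via a script tag in the browser,
// or in Node.js with: const { jStat } = require('jstat');

// Define inputs
const sample1 = [10, 15, 14, 18, 20]; // Sample 1 values
const sample2 = [12, 17, 16, 19, 21]; // Sample 2 values

// Calculate variances and sample sizes
// Passing true as the second argument requests the sample variance (n - 1 denominator)
const variance1 = jStat.variance(sample1, true);
const variance2 = jStat.variance(sample2, true);
const n1 = sample1.length;
const n2 = sample2.length;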

// Calculate pooled variance
const pooledVariance = ((n1 - 1) * variance1 + (n2 - 1) * variance2) / (n1 + n2 - 2);

console.log(`Pooled Variance: ${pooledVariance.toFixed(5)}`);

2. Using Python

In Python, you can calculate the pooled variance using basic statistical operations:

from statistics import variance

# Define inputs
sample1 = [10, 15, 14, 18, 20]
sample2 = [12, 17, 16, 19, 21]

# Calculate sample variances (n - 1 denominator) and sample sizes
variance1 = variance(sample1)
variance2 = variance(sample2)
n1 = len(sample1)
n2 = len(sample2)

# Calculate pooled variance
pooled_variance = ((n1 - 1) * variance1 + (n2 - 1) * variance2) / (n1 + n2 - 2)

print(f"Pooled Variance: {pooled_variance:.5f}")
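
Since the pooled variance is commonly used in the independent t-test, the following is a minimal sketch of how it could feed into the equal-variance two-sample t statistic. The function name `pooled_t_statistic` is hypothetical, and the sketch assumes both populations share the same variance; it is an illustration, not a full test (it does not compute a p-value).

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2):
    """Two-sample t statistic using the pooled variance (equal-variance assumption)."""
    n1, n2 = len(sample1), len(sample2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    # statistics.variance uses the n - 1 denominator (sample variance)
    pooled = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    # Standard error of the difference between the two sample means
    standard_error = sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / standard_error


t = pooled_t_statistic([10, 15, 14, 18, 20], [12, 17, 16, 19, 21])
print(f"t statistic: {t:.4f} with {5 + 5 - 2} degrees of freedom")
```

The statistic is compared against a t distribution with \( n_1 + n_2 - 2 \) degrees of freedom, the same quantity that appears in the denominator of the pooled variance.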

3. Using R

In R, you can calculate the pooled variance as follows:

# Define inputs
sample1 <- c(10, 15, 14, 18, 20)
sample2 <- c(12, 17, 16, 19, 21)

# Calculate variances and sample sizes
variance1 <- var(sample1)
variance2 <- var(sample2)
n1 <- length(sample1)
n2 <- length(sample2)

# Calculate pooled variance
pooled_variance <- ((n1 - 1) * variance1 + (n2 - 1) * variance2) / (n1 + n2 - 2)

cat("Pooled Variance:", round(pooled_variance, 5), "\n")

Senior Advisor, Data Science

Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.