Understanding Pooled Standard Deviation
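The pooled standard deviation combines the standard deviations of two independent samples into a single estimate of their common spread. It is the square root of a weighted average of the two sample variances, with each variance weighted by its degrees of freedom, and it assumes that both populations share the same variance. It is commonly used in hypothesis testing (for example, the equal-variance two-sample t-test) and in effect size calculations such as Cohen's d.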
Key Components of Pooled Standard Deviation
- Sample Standard Deviations (\( s_1 \) and \( s_2 \)): The standard deviations of the two independent samples.
- Sample Sizes (\( n_1 \) and \( n_2 \)): The number of observations in each sample.
- Degrees of Freedom: The combined degrees of freedom for both samples, calculated as \( n_1 + n_2 - 2 \).
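The components listed above are sufficient on their own, so the raw observations are not strictly needed. The following is a minimal sketch under that assumption; the function name `pooled_std_dev` and the summary statistics passed to it are hypothetical and chosen only for illustration.

```python
import math


def pooled_std_dev(s1: float, n1: int, s2: float, n2: int) -> float:
    """Pooled standard deviation from two sample SDs and their sample sizes."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    df = n1 + n2 - 2  # combined degrees of freedom
    # Weight each sample variance by its own degrees of freedom
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    return math.sqrt(pooled_variance)


# Hypothetical summary statistics, for illustration only
print(f"Pooled Standard Deviation: {pooled_std_dev(s1=3.0, n1=10, s2=4.0, n2=15):.5f}")
```

When both samples have the same standard deviation, the pooled value equals that standard deviation regardless of the sample sizes, which makes a quick sanity check.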
Formula for Pooled Standard Deviation
The pooled standard deviation is the square root of the pooled variance:
\[ s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \]
where:
- \( s_1 \): Standard deviation of sample 1
- \( s_2 \): Standard deviation of sample 2
- \( n_1 \): Sample size of sample 1
- \( n_2 \): Sample size of sample 2
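As a worked instance, take two samples of five observations each, [10, 15, 14, 18, 20] and [12, 17, 16, 19, 21]. Their sample variances are \( s_1^2 = 14.8 \) and \( s_2^2 = 11.5 \), so the calculation runs roughly as follows:
\[ s_p = \sqrt{\frac{(5 - 1)(14.8) + (5 - 1)(11.5)}{5 + 5 - 2}} = \sqrt{\frac{105.2}{8}} = \sqrt{13.15} \approx 3.626 \]
Because the two samples are the same size, the pooled variance here is simply the mean of the two sample variances.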
Programmatically Calculating Pooled Standard Deviation
Below are examples for calculating the pooled standard deviation in JavaScript, Python, and R.
1. Using JavaScript (with jStat)
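In JavaScript, the jStat library can supply the sample variances, from which the pooled standard deviation follows:
// Assumes Node.js with the jstat package installed; in a browser, load jStat with a script tag instead
const { jStat } = require('jstat');
// Define inputs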
const sample1 = [10, 15, 14, 18, 20];
const sample2 = [12, 17, 16, 19, 21];
// Calculate variances and sample sizes
const variance1 = jStat.variance(sample1, true); // true selects the sample (n - 1) variance: 14.8
const variance2 = jStat.variance(sample2, true); // 11.5
const n1 = sample1.length;
const n2 = sample2.length;
// Calculate pooled variance
const pooledVariance = ((n1 - 1) * variance1 + (n2 - 1) * variance2) / (n1 + n2 - 2);
// Calculate pooled standard deviation
const pooledStdDev = Math.sqrt(pooledVariance);
console.log(`Pooled Standard Deviation: ${pooledStdDev.toFixed(5)}`);
2. Using Python
In Python, you can calculate the pooled standard deviation using basic statistical operations:
# Define inputs
sample1 = [10, 15, 14, 18, 20]
sample2 = [12, 17, 16, 19, 21]
# Calculate variances and sample sizes
variance1 = sum((x - sum(sample1) / len(sample1))**2 for x in sample1) / (len(sample1) - 1)  # 14.8
variance2 = sum((x - sum(sample2) / len(sample2))**2 for x in sample2) / (len(sample2) - 1)  # 11.5
n1 = len(sample1)
n2 = len(sample2)
# Calculate pooled variance
pooled_variance = ((n1 - 1) * variance1 + (n2 - 1) * variance2) / (n1 + n2 - 2)
# Calculate pooled standard deviation
pooled_std_dev = pooled_variance**0.5
print(f"Pooled Standard Deviation: {pooled_std_dev:.5f}")
3. Using R
In R, you can calculate the pooled standard deviation as follows:
# Define inputs
sample1 <- c(10, 15, 14, 18, 20)
sample2 <- c(12, 17, 16, 19, 21)
# Calculate variances and sample sizes
variance1 <- var(sample1) # 14.8
variance2 <- var(sample2) # 11.5
n1 <- length(sample1)
n2 <- length(sample2)
# Calculate pooled variance
pooled_variance <- ((n1 - 1) * variance1 + (n2 - 1) * variance2) / (n1 + n2 - 2)
# Calculate pooled standard deviation
pooled_std_dev <- sqrt(pooled_variance)
cat("Pooled Standard Deviation:", round(pooled_std_dev, 5))
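To connect the calculation back to the effect size use mentioned at the start, the sketch below divides the difference in sample means by the pooled standard deviation, which is the usual definition of Cohen's d. The function name `cohens_d` is hypothetical, and the standard-library `statistics` module is swapped in for the manual variance computation shown earlier.

```python
import math
from statistics import mean, variance


def cohens_d(sample1, sample2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    # statistics.variance returns the sample (n - 1) variance
    pooled_variance = (
        (n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)
    ) / (n1 + n2 - 2)
    return (mean(sample1) - mean(sample2)) / math.sqrt(pooled_variance)


sample1 = [10, 15, 14, 18, 20]
sample2 = [12, 17, 16, 19, 21]
print(f"Cohen's d: {cohens_d(sample1, sample2):.5f}")
```

The sign of the result depends only on which sample is listed first; the magnitude is what is usually interpreted as the effect size.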
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.