This calculator finds the **Explained Sum of Squares (ESS)** and the **R-squared** value for a linear regression model using values for the predictor and response variables.

To use the calculator, provide a list of values for the predictor and the response, ensuring they are the same length, and then click the "Calculate ESS and R-squared" button.

## Explained Sum of Squares (ESS), Total Sum of Squares (TSS), and R-squared

The **Explained Sum of Squares (ESS)** measures how much the fitted values from a linear regression model vary around the mean of the observed responses. It is the part of the variability in \(Y\) that the model accounts for.

### Key Components

- **Predictor Variable (\( X \))**: The independent variable used to predict the response.
- **Response Variable (\( Y \))**: The dependent variable that is being predicted.
- **Fitted Value (\( \hat{Y} \))**: The predicted value of \(Y\) for a given \(X\), based on the linear regression model.

### Explained Sum of Squares (ESS) Formula

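The ESS is calculated as the sum of the squared differences between the predicted values and the mean of the observed values:

\[ \text{ESS} = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 \]

Here \( \hat{Y}_i \) is the fitted value for observation \(i\), \( \bar{Y} \) is the mean of the observed responses, and \(n\) is the number of observations.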
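As a minimal sketch of this calculation in plain Python, assuming a one-predictor least-squares fit with an intercept. The function names and the sample data are illustrative and are not taken from the calculator itself:

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def explained_sum_of_squares(x, y):
    """Sum of squared differences between fitted values and the mean of y."""
    slope, intercept = fit_line(x, y)
    y_mean = sum(y) / len(y)
    fitted = [intercept + slope * xi for xi in x]
    return sum((f - y_mean) ** 2 for f in fitted)


# Illustrative data, not from the source.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
ess = explained_sum_of_squares(x, y)
print(round(ess, 4))  # 3.6
```

For this data the fitted line is \( \hat{Y} = 2.2 + 0.6X \) and the mean of \(Y\) is 4, which gives an ESS of 3.6.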

### Total Sum of Squares (TSS) and R-squared

The **Total Sum of Squares (TSS)** represents the total variability in the response variable \(Y\) around its mean, \( \text{TSS} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 \), while **R-squared** is a normalized measure of how well the model explains the variation in \(Y\). The formula for R-squared is:

\[ R^2 = \frac{\text{ESS}}{\text{TSS}} \]
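The ratio can be sketched in a few lines of Python. This assumes a single predictor and a least-squares fit with an intercept; `r_squared` is a hypothetical helper name and the data is made up for illustration:

```python
def r_squared(x, y):
    """R-squared as ESS / TSS for a one-predictor least-squares fit."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # ESS: variation of the fitted values around the mean of y.
    ess = sum((intercept + slope * xi - y_mean) ** 2 for xi in x)
    # TSS: variation of the observed values around the mean of y.
    tss = sum((yi - y_mean) ** 2 for yi in y)
    if tss == 0:
        raise ValueError("y is constant, so R-squared is undefined")
    return ess / tss


r2 = r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r2, 4))  # 0.6
```

Here the ESS is 3.6 and the TSS is 6, so the model accounts for 60% of the variation in the response.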

### Relationship Between ESS, TSS, and RSS

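The **relationship between ESS, TSS, and RSS (Residual Sum of Squares)** is given by:

\[ \text{TSS} = \text{ESS} + \text{RSS} \]

In this equation, TSS represents the total variability in \(Y\), ESS represents the portion of the variability explained by the model, and RSS, defined as \( \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \), represents the unexplained variability. The identity holds for a least-squares fit that includes an intercept.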
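A quick numerical check of this decomposition, sketched in plain Python under the assumption of a one-predictor least-squares fit with an intercept. The helper name `sums_of_squares` and the data are illustrative:

```python
import math


def sums_of_squares(x, y):
    """Return (ESS, RSS, TSS) for a one-predictor least-squares fit."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    fitted = [intercept + slope * xi for xi in x]

    ess = sum((f - y_mean) ** 2 for f in fitted)          # explained
    rss = sum((yi - f) ** 2 for yi, f in zip(y, fitted))  # residual
    tss = sum((yi - y_mean) ** 2 for yi in y)             # total
    return ess, rss, tss


ess, rss, tss = sums_of_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
# Compare with a tolerance because the sums are floating-point values.
print(math.isclose(ess + rss, tss))  # True
```

With this data the three quantities are approximately 3.6, 2.4, and 6, so the explained and residual parts add up to the total.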

### R-squared Interpretation

For a least-squares fit that includes an intercept, the **R-squared value** ranges from 0 to 1:

- An R-squared value of 1 means the model perfectly explains the variability in the response variable.
- An R-squared value of 0 means the model explains none of the variability in the response variable.
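The two endpoints can be illustrated with small made-up datasets. This is a sketch that assumes a one-predictor least-squares fit with an intercept; `r_squared` is a hypothetical helper, not the calculator's code:

```python
def r_squared(x, y):
    """R-squared (ESS / TSS) for a one-predictor least-squares fit."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ess = sum((intercept + slope * xi - y_mean) ** 2 for xi in x)
    tss = sum((yi - y_mean) ** 2 for yi in y)
    return ess / tss


x = [1, 2, 3, 4, 5]

# Every point lies exactly on y = 2x + 1, so the line explains everything.
r2_perfect = r_squared(x, [3, 5, 7, 9, 11])

# The fitted slope is zero here, so the line is flat at the mean of y
# and explains none of the variation.
r2_none = r_squared(x, [1, 3, 2, 3, 1])

print(r2_perfect, r2_none)
```

In the first case ESS equals TSS, giving an R-squared of 1. In the second the fitted values all equal the mean of \(Y\), so ESS is 0 and R-squared is 0.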

### Caveats and Conditions

- **Linear Assumption**: These calculations assume a linear relationship between the predictor \(X\) and response \(Y\). Non-linear relationships may result in a misleading R-squared value.
- **Overfitting**: A high R-squared value could indicate overfitting when too many predictors are used.
- **Outliers**: Outliers can disproportionately affect the ESS, TSS, and RSS, leading to a skewed R-squared value.
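Two of these caveats can be seen on small made-up datasets. The sketch below assumes a one-predictor least-squares fit with an intercept; the helper name and all values are illustrative only:

```python
def r_squared(x, y):
    """R-squared (ESS / TSS) for a one-predictor least-squares fit."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ess = sum((intercept + slope * xi - y_mean) ** 2 for xi in x)
    tss = sum((yi - y_mean) ** 2 for yi in y)
    return ess / tss


# Linear assumption: y = x**2 is an exact relationship, but a straight
# line fitted over symmetric x values has zero slope and explains nothing.
x_sym = [-2, -1, 0, 1, 2]
r2_quadratic = r_squared(x_sym, [xi ** 2 for xi in x_sym])

# Outliers: a perfectly linear dataset, then the same data with the
# last response replaced by an extreme value.
x = [1, 2, 3, 4, 5, 6]
r2_clean = r_squared(x, [2, 4, 6, 8, 10, 12])
r2_outlier = r_squared(x, [2, 4, 6, 8, 10, 0])

print(r2_quadratic, r2_clean, r2_outlier)
```

The quadratic data gives an R-squared of 0 even though \(Y\) is fully determined by \(X\), and changing a single point takes the second dataset from a perfect fit to an R-squared close to 0.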

### Further Reading

Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.