This calculator computes the Total Sum of Squares (TSS), Explained Sum of Squares (ESS), and Residual Sum of Squares (RSS) for a linear regression model and verifies that \( TSS = ESS + RSS \). It also plots the fitted line, with TSS shown as a dashed line.
To use the calculator, enter a list of values for the predictor and a list of values for the response, make sure both lists have the same length, and then click the "Calculate and Plot" button.
Total Sum of Squares (TSS), Explained Sum of Squares (ESS), and Residual Sum of Squares (RSS) Explanation
The Total Sum of Squares (TSS) represents the total variation in the response variable \(Y\). It is calculated as the sum of the squared differences between each observed value \(Y_i\) and the mean of the observed values \(\bar{Y}\).
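As a minimal sketch of that calculation, the snippet below computes TSS for a small made-up sample. The function name `total_sum_of_squares` and the data are illustrative assumptions, not part of the calculator itself.

```python
def total_sum_of_squares(y):
    """Sum of squared deviations of each observation from the sample mean."""
    y_bar = sum(y) / len(y)  # mean of the observed values
    return sum((y_i - y_bar) ** 2 for y_i in y)


# Illustrative response values (hypothetical data); their mean is 4.0.
y = [2.0, 4.0, 5.0, 4.0, 5.0]
tss = total_sum_of_squares(y)
print(tss)  # 6.0, since the squared deviations are 4, 0, 1, 0, 1
```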
Key Components
- Predictor Variable (\(X\)): The independent variable used to predict the response.
- Response Variable (\(Y\)): The dependent variable that is being predicted.
- Fitted Value (\(\hat{Y}\)): The predicted value of \(Y\) for a given \(X\), based on the linear regression model.
- Mean Value (\(\bar{Y}\)): The average of the observed values of \(Y\).
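In terms of these components, the three sums of squares are usually written as follows. This is the standard textbook notation for \(n\) observations, given here as an assumption about what the calculator computes rather than a statement of its exact implementation:

\[
TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2, \qquad
ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2, \qquad
RSS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2
\]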
Total Sum of Squares (TSS), ESS, and RSS
The relationship between these quantities is:
\[ TSS = ESS + RSS \]
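TSS represents the total variation in \(Y\), ESS represents the part explained by the model, and RSS represents the part that remains unexplained. For an ordinary least squares fit that includes an intercept, the identity holds exactly, so the verification checks whether \( TSS = ESS + RSS \) up to rounding error.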
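The check can be sketched in a few lines of plain Python. This example assumes a simple linear regression with one predictor, fitted by ordinary least squares with an intercept. The helper names `fit_line` and `sums_of_squares` and the sample data are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sums_of_squares(x, y):
    """Return (TSS, ESS, RSS) for the least squares line through (x, y)."""
    intercept, slope = fit_line(x, y)
    y_bar = sum(y) / len(y)
    fitted = [intercept + slope * xi for xi in x]
    tss = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    ess = sum((fi - y_bar) ** 2 for fi in fitted)          # explained by the line
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) # left unexplained
    return tss, ess, rss


# Illustrative data (hypothetical), same length for predictor and response.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

tss, ess, rss = sums_of_squares(x, y)
print(f"TSS={tss:.4f} ESS={ess:.4f} RSS={rss:.4f}")
# Compare with a tolerance, because floating-point sums rarely match exactly.
print("TSS == ESS + RSS:", math.isclose(tss, ess + rss, rel_tol=1e-9))
```

Comparing with a tolerance rather than `==` is a deliberate choice: the identity is exact in algebra, but the three sums are accumulated separately in floating point.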
Caveats and Conditions
- Linear Assumption: These calculations assume a linear relationship between the predictor \(X\) and the response \(Y\). If the relationship is non-linear, the model may not fit well, and the sums of squares may not provide meaningful insights.
- Outliers: Outliers can significantly impact the values of TSS, ESS, and RSS. A large outlier can pull the fitted line away from the bulk of the data, so the model may fit poorly even though \( TSS = ESS + RSS \) still holds.
- Overfitting: If the model is too complex (e.g., too many predictors), it can fit the sample very closely, giving a high ESS and a small RSS. This is overfitting: the model performs well on the given data but fails to generalize to new data.
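To illustrate the outlier caveat, the sketch below fits the same line to a small sample twice, once as is and once with the last response value replaced by an extreme one. The data and the helper `sums_of_squares` are hypothetical, and the fit is assumed to be simple least squares with an intercept.

```python
def sums_of_squares(x, y):
    """Fit y = a + b*x by ordinary least squares and return (TSS, ESS, RSS)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    tss = sum((yi - y_bar) ** 2 for yi in y)
    ess = sum((fi - y_bar) ** 2 for fi in fitted)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return tss, ess, rss


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_clean = [2.0, 4.0, 5.0, 4.0, 5.0]     # hypothetical sample
y_outlier = [2.0, 4.0, 5.0, 4.0, 15.0]  # same sample, last value made extreme

for label, y in (("clean", y_clean), ("with outlier", y_outlier)):
    tss, ess, rss = sums_of_squares(x, y)
    # The gap TSS - (ESS + RSS) stays at rounding level in both cases,
    # while the individual sums change substantially.
    print(f"{label}: TSS={tss:.2f} ESS={ess:.2f} RSS={rss:.2f} "
          f"gap={tss - ess - rss:.1e}")
```

A single extreme point inflates all three sums, and the identity is satisfied in both runs, so passing the verification says nothing about whether the line describes the data well.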
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.