This error occurs when you try to fit a model and one or more of the variables is a list instead of an atomic vector. You can solve it by converting the list to a vector using the unlist() function. For example,
x <- list(2, 5, 5, 6, 7, 11, 2, 3, 5)
y <- c(4, 5, 6, 10, 3, 4, 5, 9, 5)
model <- lm(y ~ unlist(x))
summary(model)
This tutorial will go through the error in detail and how to solve it with code examples.
Example
Let’s look at an example to reproduce the error. We will define two variables: height, containing the heights of 10 subjects in metres, and weight, containing the weights of the same 10 subjects in kilograms. Next, we will attempt to fit a linear regression model using the lm() function.
# Define variables
height <- list(1.8, 1.5, 1.7, 1.6, 1.9, 2.0, 1.75, 1.55, 1.4, 1.83)
weight <- c(99, 44, 85, 80, 104, 120, 93, 56, 43, 78)

# Attempt to fit linear regression model
model <- lm(weight ~ height)
Let’s run the code to see what happens:
Error in model.frame.default(formula = weight ~ height, drop.unused.levels = TRUE) :
  invalid type (list) for variable 'height'
The error occurs because the lm() function expects the variables in the formula to be atomic vectors, such as numeric vectors, and the height variable is a list.
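Before fitting, it can help to confirm which variable is the list. The following is a minimal sketch using base R type checks on shortened versions of the two variables; the shortened data is an assumption made here for brevity, not part of the original example.

```r
# Shortened versions of the tutorial's variables (illustrative only)
height <- list(1.8, 1.5, 1.7)
weight <- c(99, 44, 85)

# A list is not an atomic vector, which is what lm() rejects
is.list(height)    # TRUE
is.atomic(height)  # FALSE

# A numeric vector created with c() is atomic
is.list(weight)    # FALSE
is.atomic(weight)  # TRUE

# class() reports the type that appears in the error message
class(height)      # "list"
class(weight)      # "numeric"
```

Any variable for which is.list() returns TRUE is a candidate for conversion before it is used in a model formula.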
Solution
We can solve the error by converting the list variable to a vector using the unlist() function. Let’s look at the revised code:
# Define variables
height <- list(1.8, 1.5, 1.7, 1.6, 1.9, 2.0, 1.75, 1.55, 1.4, 1.83)
weight <- c(99, 44, 85, 80, 104, 120, 93, 56, 43, 78)

# Fit linear regression model, converting the list to a vector
model <- lm(weight ~ unlist(height))
summary(model)
Let’s run the code to get the model output:
Call:
lm(formula = weight ~ unlist(height))

Residuals:
    Min      1Q  Median      3Q     Max
-18.374  -3.858   1.682   6.130  12.918

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)     -136.68      29.35  -4.658  0.00163 **
unlist(height)   127.35      17.14   7.431 7.39e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.722 on 8 degrees of freedom
Multiple R-squared:  0.8735,	Adjusted R-squared:  0.8577
F-statistic: 55.23 on 1 and 8 DF,  p-value: 7.394e-05
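As a variation on the fix above, the list can be converted once and stored before fitting, which gives the coefficient a plainer name and leaves room for a sanity check. This is a sketch rather than a required step; height_vec is a name introduced here for illustration. The length check is included because unlist() flattens nested elements and drops NULL entries, so the result may not line up with the response variable.

```r
height <- list(1.8, 1.5, 1.7, 1.6, 1.9, 2.0, 1.75, 1.55, 1.4, 1.83)
weight <- c(99, 44, 85, 80, 104, 120, 93, 56, 43, 78)

# Convert the list to an atomic vector once (height_vec is a hypothetical name)
height_vec <- unlist(height)

# Guard against silent changes in type or length introduced by unlist()
stopifnot(is.numeric(height_vec), length(height_vec) == length(weight))

# Fit the model with the converted variable
model <- lm(weight ~ height_vec)
coef(model)
```

The fitted values are the same as with unlist(height) in the formula; only the coefficient label changes to height_vec.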
If we are using multiple predictor variables that are list objects, we have to unlist each one before fitting the regression model. For example,
height <- list(1.8, 1.5, 1.7, 1.6, 1.9, 2.0, 1.75, 1.55, 1.4, 1.83)
waist <- list(32, 18, 36, 34, 30, 32, 28, 16, 24, 30)
weight <- c(99, 44, 85, 80, 104, 120, 93, 56, 43, 78)
model <- lm(weight ~ unlist(height) + unlist(waist))
summary(model)
Call:
lm(formula = weight ~ unlist(height) + unlist(waist))

Residuals:
    Min      1Q  Median      3Q     Max
-17.732  -2.523   2.501   4.386   7.808

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)    -129.2845    25.3357  -5.103 0.001395 **
unlist(height)  106.2132    18.0777   5.875 0.000615 ***
unlist(waist)     1.0215     0.5128   1.992 0.086602 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 8.302 on 7 degrees of freedom
Multiple R-squared:  0.9193,	Adjusted R-squared:  0.8962
F-statistic: 39.85 on 2 and 7 DF,  p-value: 0.0001496
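When several predictors arrive as lists, one possible alternative to writing unlist() around each term is to convert them all in one pass and collect them in a data frame. The sketch below assumes every list holds one number per subject; the names predictors and df are introduced here for illustration and are not part of the original example.

```r
weight <- c(99, 44, 85, 80, 104, 120, 93, 56, 43, 78)

# Hypothetical container holding the list-type predictors by name
predictors <- list(
  height = list(1.8, 1.5, 1.7, 1.6, 1.9, 2.0, 1.75, 1.55, 1.4, 1.83),
  waist  = list(32, 18, 36, 34, 30, 32, 28, 16, 24, 30)
)

# Apply unlist() to each predictor, then build a data frame of numeric columns
df <- as.data.frame(lapply(predictors, unlist))
df$weight <- weight

# The formula can now refer to the columns directly
model <- lm(weight ~ height + waist, data = df)
coef(model)
```

With this approach the coefficients are labelled height and waist rather than unlist(height) and unlist(waist), and adding another list predictor only requires a new entry in predictors.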
Summary
Congratulations on reading to the end of this tutorial!
For further reading on R-related errors, go to the articles:
- How to Solve R Error in sort.int(x, na.last = na.last, decreasing = decreasing, …) : ‘x’ must be atomic
- How to Solve R Error: Arguments imply differing number of rows
- How to Solve R Error as.Date.numeric(x) : ‘origin’ must be supplied
Go to the online courses page on R to learn more about coding in R for data science and machine learning.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.