
How to Solve R Error: Invalid type (list) for variable


This error occurs when you try to fit a model and one or more of the variables in the formula is a list rather than an atomic vector (for example, a numeric vector). You can solve it by converting the list to an atomic vector with the unlist() function. For example,

x <- list(2, 5, 5, 6, 7, 11, 2, 3, 5)
y <- c(4, 5, 6, 10, 3, 4, 5, 9, 5)

model <- lm(y ~ unlist(x))

summary(model)

This tutorial will go through the error in detail and how to solve it with code examples.



Example

Let’s look at an example to reproduce the error. We will define two variables: height, containing the heights of 10 subjects in metres, and weight, containing the weights of the same 10 subjects in kilograms. Next, we will attempt to fit a linear regression model using the lm() function.

# Define variables

height <- list(1.8, 1.5, 1.7, 1.6, 1.9, 2.0, 1.75, 1.55, 1.4, 1.83)
weight <- c(99, 44, 85, 80, 104, 120, 93, 56, 43, 78)

# Attempt to fit linear regression model

model <- lm(weight ~ height)

Let’s run the code to see what happens:

Error in model.frame.default(formula = weight ~ height, drop.unused.levels = TRUE) : 
  invalid type (list) for variable 'height'

The error occurs because lm() builds a model frame with model.frame(), which requires every variable in the formula to be an atomic vector (such as a numeric vector), and the height variable is a list.
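Before fitting a model, it can help to check which variables are lists. The snippet below is a minimal sketch using shortened versions of the height and weight variables from the example; the three-element values are only for illustration.

```r
# Shortened illustrative versions of the variables from the example
height <- list(1.8, 1.5, 1.7)
weight <- c(99, 44, 85)

# A list is not an atomic vector, so lm() cannot use it directly
is.list(height)    # TRUE
is.atomic(height)  # FALSE
typeof(height)     # "list"

# A numeric vector created with c() is atomic
is.atomic(weight)  # TRUE
typeof(weight)     # "double"
```

Any variable for which is.list() returns TRUE will trigger the error when used in a model formula.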

Solution

We can solve the error by converting the list variable to a vector using the unlist() function. Let’s look at the revised code:

# Define variables
height <- list(1.8, 1.5, 1.7, 1.6, 1.9, 2.0, 1.75, 1.55, 1.4, 1.83)
weight <- c(99, 44, 85, 80, 104, 120, 93, 56, 43, 78)

# Fit linear regression model, unlisting the predictor

model <- lm(weight ~ unlist(height))

summary(model)

Let’s run the code to get the model output:

Call:
lm(formula = weight ~ unlist(height))

Residuals:
    Min      1Q  Median      3Q     Max 
-18.374  -3.858   1.682   6.130  12.918 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)     -136.68      29.35  -4.658  0.00163 ** 
unlist(height)   127.35      17.14   7.431 7.39e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.722 on 8 degrees of freedom
Multiple R-squared:  0.8735,	Adjusted R-squared:  0.8577 
F-statistic: 55.23 on 1 and 8 DF,  p-value: 7.394e-05
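As an alternative to calling unlist() inside the formula, one option is to convert the list once and store the result in a data frame, so the coefficient is labelled height rather than unlist(height). The following is a sketch under the assumption that every list element holds exactly one number; the data frame name subjects is hypothetical.

```r
height <- list(1.8, 1.5, 1.7, 1.6, 1.9, 2.0, 1.75, 1.55, 1.4, 1.83)
weight <- c(99, 44, 85, 80, 104, 120, 93, 56, 43, 78)

# unlist() flattens nested elements, so check that each element has
# length 1; otherwise the number of observations would change
stopifnot(all(lengths(height) == 1))

# Convert once and keep both variables together (hypothetical name)
subjects <- data.frame(height = unlist(height), weight = weight)

# Fit the same model using the data argument
model <- lm(weight ~ height, data = subjects)

names(coef(model))  # "(Intercept)" "height"
```

The fitted values are the same as in the model above; only the coefficient label changes.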

If we are using multiple predictor variables that are list objects, we have to unlist each one before fitting the regression model. For example,

height <- list(1.8, 1.5, 1.7, 1.6, 1.9, 2.0, 1.75, 1.55, 1.4, 1.83)
waist <- list(32, 18, 36, 34, 30, 32, 28, 16, 24, 30)
weight <- c(99, 44, 85, 80, 104, 120, 93, 56, 43, 78)

model <- lm(weight ~ unlist(height) + unlist(waist))

summary(model)

Call:
lm(formula = weight ~ unlist(height) + unlist(waist))

Residuals:
    Min      1Q  Median      3Q     Max 
-17.732  -2.523   2.501   4.386   7.808 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)    -129.2845    25.3357  -5.103 0.001395 ** 
unlist(height)  106.2132    18.0777   5.875 0.000615 ***
unlist(waist)     1.0215     0.5128   1.992 0.086602 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 8.302 on 7 degrees of freedom
Multiple R-squared:  0.9193,	Adjusted R-squared:  0.8962 
F-statistic: 39.85 on 2 and 7 DF,  p-value: 0.0001496
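When several predictors are lists, writing unlist() around each one in the formula gets repetitive. One possible approach, sketched below, is to gather the list predictors in a named list and unlist them all with lapply() before building a data frame. The names predictors and subjects are hypothetical, and the sketch assumes each list element holds a single number.

```r
weight <- c(99, 44, 85, 80, 104, 120, 93, 56, 43, 78)

# Hypothetical container holding the list-type predictors
predictors <- list(
  height = list(1.8, 1.5, 1.7, 1.6, 1.9, 2.0, 1.75, 1.55, 1.4, 1.83),
  waist  = list(32, 18, 36, 34, 30, 32, 28, 16, 24, 30)
)

# Unlist every predictor in one step, then combine with the response
subjects <- data.frame(lapply(predictors, unlist), weight = weight)

# All columns are now atomic vectors, so the formula needs no unlist()
model <- lm(weight ~ height + waist, data = subjects)

names(coef(model))  # "(Intercept)" "height" "waist"
```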

Summary

Congratulations on reading to the end of this tutorial!


Go to the online courses page on R to learn more about coding in R for data science and machine learning.

Have fun and happy researching!