*If you call `predict()` with new data whose column names do not match the predictor names in the data frame used to fit the model, R raises `Error in eval(predvars, data, env) : object 'x' not found`, where `x` is the predictor column that is missing from the new data.*

*You can solve this error by checking the column names of both data frames with `names()`, for example:*

`names(data)`

*Then you can rename the columns so that they match, for example:*

`names(data)[names(data) == 'column_name_to_change'] <- 'new_column_name'`

*This tutorial will go through the error in detail and how to solve it with code examples.*

## Example

Let’s look at an example of fitting a linear regression model to some data. First, we will look at the data frame, which contains height and weight measurements for ten people.

    data <- data.frame(
      height = c(176, 200, 134, 150, 160, 180, 140, 190, 145, 155),
      weight = c(80, 104.7, 47, 55, 62.4, 70, 85, 66, 120, 60)
    )
    data

       height weight
    1     176   80.0
    2     200  104.7
    3     134   47.0
    4     150   55.0
    5     160   62.4
    6     180   70.0
    7     140   85.0
    8     190   66.0
    9     145  120.0
    10    155   60.0

In our example, height is the predictor variable, and weight is the response variable.

Next, we will fit a linear regression model to the data and get the summary view of the model:

    model <- lm(weight ~ height, data = data)
    summary(model)

    Call:
    lm(formula = weight ~ height, data = data)

    Residuals:
       Min     1Q Median     3Q    Max
    -21.39 -14.68 -10.41  11.94  49.10

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  37.7907    57.9825   0.652    0.533
    height        0.2283     0.3528   0.647    0.536

    Residual standard error: 23.64 on 8 degrees of freedom
    Multiple R-squared:  0.04977,   Adjusted R-squared:  -0.06901
    F-statistic: 0.419 on 1 and 8 DF,  p-value: 0.5356

Now that we have a model, we can use it to predict the weights of five test subjects given their heights.

    test_data <- data.frame(heights = c(143, 210, 120, 188, 158))
    predict(model, newdata = test_data)

Let’s run the code to see what happens:

    Error in eval(predvars, data, env) : object 'height' not found

The error occurs because the name of the predictor variable in `test_data` does not match the name used to fit the model. We can obtain the column names of each data frame using the `names()` function:

    names(data)
    names(test_data)

    [1] "height" "weight"
    [1] "heights"

We can see that `test_data` has the column name `heights`, whereas `data` has the column name `height`.
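With more than a handful of columns, comparing the two `names()` outputs by eye becomes error-prone. The sketch below is one possible way to list the missing predictors programmatically. It rebuilds the tutorial's data and model so that it runs on its own; the variable names `predictors` and `missing_cols` are chosen here for illustration.

```r
# Recreate the tutorial's training data, model, and mismatched test data
data <- data.frame(
  height = c(176, 200, 134, 150, 160, 180, 140, 190, 145, 155),
  weight = c(80, 104.7, 47, 55, 62.4, 70, 85, 66, 120, 60)
)
model <- lm(weight ~ height, data = data)
test_data <- data.frame(heights = c(143, 210, 120, 188, 158))

# Variables on the right-hand side of the model formula
predictors <- all.vars(delete.response(terms(model)))

# Predictors that the new data frame does not provide
missing_cols <- setdiff(predictors, names(test_data))
missing_cols
# [1] "height"
```

Any name returned here is a column that `predict()` will fail to find in the new data.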

### Solution

We can solve this error by renaming the `test_data` column as follows:

    names(test_data)[names(test_data) == 'heights'] <- 'height'
    test_data

      height
    1    143
    2    210
    3    120
    4    188
    5    158

Now that we have the correct column name, we can use the `predict()` function to predict the response values given the predictor values:

    predict(model, newdata = test_data)

           1        2        3        4        5
    70.44321 85.74195 65.19141 80.71848 73.86830
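As a sanity check, the predictions of a simple linear model can be reproduced by hand as intercept plus slope times height. The following is a minimal sketch of that check, rebuilt from the tutorial's data so that it is self-contained; `b` and `manual` are names introduced here for illustration.

```r
# Recreate the tutorial's data, model, and correctly named test data
data <- data.frame(
  height = c(176, 200, 134, 150, 160, 180, 140, 190, 145, 155),
  weight = c(80, 104.7, 47, 55, 62.4, 70, 85, 66, 120, 60)
)
model <- lm(weight ~ height, data = data)
test_data <- data.frame(height = c(143, 210, 120, 188, 158))

# coef() returns a named vector with "(Intercept)" and "height"
b <- coef(model)

# Manual prediction: weight = intercept + slope * height
manual <- b[["(Intercept)"]] + b[["height"]] * test_data$height

# predict() returns a named vector, so drop the names before comparing
all.equal(manual, unname(predict(model, newdata = test_data)))
# [1] TRUE
```

For the first test subject this is roughly 37.7907 + 0.2283 × 143 ≈ 70.44, which agrees with the first predicted value up to the rounding of the displayed coefficients.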

We successfully predicted the response values.
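To catch this kind of mismatch earlier and with a clearer message, one option is to wrap `predict()` in a small guard. The helper `predict_checked()` below is hypothetical (it is not part of base R or any package) and is only a sketch. It assumes every variable on the right-hand side of the formula should be a column of the new data, which may not hold for formulas that reference objects stored outside the data frame.

```r
# Hypothetical helper: stop with a readable message if predictors are missing
predict_checked <- function(model, newdata, ...) {
  needed <- all.vars(delete.response(terms(model)))
  missing_cols <- setdiff(needed, names(newdata))
  if (length(missing_cols) > 0) {
    stop("newdata is missing predictor column(s): ",
         paste(missing_cols, collapse = ", "),
         call. = FALSE)
  }
  predict(model, newdata = newdata, ...)
}

# Recreate the tutorial's data and model
data <- data.frame(
  height = c(176, 200, 134, 150, 160, 180, 140, 190, 145, 155),
  weight = c(80, 104.7, 47, 55, 62.4, 70, 85, 66, 120, 60)
)
model <- lm(weight ~ height, data = data)

# A correctly named column predicts as usual
ok <- predict_checked(model, data.frame(height = c(143, 210)))

# A misnamed column triggers the custom error; capture its message
msg <- tryCatch(
  predict_checked(model, data.frame(heights = c(143, 210))),
  error = function(e) conditionMessage(e)
)
msg
# [1] "newdata is missing predictor column(s): height"
```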

## Summary

Congratulations on reading to the end of this tutorial!

For further reading on R-related errors, go to the articles:

- How to Solve R Error in lm.fit: na/nan/inf
- How to Solve R Error: Subscript out of bounds
- How to Solve R Error in apply: dim(X) must have a positive length
- How to Solve R Error in eval(predvars, data, env): object not found

Go to the online courses page on R to learn more about coding in R for data science and machine learning.

Have fun and happy researching!

Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research.