If you call the predict() function with a new data frame whose column names do not match those in the data frame used to fit the model, R raises the error: Error in eval(predvars, data, env) : object 'x' not found. Here, 'x' is the name of the predictor column that predict() expected but could not find.
You can solve this error by first checking the column names of both data frames using the names() function, for example:
names(data)
Then you can rename the columns so that they match, for example:
names(data)[names(data) == 'column_name_to_change'] <- 'new_column_name'
This tutorial will go through the error in detail and how to solve it with code examples.
Example
Let’s look at an example of fitting a linear regression model to some data. First, we will look at the data frame, which contains height and weight measurements for ten people.
data <- data.frame(height=c(176, 200, 134, 150, 160, 180, 140, 190, 145, 155),
                   weight=c(80, 104.7, 47, 55, 62.4, 70, 85, 66, 120, 60))
data
   height weight
1     176   80.0
2     200  104.7
3     134   47.0
4     150   55.0
5     160   62.4
6     180   70.0
7     140   85.0
8     190   66.0
9     145  120.0
10    155   60.0
In our example, height is the predictor variable, and weight is the response variable.
Next, we will fit a linear regression model to the data and get the summary view of the model:
model <- lm(weight ~ height, data=data)
summary(model)
Call:
lm(formula = weight ~ height, data = data)

Residuals:
   Min     1Q Median     3Q    Max
-21.39 -14.68 -10.41  11.94  49.10

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  37.7907    57.9825   0.652    0.533
height        0.2283     0.3528   0.647    0.536

Residual standard error: 23.64 on 8 degrees of freedom
Multiple R-squared:  0.04977,  Adjusted R-squared:  -0.06901
F-statistic: 0.419 on 1 and 8 DF,  p-value: 0.5356
Now that we have a model, we can use it to predict the weights of five test subjects given their heights.
test_data <- data.frame(heights=c(143, 210, 120, 188, 158))
predict(model, newdata = test_data)
Let’s run the code to see what happens:
Error in eval(predvars, data, env) : object 'height' not found
The error occurs because the model was fitted with a predictor named height, but the new data frame does not contain a column with that name. We can obtain the column names of each data frame using the names() function:
names(data)
names(test_data)
[1] "height" "weight"
[1] "heights"
We can see that test_data has the column name heights, and data has the column name height.
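Comparing the two sets of names by eye works for one predictor, but the same check can be done programmatically. The following is a minimal sketch, not part of the original example: it rebuilds the tutorial's objects so it runs on its own, and the variable names required and missing_cols are illustrative.

```r
# Rebuild the tutorial's objects so the snippet runs on its own
data <- data.frame(
  height = c(176, 200, 134, 150, 160, 180, 140, 190, 145, 155),
  weight = c(80, 104.7, 47, 55, 62.4, 70, 85, 66, 120, 60)
)
model <- lm(weight ~ height, data = data)
test_data <- data.frame(heights = c(143, 210, 120, 188, 158))

# Predictor names the model expects (the response is dropped from the terms)
required <- all.vars(delete.response(terms(model)))

# Names in `required` that are absent from the new data frame
missing_cols <- setdiff(required, names(test_data))
print(missing_cols)
```

For this example, missing_cols contains "height", which is the same name reported in the error message.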
Solution
We can solve this error by renaming the test_data column as follows:
names(test_data)[names(test_data) == 'heights'] <- 'height'
test_data
  height
1    143
2    210
3    120
4    188
5    158
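If several columns need renaming, repeating the line above for each one gets tedious. One possible approach is to keep the old-to-new names in a lookup vector and apply them in one step. This is only a sketch: rename_map and the extra id column are hypothetical and do not appear in the original example.

```r
# Test data with a misnamed predictor and a hypothetical extra column
test_data <- data.frame(
  id = 1:5,
  heights = c(143, 210, 120, 188, 158)
)

# Hypothetical lookup vector: element name = old column name, value = new name
rename_map <- c(heights = "height")

# Rename only the columns listed in the lookup vector
hits <- names(test_data) %in% names(rename_map)
names(test_data)[hits] <- unname(rename_map[names(test_data)[hits]])

print(names(test_data))
```

Columns that are not listed in rename_map, such as id here, keep their original names.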
Now that we have the correct column name, we can use the predict() function to predict the response values given the predictor values:
predict(model, newdata = test_data)
       1        2        3        4        5
70.44321 85.74195 65.19141 80.71848 73.86830
We successfully predicted the response values.
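As a sanity check, the predictions can be reproduced by hand, since a simple linear model predicts weight as the intercept plus the slope times height. The sketch below rebuilds the objects from this tutorial so it runs on its own; the names b and by_hand are illustrative.

```r
# Rebuild the tutorial's objects so the snippet runs on its own
data <- data.frame(
  height = c(176, 200, 134, 150, 160, 180, 140, 190, 145, 155),
  weight = c(80, 104.7, 47, 55, 62.4, 70, 85, 66, 120, 60)
)
model <- lm(weight ~ height, data = data)
test_data <- data.frame(height = c(143, 210, 120, 188, 158))

# Predictions from predict()
pred <- predict(model, newdata = test_data)

# The same predictions computed directly from the fitted coefficients
b <- coef(model)
by_hand <- b[["(Intercept)"]] + b[["height"]] * test_data$height

# Compare the two results, ignoring the row names attached by predict()
print(all.equal(unname(pred), by_hand))
```

If the two results agree, all.equal() returns TRUE.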
Summary
Congratulations on reading to the end of this tutorial!
For further reading on R-related errors, go to the articles:
- How to Solve R Error in lm.fit: na/nan/inf
- How to Solve R Error: Subscript out of bounds
- How to Solve R Error in apply: dim(X) must have a positive length
- How to Solve R Error in eval(predvars, data, env): object not found
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.