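If you subset a data frame with a single index and no comma, R treats the index as a column selector. When that index points at columns that do not exist, R raises the error: undefined columns selected. The syntax for subsetting a data frame by rows and columns is: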
dataframe[rows_to_subset, columns_to_subset]
To solve this error, put a comma after the row expression, even if you want all columns. For example,
data[data$col1 > 5, "col1"]
selects the values in column col1 that are greater than 5.
This tutorial will go through the error in detail and how to solve it with code examples.
Example: Error in data.frame undefined columns selected
Let’s look at an example with a data frame with two variables.
dat <- data.frame(x = c(0, 1, 2, 3, 4, 5), y = c(11, 2, 5, 7, 9, 3))
dat
  x  y
1 0 11
2 1  2
3 2  5
4 3  7
5 4  9
6 5  3
Let’s try to select the rows where the value in column y is greater than 5:
dat[dat$y>5]
Error in `[.data.frame`(dat, dat$y > 5) : undefined columns selected
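R raises the error because we did not put a comma after the row expression. Without the comma, R reads the logical vector dat$y > 5 as a column selector. The vector has six elements, but dat has only two columns, so the TRUE values at positions 4 and 5 refer to columns that do not exist.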
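To see why the comma matters, the sketch below rebuilds the same dat and tries single-index subsetting with two different logical vectors. The tryCatch wrapper and the names idx, only_x and msg are only for illustration; they are not part of the original example.

```r
# Rebuild the example data frame
dat <- data.frame(x = c(0, 1, 2, 3, 4, 5), y = c(11, 2, 5, 7, 9, 3))

# The logical index has one element per row (length 6)
idx <- dat$y > 5

# A single index selects columns, so a length-2 logical vector works:
# it keeps column x and drops column y
only_x <- dat[c(TRUE, FALSE)]
print(names(only_x))

# With the length-6 row index, the TRUE values sit at positions 1, 4 and 5.
# Columns 4 and 5 do not exist, so R stops with an error;
# tryCatch captures the error message instead of halting the script
msg <- tryCatch(
  dat[idx],
  error = function(e) conditionMessage(e)
)
print(msg)
```

Capturing the message with tryCatch is just a convenience for demonstration. In an interactive session, running dat[idx] directly shows the same error.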
Solution: Use a comma for the row and column expressions
We need to add a comma after the row subset expression to solve this error. Let’s look at the revised code:
dat[dat$y>5, "y"]
Note that we have to put the column name in quotes. Let’s run the code to see the result:
[1] 11  7  9
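Selecting a single column this way returns a plain vector rather than a data frame. If a one-column data frame is preferred, the drop = FALSE argument of base R's subsetting operator keeps the data frame structure. A minimal sketch, with as_vector and as_frame as illustrative names:

```r
dat <- data.frame(x = c(0, 1, 2, 3, 4, 5), y = c(11, 2, 5, 7, 9, 3))

# Default behaviour: a single selected column is simplified to a numeric vector
as_vector <- dat[dat$y > 5, "y"]

# drop = FALSE keeps the result as a data frame with one column
as_frame <- dat[dat$y > 5, "y", drop = FALSE]

print(class(as_vector))
print(class(as_frame))
```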
If we want to return values from all columns, we can leave the space after the comma blank.
dat[dat$y>5, ]
  x  y
1 0 11
4 3  7
5 4  9
If we know the total number of columns, we can also select all of them explicitly by position. The following command is equivalent:
dat[dat$y>5, 1:2]
  x  y
1 0 11
4 3  7
5 4  9
We successfully retrieved the rows where the value in column y is greater than 5.
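When the number of columns is not known in advance, a hard-coded range of positions can be replaced by an index computed from the data frame itself. The following is a sketch using base R's ncol() and names(); the variable names are illustrative only.

```r
dat <- data.frame(x = c(0, 1, 2, 3, 4, 5), y = c(11, 2, 5, 7, 9, 3))

# Blank column slot: all columns
all_blank <- dat[dat$y > 5, ]

# Column positions computed from the data frame instead of hard-coded
all_by_pos <- dat[dat$y > 5, seq_len(ncol(dat))]

# Column names taken from the data frame
all_by_name <- dat[dat$y > 5, names(dat)]

# Check that the three approaches return the same rows and columns
print(isTRUE(all.equal(all_blank, all_by_pos)))
print(isTRUE(all.equal(all_blank, all_by_name)))
```

Leaving the column slot blank is the shortest form. The computed variants are mainly useful as a starting point when only some of the columns are wanted.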
Summary
Congratulations on reading to the end of this tutorial!
For further reading on R related errors, go to the articles:
- How to Solve R Error in fix.by(by.y, y): ‘by’ must specify a uniquely valid column
- How to Solve R Error: object of type ‘closure’ is not subsettable
- How to Solve R Error missing value where TRUE/FALSE needed
Go to the online courses page on R to learn more about coding in R for data science and machine learning.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.