If you want to call a function on a data frame or matrix column using apply(), you must pass a data frame or matrix as the first argument. If you pass a single column extracted as a vector (for example with $), R raises the error: dim(X) must have a positive length.
You can solve this error by passing the data frame itself as the first argument to apply(), for example,
apply(df, 2, sqrt)
This tutorial will go through the error in detail and how to solve it with code examples.
Apply in R
The apply() function returns a vector, array, or list of values obtained by applying a function to the margins of an array or matrix. The syntax for apply() is as follows:
apply(X, MARGIN, FUN, ...)
Arguments
- X: an array or matrix.
- MARGIN: a vector giving the subscripts to apply the function over. For a matrix, 1 indicates rows, 2 indicates columns, and c(1, 2) indicates rows and columns. If X has named dimnames, MARGIN can be a character vector selecting dimension names.
- FUN: the function to apply.
- ...: optional arguments to FUN.
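As a minimal sketch of how MARGIN behaves, the example below applies a function over the rows, the columns, and the individual cells of a small matrix. The matrix m and the result names are made up for illustration and are not part of the tutorial's data.

```r
# Hypothetical 2x3 matrix used only for illustration
m <- matrix(1:6, nrow = 2, ncol = 3)

# MARGIN = 1: apply sum() to each row
row_sums <- apply(m, 1, sum)

# MARGIN = 2: apply sum() to each column
col_sums <- apply(m, 2, sum)

# MARGIN = c(1, 2): apply the function to each individual cell
doubled <- apply(m, c(1, 2), function(x) x * 2)

print(row_sums)
print(col_sums)
print(doubled)
```

Because m has two dimensions, all three calls succeed; the error discussed in this tutorial only appears when the first argument has no dimensions at all.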
Example
Let’s look at an example of a data frame.
df <- data.frame(veg_sold = c(20, 40, 104, 75, 99, 10, 200),
                 fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
                 cake_sold = c(10, 100, 500, 20, 450, 100, 900))
df
  veg_sold fruit_sold cake_sold
1       20         30        10
2       40         50       100
3      104         80       500
4       75        300        20
5       99        100       450
6       10         23       100
7      200         10       900
We want to calculate the average amount of cake sold. We attempt to use the apply() function to calculate the mean value in the cake_sold column.
apply(df$cake_sold, 2, mean)
We specify 2 as the second argument to indicate that we want to apply the function along the columns. Let’s run the code to see the result:
Error in apply(df$cake_sold, 2, mean) : dim(X) must have a positive length
The error occurs because apply() expects an object with dimensions, such as a data frame or a matrix, as its first argument. Instead, we have provided a single column, which df$cake_sold returns as a plain vector. The dim() function is a built-in R function that either sets or returns the dimensions of a matrix, array or data frame. For a vector, it returns NULL:
dim(df$cake_sold)
NULL
In contrast, the data frame df has dimensions 7x3:
dim(df)
[1] 7 3
This is why the error states that dim(X) must have a positive length: dim() returns NULL for the vector, and NULL has length 0.
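The check can be reproduced directly. The following is a minimal sketch that uses the same cake_sold values as above; the names col and msg are introduced here only for illustration, and tryCatch() is used so the error message is captured rather than stopping the script.

```r
# Same cake_sold values as in the tutorial's data frame
df <- data.frame(cake_sold = c(10, 100, 500, 20, 450, 100, 900))

# A single column extracted with $ is a plain numeric vector
col <- df$cake_sold
is.null(dim(col))   # vectors carry no dim attribute
length(dim(col))    # length of NULL, which apply() rejects

# Capture the error message instead of halting execution
msg <- tryCatch(
  apply(col, 2, mean),
  error = function(e) conditionMessage(e)
)
print(msg)
```

The exact wording of the captured message may vary with the R locale.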
Solution #1: Extract Column using c()
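We can extract the column cake_sold by subsetting the data frame with single brackets and passing the column name inside c(). Unlike the $ operator, this keeps the result as a data frame. Let’s look at the revised code: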
df[c('cake_sold')]
dim(df[c('cake_sold')])
In the above code, we subset the data frame to get the cake_sold column as a one-column data frame, which has dimensions 7x1.
  cake_sold
1        10
2       100
3       500
4        20
5       450
6       100
7       900
[1] 7 1
We can pass this one-column data frame to the apply() function to calculate the mean cake sold:
apply(df[c('cake_sold')], 2, mean)
Let’s run the code to get the result:
cake_sold
 297.1429
If we want to calculate the mean of specific columns, we can pass the column names to the c() function.
apply(df[c('cake_sold', 'veg_sold')], 2, mean)
Let’s run the code to see the result:
cake_sold  veg_sold
297.14286  78.28571
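What matters here is that the subset keeps two dimensions, not the call to c() itself. As a sketch of equivalent ways to write this in base R, the example below compares single-bracket subsetting with matrix-style indexing, with and without drop = FALSE. The names one_col, kept, dropped and result are illustrative only.

```r
df <- data.frame(
  veg_sold = c(20, 40, 104, 75, 99, 10, 200),
  fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
  cake_sold = c(10, 100, 500, 20, 450, 100, 900)
)

# Single brackets with a column name keep the data frame structure
one_col <- df["cake_sold"]
dim(one_col)

# Matrix-style indexing collapses to a vector unless drop = FALSE is given
kept <- df[, "cake_sold", drop = FALSE]
dropped <- df[, "cake_sold"]
dim(kept)
dim(dropped)

# The two-dimensional subset works with apply()
result <- apply(kept, 2, mean)
print(result)
```

Passing dropped to apply() would raise the same error as df$cake_sold, since both are plain vectors.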
Solution #2: Use function without apply()
Alternatively, we can use the mean() function and pass df$cake_sold as the argument without using apply(). Let’s look at the revised code:
mean(df$cake_sold)
Let’s run the code to see the result:
[1] 297.1429
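If the goal is the mean of every column rather than just one, base R offers other options that also avoid apply(). The sketch below uses colMeans() and sapply() on the same data; it is one possible approach rather than the only one, and the result names are made up for illustration.

```r
df <- data.frame(
  veg_sold = c(20, 40, 104, 75, 99, 10, 200),
  fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
  cake_sold = c(10, 100, 500, 20, 450, 100, 900)
)

# colMeans() computes the mean of each column of a numeric data frame or matrix
means_colmeans <- colMeans(df)

# sapply() treats the data frame as a list of columns and applies mean() to each
means_sapply <- sapply(df, mean)

print(means_colmeans)
print(means_sapply)
```

Both calls return a named numeric vector with one entry per column.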
Summary
Congratulations on reading to the end of this tutorial! Generally, this error occurs when you provide a vector as the first argument of the apply function instead of an array or matrix.
For further reading on R-related errors, go to the articles:
- How to Solve R Error: $ operator is invalid for atomic vectors
- How to Solve R Error: Subscript out of bounds
Go to the online courses page on R to learn more about coding in R for data science and machine learning.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.