How to Solve R Error in apply: dim(X) must have a positive length

If you want to call a function on the rows or columns of a data frame or matrix using apply(), the first argument must be a data frame or matrix. If you pass a single column extracted as a vector instead, R raises the error: dim(X) must have a positive length.

You can solve this error by passing the data frame as the first argument to apply(), for example,

apply(df, 2, sqrt)

This tutorial will go through the error in detail and how to solve it with code examples.

Apply in R

The apply() function returns a vector, array, or list of values obtained by applying a function to the margins of an array or matrix. The syntax for apply() is as follows:

apply(X, MARGIN, FUN, ...)

Arguments

X: an array, including a matrix. A data frame is also accepted; apply() coerces it to a matrix with as.matrix().
MARGIN: a vector giving the subscripts to apply the function over. For a matrix, 1 indicates rows, 2 indicates columns, and c(1, 2) indicates rows and columns. If X has named dimnames, MARGIN can be a character vector selecting dimension names.
FUN: the function to apply.
...: Optional arguments to FUN
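
As a minimal sketch of how these arguments fit together, the following applies sum() and mean() over the rows and columns of a small matrix. The matrix m and the variable names are made up for illustration and are not part of the example used later in this tutorial.

```r
# A small 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

# MARGIN = 1: apply sum() to each row
row_sums <- apply(m, 1, sum)
print(row_sums)   # 9 12

# MARGIN = 2: apply sum() to each column
col_sums <- apply(m, 2, sum)
print(col_sums)   # 3 7 11

# Arguments after FUN are passed on to FUN (here na.rm = TRUE)
m_na <- m
m_na[1, 1] <- NA
col_means <- apply(m_na, 2, mean, na.rm = TRUE)
print(col_means)  # 2.0 3.5 5.5
```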

Example

Let’s look at an example of a data frame.

df <- data.frame(veg_sold = c(20, 40, 104, 75, 99, 10, 200),
                 fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
                 cake_sold = c(10, 100, 500, 20, 450, 100, 900))

df

  veg_sold fruit_sold cake_sold
1       20         30        10
2       40         50       100
3      104         80       500
4       75        300        20
5       99        100       450
6       10         23       100
7      200         10       900

We want to calculate the average amount of cake sold. We attempt to use the apply() function to calculate the mean value in the cake_sold column.

apply(df$cake_sold, 2, mean)

We pass 2 as the MARGIN argument to indicate that we want to apply the function over columns. Let’s run the code to see the result:

Error in apply(df$cake_sold, 2, mean) : 
  dim(X) must have a positive length

The error occurs because apply() expects an object with dimensions, such as a data frame or a matrix, as its first argument. Instead, we have provided a single column extracted with $, which is a plain numeric vector. The dim() function is a built-in R function that either sets or returns the dimensions of a matrix, array or data frame. For a vector such as df$cake_sold, dim() returns NULL:

dim(df$cake_sold)

NULL

In contrast, the data frame df has dimensions 7x3:

dim(df)

[1] 7 3

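This is why the error states that dim(X) must have a positive length: the length of NULL is 0, so a vector fails the check that apply() performs on its first argument.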
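
The check can be reproduced on a standalone vector. This is a small sketch, assuming a vector x that holds the same values as df$cake_sold; tryCatch() is used only so that the error message can be captured instead of stopping the script.

```r
# A plain numeric vector with the same values as df$cake_sold
x <- c(10, 100, 500, 20, 450, 100, 900)

no_dim <- is.null(dim(x))    # TRUE: a vector has no dim attribute
dim_len <- length(dim(x))    # 0: this is the length that apply() rejects

# Capture the error message rather than halting execution
msg <- tryCatch(apply(x, 2, mean),
                error = function(e) conditionMessage(e))
print(msg)
```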

Solution #1: Extract Column using c()

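We can extract the column cake_sold by subsetting the data frame with single brackets and a vector of column names built with the c() function. Single-bracket subsetting returns a data frame rather than a vector, so the result keeps its dimensions. Let’s look at the revised code: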

df[c('cake_sold')]
dim(df[c('cake_sold')])

In the above code, we subset the data frame to get the cake_sold column, which has a dimension of 7x1.

  cake_sold
1        10
2       100
3       500
4        20
5       450
6       100
7       900

[1] 7 1

We can pass this one-column data frame to the apply() function to calculate the mean cake sold:

apply(df[c('cake_sold')], 2, mean)

Let’s run the code to get the result:

cake_sold 
 297.1429

If we want to calculate the mean of several specific columns, we can list their names inside c() when subsetting the data frame.

apply(df[c('cake_sold', 'veg_sold')], 2, mean)

Let’s run the code to see the result:

cake_sold  veg_sold 
297.14286  78.28571
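
The same problem arises when selecting one column of a matrix, because indexing a single column drops the result to a vector by default. As a hedged sketch, assuming a matrix m built from two of the columns above, passing drop = FALSE when indexing keeps the result as a one-column matrix that apply() accepts.

```r
# A matrix with two of the columns from the example data
m <- cbind(veg_sold  = c(20, 40, 104, 75, 99, 10, 200),
           cake_sold = c(10, 100, 500, 20, 450, 100, 900))

dim_dropped <- dim(m[, "cake_sold"])                # NULL: a plain vector
dim_kept <- dim(m[, "cake_sold", drop = FALSE])     # 7 1: still a matrix

# The one-column matrix can be passed to apply()
cake_mean <- apply(m[, "cake_sold", drop = FALSE], 2, mean)
print(cake_mean)
```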

Solution #2: Use function without apply()

Alternatively, we can use the mean() function and pass df$cake_sold as the argument without using apply(). Let’s look at the revised code:

mean(df$cake_sold)

Let’s run the code to see the result:

[1] 297.1429
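
As a quick sanity check, the two solutions can be compared directly. This sketch rebuilds the example data frame so that it runs on its own; the names via_apply and via_mean are introduced here only for illustration.

```r
df <- data.frame(veg_sold = c(20, 40, 104, 75, 99, 10, 200),
                 fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
                 cake_sold = c(10, 100, 500, 20, 450, 100, 900))

# Solution #1: apply() on a one-column data frame (returns a named vector)
via_apply <- apply(df["cake_sold"], 2, mean)

# Solution #2: mean() directly on the column vector
via_mean <- mean(df$cake_sold)

# Drop the name before comparing the two values
same <- isTRUE(all.equal(unname(via_apply), via_mean))
print(same)   # TRUE
```

For a single column, calling the function directly is the simpler option; apply() is mainly useful when the same function has to run over several rows or columns.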

Summary

Congratulations on reading to the end of this tutorial! Generally, this error occurs when you provide a vector as the first argument of the apply function instead of an array or matrix.

Have fun and happy researching!