How to Solve R Error in apply: dim(X) must have a positive length

If you want to call a function on the rows or columns of a data frame or matrix using apply(), you must pass an object that has dimensions, such as a data frame or matrix, as the first argument. If you instead pass a single column extracted with $, which is a plain vector, R raises the error: dim(X) must have a positive length.

You can solve this error by passing the data frame itself as the first argument to apply(), for example:

apply(df, 2, sqrt)

This tutorial will go through the error in detail and how to solve it with code examples.


Apply in R

The apply() function returns a vector, array, or list of values obtained by applying a function to the margins of an array or matrix. The syntax for apply() is as follows:

apply(X, MARGIN, FUN, ...)

Arguments

  • X: an array, including a matrix. A two-dimensional object such as a data frame is coerced to a matrix with as.matrix().
  • MARGIN: a vector giving the subscripts to apply the function over. For a matrix, 1 indicates rows, 2 indicates columns, and c(1, 2) indicates rows and columns. If X has named dimnames, MARGIN can be a character vector selecting dimension names.
  • FUN: the function to apply.
  • ...: Optional arguments to FUN
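
As a minimal sketch of how MARGIN and the ... arguments behave, the following uses a small hypothetical matrix m (it is not part of this tutorial's data and exists only for illustration):

```r
# Hypothetical 2x3 matrix used only for illustration (filled column by column)
m <- matrix(c(1, 4, 9, 16, 25, NA), nrow = 2)

# MARGIN = 1: apply the function to each row
# na.rm = TRUE is forwarded to sum() through the ... argument
row_sums <- apply(m, 1, sum, na.rm = TRUE)

# MARGIN = 2: apply the function to each column
col_sums <- apply(m, 2, sum, na.rm = TRUE)

# MARGIN = c(1, 2): apply the function to every individual element
roots <- apply(m, c(1, 2), sqrt)

print(row_sums)
print(col_sums)
print(roots)
```

With MARGIN = c(1, 2) the result keeps the shape of the input, so roots is again a 2x3 matrix.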

Example

Let’s look at an example of a data frame.

df <- data.frame(veg_sold = c(20, 40, 104, 75, 99, 10, 200),
                 fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
                 cake_sold = c(10, 100, 500, 20, 450, 100, 900))

df
  veg_sold fruit_sold cake_sold
1       20         30        10
2       40         50       100
3      104         80       500
4       75        300        20
5       99        100       450
6       10         23       100
7      200         10       900

We want to calculate the average amount of cake sold. We attempt to use the apply() function to calculate the mean value in the cake_sold column.

apply(df$cake_sold, 2, mean)

We pass 2 as the second argument (MARGIN) to indicate that we want to apply the function over columns. Let’s run the code to see the result:

Error in apply(df$cake_sold, 2, mean) : 
  dim(X) must have a positive length

The error occurs because apply() needs a first argument that has dimensions, such as a matrix, an array, or a data frame. Instead, we have provided df$cake_sold, which is a plain numeric vector. The dim() function is a built-in R function that either sets or returns the dimensions of a matrix, array, or data frame. A vector has no dim attribute, so dim() returns NULL:

dim(df$cake_sold)
NULL

In contrast, the data frame df has dimensions 7x3:

dim(df)
[1] 7 3

This is why the error states that dim(X) must have a positive length: NULL has length 0, whereas the dimensions of df have length 2.
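
To make the cause concrete, here is a short sketch that inspects the dimensions apply() looks at and captures the error with tryCatch() rather than letting it stop the script. It rebuilds the tutorial's data frame so that it runs on its own; the names col_vec and msg are chosen here for illustration only.

```r
# Rebuild the tutorial's data frame so the example is self-contained
df <- data.frame(veg_sold = c(20, 40, 104, 75, 99, 10, 200),
                 fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
                 cake_sold = c(10, 100, 500, 20, 450, 100, 900))

col_vec <- df$cake_sold     # `$` returns a plain numeric vector

# A vector has no dim attribute, so dim() is NULL and has length 0
print(is.null(dim(col_vec)))
print(length(dim(col_vec)))

# A data frame has two dimensions (rows and columns)
print(length(dim(df)))

# Capture the error message instead of halting the script
msg <- tryCatch(
  apply(col_vec, 2, mean),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Checking is.null(dim(x)) before calling apply() is one simple way to detect this situation early.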

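Solution #1: Subset the Data Frame using Single Brackets and c()

We can select the cake_sold column using single-bracket subsetting with a character vector of column names, created here with the c() function. Unlike $, single brackets return a data frame rather than a vector. Let’s look at the revised code: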

df[c('cake_sold')]
dim(df[c('cake_sold')])

The code above subsets the data frame to a one-column data frame containing cake_sold, which has dimensions 7x1:

  cake_sold
1        10
2       100
3       500
4        20
5       450
6       100
7       900

[1] 7 1

We can pass this one-column data frame to the apply() function to calculate the mean amount of cake sold:

apply(df[c('cake_sold')], 2, mean)

Let’s run the code to get the result:

cake_sold 
 297.1429 

If we want to calculate the mean of several specific columns, we can list their names in the character vector created by c():

apply(df[c('cake_sold', 'veg_sold')], 2, mean)

Let’s run the code to see the result:

cake_sold  veg_sold 
297.14286  78.28571 
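
Other base R subsetting forms also keep the two dimensions that apply() needs. The sketch below compares a few of them on the tutorial's data frame. It illustrates standard indexing behaviour and is not part of the original solution; the names a and b are placeholders.

```r
# Rebuild the tutorial's data frame so the example is self-contained
df <- data.frame(veg_sold = c(20, 40, 104, 75, 99, 10, 200),
                 fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
                 cake_sold = c(10, 100, 500, 20, 450, 100, 900))

# Single brackets with a column name keep the data frame structure
a <- apply(df["cake_sold"], 2, mean)

# Matrix-style indexing drops to a vector unless drop = FALSE is given
b <- apply(df[, "cake_sold", drop = FALSE], 2, mean)

# Double brackets, like `$`, return a vector with no dimensions,
# so passing df[["cake_sold"]] to apply() would raise the same error
print(dim(df[["cake_sold"]]))

print(a)
print(b)
```

Both a and b hold the same named result, because in each case apply() receives a 7x1 data frame.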

Solution #2: Use function without apply()

Alternatively, we can use the mean() function and pass df$cake_sold as the argument without using apply(). Let’s look at the revised code:

mean(df$cake_sold)

Let’s run the code to see the result:

[1] 297.1429
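
The same idea extends to several columns. As a hedged sketch, the following compares calling mean() directly with two base R alternatives to apply(), namely sapply() and colMeans(). These are suggestions beyond the original solution, and the variable names are illustrative.

```r
# Rebuild the tutorial's data frame so the example is self-contained
df <- data.frame(veg_sold = c(20, 40, 104, 75, 99, 10, 200),
                 fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
                 cake_sold = c(10, 100, 500, 20, 450, 100, 900))

# A single column is already a vector, so call the function directly
cake_mean <- mean(df$cake_sold)

# sapply() iterates over the columns of a data frame, one vector at a time
means_sapply <- sapply(df[c("cake_sold", "veg_sold")], mean)

# colMeans() computes column means of a data frame or matrix in one call
means_col <- colMeans(df[c("cake_sold", "veg_sold")])

print(cake_mean)
print(means_sapply)
print(means_col)
```

Neither sapply() nor mean() requires a dim attribute on each column, which is why they avoid the error.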

Summary

Congratulations on reading to the end of this tutorial! Generally, this error occurs when you provide a vector as the first argument of the apply function instead of an array or matrix.

Have fun and happy researching!

Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research.