# How to Solve R Error in apply: dim(X) must have a positive length


To call a function on the rows or columns of a data frame or matrix with `apply()`, the first argument must itself be a data frame or matrix. If you instead pass a single column extracted as a vector, R raises the error: dim(X) must have a positive length.

You can solve this error by passing the data frame as the first argument to `apply()`, for example,

`apply(df, 2, sqrt)`

This tutorial will go through the error in detail and how to solve it with code examples.

## Apply in R

The `apply()` function returns a vector, array, or list of values obtained by applying a function to the margins of an array or matrix. The syntax for `apply()` is as follows:

`apply(X, MARGIN, FUN, ...)`

Arguments

• `X`: an array or matrix. A data frame is also accepted and is coerced to a matrix first.
• `MARGIN`: a vector giving the subscripts to apply the function over. For a matrix, 1 indicates rows, 2 indicates columns, and `c(1, 2)` indicates rows and columns. If `X` has named `dimnames`, `MARGIN` can be a character vector selecting dimension names.
• `FUN`: the function to apply.
• `...`: Optional arguments to `FUN`
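As a quick illustration of the `MARGIN` and `...` arguments, here is a minimal sketch on a small matrix. The matrix `m` and its values are made up for this illustration and are not part of the tutorial's data.

```r
# Hypothetical 2x3 matrix, filled column by column, used only for illustration
m <- matrix(c(1, 2, 3, 4, 5, NA), nrow = 2)

# MARGIN = 1: apply sum() to each row
row_sums <- apply(m, 1, sum)

# MARGIN = 2: apply sum() to each column
col_sums <- apply(m, 2, sum)

# Arguments after FUN are passed on to FUN (here na.rm = TRUE goes to sum())
col_sums_no_na <- apply(m, 2, sum, na.rm = TRUE)

print(row_sums)
print(col_sums)
print(col_sums_no_na)
```

The second row and the third column contain the `NA`, so their sums are `NA` unless `na.rm = TRUE` is forwarded to `sum()`.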

## Example

Let’s look at an example of a data frame.

```
df <- data.frame(veg_sold = c(20, 40, 104, 75, 99, 10, 200),
                 fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
                 cake_sold = c(10, 100, 500, 20, 450, 100, 900))

df
```
```
  veg_sold fruit_sold cake_sold
1       20         30        10
2       40         50       100
3      104         80       500
4       75        300        20
5       99        100       450
6       10         23       100
7      200         10       900
```

We want to calculate the average amount of cake sold. We attempt to use the `apply()` function to calculate the mean value in the `cake_sold` column.

`apply(df$cake_sold, 2, mean)`

We specify `2` as the second parameter to indicate we want to apply the function along the column. Let’s run the code to see the result:

```
Error in apply(df$cake_sold, 2, mean) :
  dim(X) must have a positive length
```

The error occurs because `apply()` needs an object with dimensions, such as a matrix or data frame, as its first argument. Instead, we passed a single column extracted with `$`, which is a plain vector. The `dim()` function is a built-in R function that sets or returns the dimensions of a matrix, array, or data frame. A vector has no dimensions, so `dim()` returns `NULL`:

`dim(df$cake_sold)`
`NULL`

Whereas the dimension of the data frame df is `7x3`:

`dim(df)`
`[1] 7 3`

This is why the error states that `dim(X)` must have a positive length: for a vector, `dim()` returns `NULL`, which has length 0.
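The check behind the message can be sketched as follows. This assumes that `apply()` tests the length of `dim(X)` before doing any work, which is what the error text suggests. The data frame is rebuilt here with only the `cake_sold` column so the snippet runs on its own.

```r
# Rebuild the cake_sold column from the example so the snippet is self-contained
df <- data.frame(cake_sold = c(10, 100, 500, 20, 450, 100, 900))

# A column extracted with $ is a plain vector, so dim() is NULL and has length 0
vec_dim_len <- length(dim(df$cake_sold))

# The data frame itself has two dimensions (rows and columns)
df_dim_len <- length(dim(df))

# Capture the error message instead of stopping the script
msg <- tryCatch(
  apply(df$cake_sold, 2, mean),
  error = function(e) conditionMessage(e)
)

print(vec_dim_len)
print(df_dim_len)
print(msg)
```

Checking `length(dim(x)) > 0` before calling `apply()` is one way to catch this case early in your own code.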

### Solution #1: Extract Column using c()

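We can extract the column `cake_sold` as a one-column data frame by indexing with single brackets and a character vector built with the `c()` function. Unlike `$`, this form of subsetting keeps the data frame structure. Let’s look at the revised code: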

```
df[c('cake_sold')]
dim(df[c('cake_sold')])
```

In the above code, we subset the data frame to get the `cake_sold` column, which has a dimension of `7x1`.

```
  cake_sold
1        10
2       100
3       500
4        20
5       450
6       100
7       900

[1] 7 1
```

We can pass this one-column data frame to the `apply()` function to calculate the mean cake sold:

`apply(df[c('cake_sold')], 2, mean)`

Let’s run the code to get the result:

```
cake_sold
 297.1429
```

If we want to calculate the mean of specific columns, we can pass the column names to the `c()` function.

`apply(df[c('cake_sold', 'veg_sold')], 2, mean)`

Let’s run the code to see the result:

```
cake_sold  veg_sold
297.14286  78.28571
```
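A related way to keep the data frame structure is the `drop = FALSE` argument of `[`, sketched below on the same data. This is an alternative to the `c()` form above, not something the original example uses; the variable names `one_col` and `cake_mean` are chosen only for this illustration.

```r
df <- data.frame(veg_sold = c(20, 40, 104, 75, 99, 10, 200),
                 fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
                 cake_sold = c(10, 100, 500, 20, 450, 100, 900))

# With row and column indices, selecting one column drops to a vector by default
dropped_dim <- dim(df[, "cake_sold"])

# drop = FALSE keeps a 7x1 data frame, so apply() has dimensions to work with
one_col <- df[, "cake_sold", drop = FALSE]
kept_dim <- dim(one_col)

cake_mean <- apply(one_col, 2, mean)

print(dropped_dim)
print(kept_dim)
print(cake_mean)
```

Both `df[c('cake_sold')]` and `df[, "cake_sold", drop = FALSE]` return a data frame, so either can be passed to `apply()`.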

### Solution #2: Use function without apply()

Alternatively, we can use the `mean()` function and pass `df$cake_sold` as the argument without using `apply()`. Let’s look at the revised code:

`mean(df$cake_sold)`

Let’s run the code to see the result:

`[1] 297.1429`
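When the goal is the mean of several columns, base R also provides `colMeans()` and `sapply()`, neither of which needs `apply()`. The following is a minimal sketch on the same data. These functions are not used elsewhere in this tutorial, and the names `col_means` and `all_means` are chosen only for this illustration.

```r
df <- data.frame(veg_sold = c(20, 40, 104, 75, 99, 10, 200),
                 fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
                 cake_sold = c(10, 100, 500, 20, 450, 100, 900))

# colMeans() computes the mean of each column of a numeric data frame or matrix
col_means <- colMeans(df[c("cake_sold", "veg_sold")])

# sapply() calls mean() on each column of the data frame in turn
all_means <- sapply(df, mean)

print(col_means)
print(all_means)
```

Both calls return a named numeric vector with one mean per column.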

## Summary

Congratulations on reading to the end of this tutorial! Generally, this error occurs when you provide a vector as the first argument of the `apply()` function instead of an array, matrix, or data frame.


Have fun and happy researching!

##### Suf

Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research.