*If you want to apply a function to the rows or columns of a data frame or matrix using apply(), you must pass an object with dimensions (a data frame, matrix, or array) as the first argument. If you pass a single column extracted as a vector, you will raise the error: dim(X) must have a positive length.*

*You can solve this error by passing the data frame as the first argument to apply(), for example,*

`apply(df, 2, sqrt)`

*This tutorial will go through the error in detail and how to solve it with code examples.*

## Apply in R

The `apply()` function returns a vector, array, or list of values obtained by applying a function to the margins of an array or matrix. The syntax for `apply()` is as follows:

    apply(X, MARGIN, FUN, ...)

**Arguments**

- `X`: an array or matrix.
- `MARGIN`: a vector giving the subscripts to apply the function over. For a matrix, 1 indicates rows, 2 indicates columns, and `c(1, 2)` indicates rows and columns. If `X` has named `dimnames`, `MARGIN` can be a character vector selecting dimension names.
- `FUN`: the function to apply.
- `...`: optional arguments to `FUN`.
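As a minimal sketch of how these arguments fit together, the snippet below applies `sum()` over rows and columns of a small matrix and forwards an extra argument to `mean()`. The matrices `m` and `m_na` are made up for illustration and are not part of the tutorial's data.

```r
# A small 2 x 3 matrix used only for illustration.
m <- matrix(1:6, nrow = 2)

# MARGIN = 1: apply sum() to each row.
row_totals <- apply(m, 1, sum)
print(row_totals)

# MARGIN = 2: apply sum() to each column.
col_totals <- apply(m, 2, sum)
print(col_totals)

# Arguments after FUN are forwarded to it; here na.rm = TRUE is passed to mean().
m_na <- matrix(c(1, NA, 3, 4), nrow = 2)
col_means <- apply(m_na, 2, mean, na.rm = TRUE)
print(col_means)
```

The row totals should be 9 and 12, the column totals 3, 7, and 11, and the column means 1 and 3.5.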

## Example

Let’s look at an example of a data frame.

    df <- data.frame(veg_sold = c(20, 40, 104, 75, 99, 10, 200),
                     fruit_sold = c(30, 50, 80, 300, 100, 23, 10),
                     cake_sold = c(10, 100, 500, 20, 450, 100, 900))
    df

      veg_sold fruit_sold cake_sold
    1       20         30        10
    2       40         50       100
    3      104         80       500
    4       75        300        20
    5       99        100       450
    6       10         23       100
    7      200         10       900

We want to calculate the average amount of cake sold. We attempt to use the `apply()` function to calculate the mean value in the `cake_sold` column.

    apply(df$cake_sold, 2, mean)

We specify `2` as the second parameter to indicate we want to apply the function along the column. Let’s run the code to see the result:

    Error in apply(df$cake_sold, 2, mean) : dim(X) must have a positive length

The error occurs because `apply()` needs an object with dimensions, such as a matrix, an array, or a data frame, as its first argument. Instead, we passed `df$cake_sold`, which is a plain numeric vector. The `dim()` function is a built-in R function that either sets or returns the dimensions of a matrix, array, or data frame. For a vector extracted with `$`, it returns `NULL`:

    dim(df$cake_sold)

    NULL

Whereas the dimension of the data frame `df` is `7x3`:

    dim(df)

    [1] 7 3

This is why the error states that `dim(X)` must have a positive length: `apply()` checks `length(dim(X))`, which is 0 when `dim(X)` is `NULL`.
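The check can be reproduced without halting a script. The sketch below rebuilds only the `cake_sold` column from the example and wraps the failing call in `tryCatch()` so the message is captured as a string. The variable names `col` and `msg` are illustrative.

```r
# Rebuild just the column needed for this illustration.
df <- data.frame(cake_sold = c(10, 100, 500, 20, 450, 100, 900))

# `$` extraction returns a plain numeric vector, so dim() is NULL
# and length(dim()) is 0.
col <- df$cake_sold
print(is.null(dim(col)))
print(length(dim(col)))

# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(
  apply(col, 2, mean),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Inspecting `dim()` on the first argument before calling `apply()` is a quick way to confirm whether this error is about to occur.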

## Solution #1: Extract Column using c()

We can select the column `cake_sold` with single brackets and a character vector built with the `c()` function. Unlike `$`, this returns a one-column data frame rather than a vector. Let’s look at the revised code:

    df[c('cake_sold')]
    dim(df[c('cake_sold')])

In the above code, we subset the data frame to get the `cake_sold` column, which has a dimension of `7x1`.

      cake_sold
    1        10
    2       100
    3       500
    4        20
    5       450
    6       100
    7       900
    [1] 7 1

We can pass this one-column data frame to the `apply()` function to calculate the mean cake sold:

    apply(df[c('cake_sold')], 2, mean)

Let’s run the code to get the result:

    cake_sold 
     297.1429 

If we want to calculate the mean of several specific columns, we can list their names in the `c()` call.

    apply(df[c('cake_sold', 'veg_sold')], 2, mean)

Let’s run the code to see the result:

    cake_sold  veg_sold 
    297.14286  78.28571 
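A related point, sketched here as an aside rather than as part of the original solution: matrix-style indexing such as `df[, "cake_sold"]` collapses a single column to a vector by default, which would trigger the same error. Passing `drop = FALSE` keeps the dimensions. The names `keep_a`, `keep_b`, and `dropped` are illustrative.

```r
# Two columns from the example data frame, rebuilt so the snippet is self-contained.
df <- data.frame(
  veg_sold  = c(20, 40, 104, 75, 99, 10, 200),
  cake_sold = c(10, 100, 500, 20, 450, 100, 900)
)

# Single brackets with a column name keep the data frame structure.
keep_a <- df["cake_sold"]

# Row/column indexing needs drop = FALSE to avoid collapsing to a vector.
keep_b <- df[, "cake_sold", drop = FALSE]

# Without drop = FALSE the result is a plain vector again.
dropped <- df[, "cake_sold"]

print(dim(keep_a))
print(dim(keep_b))
print(dim(dropped))

# Either dimension-preserving form can be passed to apply().
cake_mean <- apply(keep_b, 2, mean)
print(cake_mean)
```

Both `keep_a` and `keep_b` have dimensions 7 by 1, while `dim(dropped)` is `NULL`.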

## Solution #2: Use function without apply()

Alternatively, we can use the `mean()` function and pass `df$cake_sold` as the argument without using `apply()`. Let’s look at the revised code:

    mean(df$cake_sold)

Let’s run the code to see the result:

    [1] 297.1429

## Summary

Congratulations on reading to the end of this tutorial! Generally, this error occurs when you provide a vector as the first argument of the `apply()` function instead of an array, matrix, or data frame.

For further reading on R-related errors, go to the articles:

- How to Solve R Error: $ operator is invalid for atomic vectors
- How to Solve R Error: Subscript out of bounds

Go to the online courses page on R to learn more about coding in R for data science and machine learning.

Have fun and happy researching!