*The pipe operator, `%>%`, is provided by the `magrittr` package and passes the result of one expression as the first argument of the next function, so a sequence of calls reads from left to right. If you use `%>%` without loading a package that provides it, R raises the error `could not find function "%>%"`. To use the pipe operator, install and load the `magrittr` package:*

    install.packages("magrittr")
    library(magrittr)

*This tutorial goes through how to solve the error with code examples.*

## Example

The pipe operator allows us to pass an intermediate result onto the next function. Consider the following example of calling multiple mathematical functions on a numerical vector.

    x <- c(0.1, 0.3, 0.6, 0.9, 0.5, 0.1, 0.01, 0.8, 0.9)
    x %>% log() %>% diff() %>% exp() %>% round(1)

In the above code, we compute the natural logarithm of each element of `x` and pass the resulting vector to `diff()`, which returns the differences between consecutive elements. We pass those differences to `exp()` and then round the result to one decimal place.
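For comparison, the same computation can be written without the pipe as nested calls, which have to be read from the inside out. This is only an illustrative sketch that uses base R alone; the variable name `ratios` is introduced here and is not part of the original example.

```r
x <- c(0.1, 0.3, 0.6, 0.9, 0.5, 0.1, 0.01, 0.8, 0.9)

# Nested form of: x %>% log() %>% diff() %>% exp() %>% round(1)
# Evaluation order is innermost first: log(), then diff(), exp(), round()
ratios <- round(exp(diff(log(x))), 1)
print(ratios)
```

Because `exp(log(a) - log(b))` equals `a / b`, each value is the ratio of an element of `x` to the element before it.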

Let’s run the code to see what happens:

    Error in x %>% log() %>% diff() %>% exp() %>% round(1) :
      could not find function "%>%"

The error occurs because we did not load the `magrittr` package, which provides the pipe operator.
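One way to confirm this diagnosis before changing anything is to ask R whether `%>%` is visible in the current session. The sketch below uses only base R, and the variable names are illustrative.

```r
# TRUE if a function named %>% is visible from the current session
# (for example because magrittr or dplyr is attached)
pipe_available <- exists("%>%", mode = "function")

# TRUE only if magrittr itself is attached to the search path
magrittr_attached <- "package:magrittr" %in% search()

print(c(pipe_available = pipe_available,
        magrittr_attached = magrittr_attached))
```

If `pipe_available` is `FALSE`, any expression that uses `%>%` will fail with the error shown above.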

### Solution

We can install and load the `magrittr` package as follows. Installation is only needed once, whereas `library(magrittr)` must be run in every new R session.

    install.packages("magrittr")
    library(magrittr)
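If the script will be run repeatedly, reinstalling the package on every run is unnecessary. A minimal sketch of a guard is shown below; `ensure_package()` is a hypothetical helper written for this tutorial, not a function from `magrittr` or base R.

```r
# Hypothetical helper: install a package only if it is not already installed.
# Returns (invisibly) TRUE if the package can be loaded afterwards.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}
```

With this in place, calling `ensure_package("magrittr")` followed by `library(magrittr)` installs the package only when it is missing.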

Then we can perform the series of computations on the numerical vector and see the result.

    x <- c(0.1, 0.3, 0.6, 0.9, 0.5, 0.1, 0.01, 0.8, 0.9)
    x %>% log() %>% diff() %>% exp() %>% round(1)

    [1]  3.0  2.0  1.5  0.6  0.2  0.1 80.0  1.1

## Example #2

Let’s look at a second example where we want to perform some data manipulation on the `mtcars` dataset. Specifically, we want to keep only the cars with more than one carburettor (`carb > 1`), group them by number of cylinders (`cyl`), and then create a new data frame with the average miles per gallon (`mpg`) for each number of cylinders.

    mtcars %>%
      filter(carb > 1) %>%
      group_by(cyl) %>%
      summarize(Avg_mpg = mean(mpg))

Let’s run the code to see what happens:

    Error in mtcars %>% filter(carb > 1) %>% group_by(cyl) %>% summarize(Avg_mpg = mean(mpg)) :
      could not find function "%>%"

We can solve the pipe operator error by loading the `magrittr` package. Let’s look at the revised code:

    library(magrittr)
    mtcars %>%
      filter(carb > 1) %>%
      group_by(cyl) %>%
      summarize(Avg_mpg = mean(mpg))

Let’s run the code to see what happens:

    Error in summarize(., Avg_mpg = mean(mpg)) :
      could not find function "summarize"

The pipe operator error disappears, but we have a new error stating that R could not find the function `summarize`.
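To see which of the functions in the pipeline R can currently find, a quick base-R check along these lines may help. The names `verbs` and `found` are illustrative, and the result depends on which packages are attached in your session.

```r
verbs <- c("filter", "group_by", "summarize")

# For each name, check whether a function with that name is currently visible
found <- vapply(verbs, function(v) exists(v, mode = "function"), logical(1))
print(found)
```

In a session without `dplyr` attached, `filter` is typically still reported as found because the `stats` package, which R attaches by default, has its own `filter()`, while `group_by` and `summarize` are not found.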

### Solution #1: Load dplyr

We can solve both the missing pipe operator and the missing `summarize` function errors by installing and loading the `dplyr` package, which re-exports `%>%` from `magrittr`. The `dplyr` package provides a consistent set of verbs to help solve the most common data manipulation challenges. One of the verbs that `dplyr` provides is `summarize`/`summarise`. Let’s look at the revised code:

    install.packages("dplyr")
    library(dplyr)
    mtcars %>%
      filter(carb > 1) %>%
      group_by(cyl) %>%
      summarize(Avg_mpg = mean(mpg))

Let’s run the code to see the resultant data frame.

    # A tibble: 3 × 2
        cyl Avg_mpg
      <dbl>   <dbl>
    1     4    25.9
    2     6    19.7
    3     8    15.1
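As a sanity check on these numbers, the same summary can be reproduced with base R alone, with no pipe and no `dplyr`. This is a sketch for cross-checking rather than a replacement for the `dplyr` pipeline; the names `multi_carb` and `avg_mpg` are introduced here for illustration.

```r
# Keep cars with more than one carburettor
multi_carb <- subset(mtcars, carb > 1)

# Average mpg for each number of cylinders
avg_mpg <- aggregate(mpg ~ cyl, data = multi_carb, FUN = mean)
names(avg_mpg)[2] <- "Avg_mpg"

print(avg_mpg)
```

The result is a plain data frame rather than a tibble, so it prints with more decimal places, but the averages should agree with the output above after rounding.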

### Solution #2: Load tidyverse

We can also solve this error by installing and loading `tidyverse`, which provides a set of packages for data science, including `dplyr` and `magrittr`. Loading `tidyverse` is preferable if the code’s focus is to manipulate, explore and visualize data.

Let’s look at the revised code:

    install.packages("tidyverse")
    library(tidyverse)
    mtcars %>%
      filter(carb > 1) %>%
      group_by(cyl) %>%
      summarize(Avg_mpg = mean(mpg))

Let’s run the code to see the result:

    ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
    ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
    ✔ tibble  3.1.6     ✔ dplyr   1.0.9
    ✔ tidyr   1.2.0     ✔ stringr 1.4.0
    ✔ readr   2.1.2     ✔ forcats 0.5.1
    ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
    ✖ dplyr::filter() masks stats::filter()
    ✖ dplyr::lag()    masks stats::lag()

    # A tibble: 3 × 2
        cyl Avg_mpg
      <dbl>   <dbl>
    1     4    25.9
    2     6    19.7
    3     8    15.1

We successfully retrieved the data frame containing the average miles per gallon for each number of cylinders.

Note that when we load `tidyverse`, R tells us which packages it is attaching and which conflicts arise. Here, the `dplyr` functions `filter()` and `lag()` mask the `stats` functions of the same names.
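Masking only changes which function an unqualified name such as `filter` refers to; both versions remain reachable through the `package::function` syntax. The sketch below illustrates the difference. The `dplyr` call is guarded so the snippet still runs if `dplyr` is not installed, and the variable names are illustrative.

```r
# stats::filter() applies a linear filter to a numeric series,
# here a centred 3-point moving average
moving_avg <- stats::filter(1:5, rep(1 / 3, 3))
print(moving_avg)

# dplyr::filter() keeps the rows of a data frame that match a condition
if (requireNamespace("dplyr", quietly = TRUE)) {
  multi_carb <- dplyr::filter(mtcars, carb > 1)
  print(nrow(multi_carb))
}
```

Qualifying the call this way makes it explicit which `filter()` is meant, regardless of the order in which packages were loaded.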

## Summary

Congratulations on reading to the end of this tutorial!

Have fun and happy researching!