How to Solve R Error: Could not find function “%>%”


The pipe operator, %>%, is provided by the magrittr package. It passes the result of the expression on its left as the first argument to the function on its right, so a sequence of operations can be written in the order in which they run. To use the pipe operator, you need to install the magrittr package once and then load it in every R session that uses %>%.

install.packages("magrittr")
library(magrittr)
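
Because install.packages() downloads the package every time it runs, a script that is run repeatedly might guard the installation. The following is a minimal sketch, assuming an internet connection and a configured CRAN mirror are available if the package is missing:

```r
# Install magrittr only when it is not already installed.
if (!requireNamespace("magrittr", quietly = TRUE)) {
  install.packages("magrittr")
}

# Attach the package so that %>% is visible in the current session.
library(magrittr)
```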

This tutorial will go through how to solve the error with a code example.


Example

The pipe operator allows us to pass an intermediate result onto the next function. Consider the following example of calling multiple mathematical functions on a numerical vector.

x <- c(0.1, 0.3, 0.6, 0.9, 0.5, 0.1, 0.01, 0.8, 0.9)

x %>% log() %>%
    diff() %>%
    exp() %>%
    round(1)

In the above code, we compute the natural logarithm of x and then pass the resultant vector to the diff() function. The diff() function returns the differences between consecutive elements of the vector, and we pass that result to the exp() function. We then round the result of the exponential function call to one decimal place.

Let’s run the code to see what happens:

Error in x %>% log() %>% diff() %>% exp() %>% round(1) : 
  could not find function "%>%"

The error occurs because we did not load the magrittr package, which provides the pipe operator.
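
A quick way to confirm this diagnosis, sketched here with base R only, is to ask R whether the operator is currently visible in the session. The variable name pipe_available is only illustrative.

```r
# exists() looks in the current environment and in all attached packages.
pipe_available <- exists("%>%", mode = "function")

if (pipe_available) {
  message('"%>%" is available in this session.')
} else {
  message('"%>%" is not available; attach a package such as magrittr first.')
}
```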

Solution

We can install and load the magrittr package as follows.

install.packages("magrittr")
library(magrittr)

Then we can perform the series of computations on the numerical vector and see the result.

x <- c(0.1, 0.3, 0.6, 0.9, 0.5, 0.1, 0.01, 0.8, 0.9)

x %>% log() %>%
    diff() %>%
    exp() %>%
    round(1)
[1]  3.0  2.0  1.5  0.6  0.2  0.1 80.0  1.1
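
To see what the pipe is doing, the same computation can be sketched with ordinary nested calls in base R, which needs no extra package. The variable name nested_result is only illustrative.

```r
x <- c(0.1, 0.3, 0.6, 0.9, 0.5, 0.1, 0.01, 0.8, 0.9)

# Read from the inside out: log(), then diff(), then exp(), then round().
nested_result <- round(exp(diff(log(x))), 1)
nested_result
```

Each element is the ratio of two consecutive values of x, because exp(log(b) - log(a)) equals b / a. The piped version expresses the same steps from left to right instead of from the inside out.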

Example #2

Let’s look at a second example where we want to perform some data manipulation on the mtcars dataset. Specifically, we want to keep only the cars with more than one carburettor (carb > 1), then group by number of cylinders (cyl), then create a new data frame with the average miles per gallon (mpg) for each number of cylinders.

mtcars %>%
     filter(carb > 1) %>%
     group_by(cyl) %>%
     summarize(Avg_mpg = mean(mpg))

Let’s run the code to see what happens:

Error in mtcars %>% filter(carb > 1) %>% group_by(cyl) %>% summarize(Avg_mpg = mean(mpg)) : 
  could not find function "%>%"

We can solve the pipe operator error by loading the magrittr package. Let’s look at the revised code:

library(magrittr)
mtcars %>%
     filter(carb > 1) %>%
     group_by(cyl) %>%
     summarize(Avg_mpg = mean(mpg))

Let’s run the code to see what happens:

Error in summarize(., Avg_mpg = mean(mpg)) : 
  could not find function "summarize"

The pipe operator error disappears, but we have a new error stating that R could not find the function “summarize”.
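
The new error is expected, because magrittr provides the pipe but not the data manipulation functions. One way to check this, sketched below on the assumption that magrittr is installed, is to inspect the names the package exports. The variable name magrittr_exports is only illustrative.

```r
# List everything that magrittr exports (assumes magrittr is installed).
magrittr_exports <- getNamespaceExports("magrittr")

"%>%" %in% magrittr_exports        # TRUE: the pipe comes from magrittr
"summarize" %in% magrittr_exports  # FALSE: summarize must come from elsewhere
```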

Solution #1: Load dplyr

We can solve both the missing pipe operator error and the missing summarize function error by installing and loading dplyr. The dplyr package provides a consistent set of verbs to help solve the most common data manipulation challenges, including filter, group_by, and summarize/summarise. It also re-exports the pipe operator from magrittr, so loading dplyr alone makes %>% available. Let’s look at the revised code:

install.packages("dplyr")
library(dplyr)

mtcars %>%
     filter(carb > 1) %>%
     group_by(cyl) %>%
     summarize(Avg_mpg = mean(mpg))

Let’s run the code to see the resultant data frame.

# A tibble: 3 × 2
    cyl Avg_mpg
  <dbl>   <dbl>
1     4    25.9
2     6    19.7
3     8    15.1
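
As a sanity check, the same averages can be reproduced without dplyr or the pipe. The following base R sketch is not part of the original solution; it uses subset() and aggregate(), and the variable names multi_carb and avg_mpg are only illustrative.

```r
# Keep cars with more than one carburettor.
multi_carb <- subset(mtcars, carb > 1)

# Average mpg for each number of cylinders, rounded to one decimal place.
avg_mpg <- aggregate(mpg ~ cyl, data = multi_carb, FUN = mean)
avg_mpg$mpg <- round(avg_mpg$mpg, 1)
avg_mpg
```

The result is an ordinary data frame rather than a tibble, with one row per number of cylinders.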

Solution #2: Load tidyverse

We can also solve this error by installing and loading tidyverse, which provides a set of packages for data science, including dplyr and magrittr. Loading tidyverse attaches its core packages, such as dplyr, which make the pipe operator available. Loading tidyverse is preferable if the code’s focus is to manipulate, explore and visualize data.

Let’s look at the revised code:

install.packages("tidyverse")

library(tidyverse)

mtcars %>%
     filter(carb > 1) %>%
     group_by(cyl) %>%
     summarize(Avg_mpg = mean(mpg))

Let’s run the code to see the result:

── Attaching packages ─────────────────────────────────────────────── tidyverse 1.3.1 ──
✔ ggplot2 3.3.6     ✔ purrr   0.3.4
✔ tibble  3.1.6     ✔ dplyr   1.0.9
✔ tidyr   1.2.0     ✔ stringr 1.4.0
✔ readr   2.1.2     ✔ forcats 0.5.1
── Conflicts ────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

# A tibble: 3 × 2
    cyl Avg_mpg
  <dbl>   <dbl>
1     4    25.9
2     6    19.7
3     8    15.1

We successfully retrieved the data frame containing the average miles per gallon for each number of cylinders.

Note that when we load tidyverse, R tells us which packages it is attaching and which conflicts arise. We can see that the dplyr functions filter and lag mask the stats functions of the same names.
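
When a masked function is still needed, one common approach is to name the package explicitly with the :: operator. The sketch below assumes dplyr is installed; the variable names multi_carb and moving_avg are only illustrative.

```r
# dplyr::filter() keeps the rows of a data frame that satisfy a condition.
multi_carb <- dplyr::filter(mtcars, carb > 1)
nrow(multi_carb)

# stats::filter() applies a linear filter to a series,
# here a 3-point moving average of the numbers 1 to 5.
moving_avg <- stats::filter(1:5, rep(1 / 3, 3))
as.numeric(moving_avg)
```

Writing the package name in front of the function removes any ambiguity about which filter is called, regardless of the order in which packages were loaded.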

Summary

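The error could not find function "%>%" occurs when we use the pipe operator without first loading a package that provides it. We can solve the error by loading magrittr, or by loading dplyr or tidyverse when the code also uses data manipulation verbs such as filter, group_by, and summarize.

Have fun and happy researching!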


Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research.
