Comparing pipes: Base-R |> vs {magrittr} %>%

We compare the R native pipe and the {magrittr} pipe.

Author: Albert Rapp

Published: January 12, 2025

Beginners are sometimes confused by the fact that R has two pipe operators: the native |> and the {magrittr} %>%.

So in today’s video, I want to compare the two and show you the strengths and weaknesses of each one. Let’s dive in.

Keyboard shortcut

Whatever pipe you use, you should definitely use the RStudio shortcut Ctrl + Shift + M. This is much quicker than typing it out. By default, the shortcut inserts the {magrittr} pipe. But you can change that in the settings (Tools > Global Options > Code, then tick "Use native pipe operator, |>").

Simple function chaining

The big advantage of the base-R pipe is that it can easily chain together a couple of functions whether any packages are loaded or not.

runif(100) |> round() |> mean()
## [1] 0.46

The same doesn’t work with the {magrittr} pipe because I have to load the package first.

runif(100) %>% round() %>% mean()
## Error in runif(100) %>% round() %>% mean(): could not find function "%>%"

But if I do load something like the Tidyverse, whose core packages make %>% available, it works fine.

library(tidyverse) 
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
runif(100) %>% round() %>% mean()
## [1] 0.52
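If you only want the pipe and not the whole Tidyverse, attaching {magrittr} on its own should be enough. Here's a minimal sketch, assuming {magrittr} is installed (the guard simply skips the example if it isn't). The input vector and the name `rounded_mean` are made up for illustration.

```r
# Assumption: {magrittr} is installed; otherwise the block is skipped.
if (requireNamespace("magrittr", quietly = TRUE)) {
  library(magrittr)  # attaches only {magrittr}, which provides %>%

  # Fixed input instead of runif() so the result is reproducible
  rounded_mean <- c(0.2, 0.7, 0.9) %>% round() %>% mean()
  print(rounded_mean)
}
```

I used a fixed vector here instead of runif() so that the result doesn't change from run to run.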

Form strictness

The nice thing about the {magrittr} pipe is that it isn’t as strict as the base-R pipe. For example, {magrittr} lets you leave off the parentheses of a function call and use just the function name.

runif(100) %>% round # works
runif(100) |> round  # Error: a function call with () is required
## Error: The pipe operator requires a function call as RHS (<text>:2:15)
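One way to see where that strictness comes from: the native pipe isn't a function. The parser rewrites it into a regular nested call before anything runs, so it needs an actual call on the right-hand side. Here's a small sketch with base R's quote() that makes this visible. The symbol `x` and the names `piped` and `nested` are only for illustration, and nothing gets evaluated.

```r
# quote() captures the code without running it
piped  <- quote(x |> round() |> mean())
nested <- quote(mean(round(x)))

# The parser has already turned the pipe chain into a nested call
print(piped)

# Both expressions are the same call object
identical(piped, nested)
```

{magrittr}'s %>%, on the other hand, is an ordinary function that receives its right-hand side as an argument, which is why it can afford to be more lenient.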

Standard scenario

I don’t think the strictness is much of a disadvantage, though. In most cases (at least in 90% of my pipe use cases), you’ll likely use the pipe with something like mutate() where you specify additional arguments anyway. In that scenario, both pipes work pretty much the same.

dat_with_super_long_name <- tibble(x = 1:3, y = 10:12)
dat_with_super_long_name |> 
  mutate(z = x + y)
## # A tibble: 3 × 3
##       x     y     z
##   <int> <int> <int>
## 1     1    10    11
## 2     2    11    13
## 3     3    12    15
dat_with_super_long_name %>%
  mutate(z = x + y)
## # A tibble: 3 × 3
##       x     y     z
##   <int> <int> <int>
## 1     1    10    11
## 2     2    11    13
## 3     3    12    15

Using a placeholder

Fans of the original {magrittr} pipe will tell you that it’s really cool to use the . operator as a placeholder. Rightfully so, this is a neat feature.

dat_with_super_long_name %>% lm(y ~ x, data = .)
## 
## Call:
## lm(formula = y ~ x, data = .)
## 
## Coefficients:
## (Intercept)            x  
##           9            1

Initially, the base-R pipe could not pull off such a stunt. However, since R 4.2.0 it has a placeholder too: the underscore _, which has to be passed to a named argument.

dat_with_super_long_name |> lm(y ~ x, data = _)
## 
## Call:
## lm(formula = y ~ x, data = dat_with_super_long_name)
## 
## Coefficients:
## (Intercept)            x  
##           9            1
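To make the base-R placeholder a bit more concrete, here's a minimal sketch that uses only base R, assuming your R version is recent enough to support `_`. The data frame `dat` mirrors the tibble from above, and the name `coefs` is made up for illustration.

```r
# Same numbers as above, but as a plain data frame (no packages needed)
dat <- data.frame(x = 1:3, y = 10:12)

# `_` marks where the left-hand side goes; it must be a named argument
coefs <- dat |>
  lm(y ~ x, data = _) |>
  coef()

print(coefs)
```

Notice that the placeholder is passed as `data = _`. Passing it without an argument name is not allowed.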

Using multiple placeholders

At this point, fans of the . operator will shout “The dot operator is even cooler. It can be used multiple times!” And they are absolutely right about that. That’s pretty dope.

And for the unenlightened: By wrapping a subsequent function call into {}, you can use the . operator as many times as you’d like over there. In each instance, . will then represent the data that went into {}.

dat_with_super_long_name %>% {plot(.$x, .$y, cex = 3, lwd = 5)}

Sadly, the base pipe cannot do such a thing. Its strictness forbids {}.

# Error: { not allowed
dat_with_super_long_name |> {plot(_$x, _$y, cex = 3, lwd = 5)} 
## Error: function '{' not supported in RHS call of a pipe (<text>:2:29)

A workaround for that would be to

  • define an anonymous function with \(.),
  • wrap that into parentheses, and then
  • call that function.

dat_with_super_long_name |> 
  (\(.) plot(.$x, .$y, cex = 3, lwd = 5))()

Shoutout to Isabella Velásquez’s blog post that taught me about this little trick.
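If you want to try the anonymous-function trick without opening a plot window, here's a tiny sketch that returns a value instead. It uses only base R. The names `dat`, `d` and `row_sums` are made up for illustration.

```r
dat <- data.frame(x = 1:3, y = 10:12)

# \(d) defines an anonymous function, the outer parentheses wrap it,
# and the trailing () calls it with the left-hand side as `d`
row_sums <- dat |>
  (\(d) d$x + d$y)()

print(row_sums)
```

The argument doesn't have to be called `.` by the way. Any valid name works, and a descriptive one can make the chain easier to read.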

Conditional flows

Now, sometimes people like to use if-statements in their pipe-chains. By combining the {magrittr} pipe with curly brackets and the . operator, that could look like this.

duplicate_flag <- TRUE
duplicates <- tibble(x = 1:3, z = 21:23)
dat_with_super_long_name %>%
  {
    if (duplicate_flag) {
      . |> left_join(duplicates, by = 'x')
    } else {
      .
    }
  } %>%
  summarize(across(everything(), mean))
## # A tibble: 1 × 3
##       x     y     z
##   <dbl> <dbl> <dbl>
## 1     2    11    22
duplicate_flag <- FALSE
duplicates <- tibble(x = 1:3, z = 21:23)
dat_with_super_long_name %>%
  {
    if (duplicate_flag) {
      . |> left_join(duplicates, by = 'x')
    } else {
      .
    }
  } %>%
  summarize(across(everything(), mean))
## # A tibble: 1 × 2
##       x     y
##   <dbl> <dbl>
## 1     2    11

In the past, I have written code like this too. Nowadays, though, I try to break out such things into their own functions. Preferably, one with a descriptive function name.

That way,

  • the base-R pipe can handle this much better,
  • my original chain hopefully stays short, and
  • when I outsource the helper functions to a separate script, the function name hopefully still tells me what it does.

left_join_if_duplicate <- function(dat, duplicate_flag) {
  if (duplicate_flag) {
    dat |> left_join(duplicates, by = 'x') 
  } else {
    dat
  }
}
duplicate_flag <- TRUE
dat_with_super_long_name |> 
  left_join_if_duplicate(duplicate_flag) |> 
  summarize(across(everything(), mean))
## # A tibble: 1 × 3
##       x     y     z
##   <dbl> <dbl> <dbl>
## 1     2    11    22
left_join_if_duplicate <- function(dat, duplicate_flag) {
  if (duplicate_flag) {
    dat |> left_join(duplicates, by = 'x') 
  } else {
    dat
  }
}
duplicate_flag <- FALSE
dat_with_super_long_name |> 
  left_join_if_duplicate(duplicate_flag) |> 
  summarize(across(everything(), mean))
## # A tibble: 1 × 2
##       x     y
##   <dbl> <dbl>
## 1     2    11
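One possible refinement, in case the helper really moves to a separate script: pass the lookup table in as an argument so the function doesn't rely on a global `duplicates` object. Here's a rough sketch of that idea. To keep it runnable without any packages, I've swapped left_join() for base R's merge() with `all.x = TRUE` and summarize() for colMeans(). The names `join_if_flagged`, `lookup` and `flag` are hypothetical.

```r
# Hypothetical helper: the lookup table is an explicit argument
join_if_flagged <- function(dat, lookup, flag) {
  if (flag) {
    # Base-R stand-in for left_join(dat, lookup, by = 'x')
    merge(dat, lookup, by = "x", all.x = TRUE)
  } else {
    dat
  }
}

dat    <- data.frame(x = 1:3, y = 10:12)
lookup <- data.frame(x = 1:3, z = 21:23)

with_join <- dat |>
  join_if_flagged(lookup, flag = TRUE) |>
  colMeans()

without_join <- dat |>
  join_if_flagged(lookup, flag = FALSE) |>
  colMeans()

print(with_join)
print(without_join)
```

That way, everything the helper needs is visible in the call, which works nicely with the base-R pipe.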
