Rowwise operation in dplyr 1.0.0


A while ago I wrote a post on row-wise operations on data frames in R. In that post I recommended purrr::pmap(), since the other methods each have their own disadvantages. However, dplyr 1.0.0 (to be released soon) will bring much better support for row-wise operations through rowwise(), and it is intuitive, simple, and easy to use.

Basic

library(dplyr)
df <- tibble(x = 1:3, y = 2:4, z = 3:5)
df %>% rowwise() %>% mutate(m = mean(c(x, y, z)))
## # A tibble: 3 x 4
## # Rowwise: 
##       x     y     z     m
##   <int> <int> <int> <dbl>
## 1     1     2     3     2
## 2     2     3     4     3
## 3     3     4     5     4
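
To see what rowwise() changes, it helps to run the same mutate() with and without it. The following is a minimal sketch on the same toy data; the names `no_rowwise` and `with_rowwise` are only illustrative.

```r
library(dplyr)

df <- tibble(x = 1:3, y = 2:4, z = 3:5)

# Without rowwise(): mean() sees the full columns, so every row
# receives the grand mean of all nine values.
no_rowwise <- df %>% mutate(m = mean(c(x, y, z)))

# With rowwise(): mean() is evaluated once per row.
with_rowwise <- df %>% rowwise() %>% mutate(m = mean(c(x, y, z)))

no_rowwise$m    # 3 3 3
with_rowwise$m  # 2 3 4
```

Without rowwise(), the expression is evaluated once on whole columns, which is rarely what we want for a per-row summary.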

We can use tidy selection syntax to succinctly select any variables with c_across().

df %>% rowwise() %>% mutate(m = mean(c_across(everything())))
## # A tibble: 3 x 4
## # Rowwise: 
##       x     y     z     m
##   <int> <int> <int> <dbl>
## 1     1     2     3     2
## 2     2     3     4     3
## 3     3     4     5     4
# equivalent to (predicate functions should be wrapped in where())
df %>% rowwise() %>% mutate(m = mean(c_across(where(is.numeric))))
## # A tibble: 3 x 4
## # Rowwise: 
##       x     y     z     m
##   <int> <int> <int> <dbl>
## 1     1     2     3     2
## 2     2     3     4     3
## 3     3     4     5     4
df %>% rowwise() %>% mutate(m = mean(c_across(x:z)))
## # A tibble: 3 x 4
## # Rowwise: 
##       x     y     z     m
##   <int> <int> <int> <dbl>
## 1     1     2     3     2
## 2     2     3     4     3
## 3     3     4     5     4
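
c_across() is most useful when the data frame also contains columns that should not enter the calculation. The sketch below assumes a hypothetical data frame `df_mixed` with an extra character column `id`; the selection keeps only the numeric columns.

```r
library(dplyr)

# Hypothetical data: same numbers as above plus a character id column
df_mixed <- tibble(id = c("a", "b", "c"), x = 1:3, y = 2:4, z = 3:5)

res <- df_mixed %>%
  rowwise() %>%
  mutate(
    total   = sum(c_across(where(is.numeric))),  # x + y + z per row
    biggest = max(c_across(x:z))                 # largest of x, y, z per row
  ) %>%
  ungroup()  # drop the rowwise marker once we are done

res$total    # 6 9 12
res$biggest  # 3 4 5
```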

rowwise() behaves somewhat like group_by(): variables passed as rowwise(<var_to_preserve>) are kept as identifier variables, much like the grouping variables passed to group_by(), and are preserved when calling summarise().

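# .before = x places the new variable `v` before `x`
df2 <- mutate(df, v = letters[1:3], .before = x)
# c_across(x:z) collects the columns x to z; a bare x:z would build the integer sequence from x to z
df2 %>% rowwise(v) %>% mutate(m = mean(c_across(x:z)))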
## # A tibble: 3 x 5
## # Rowwise:  v
##   v         x     y     z     m
##   <chr> <int> <int> <int> <dbl>
## 1 a         1     2     3     2
## 2 b         2     3     4     3
## 3 c         3     4     5     4
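
The effect of the identifier variable is easier to see with summarise() than with mutate(), because mutate() keeps every column anyway. A minimal sketch, with `df2` rebuilt here so the block runs on its own:

```r
library(dplyr)

df2 <- tibble(v = letters[1:3], x = 1:3, y = 2:4, z = 3:5)

# No identifier variable: only the summary column survives
without_id <- df2 %>% rowwise() %>% summarise(m = mean(c(x, y, z)))

# `v` passed to rowwise(): it is carried along into the result
with_id <- df2 %>% rowwise(v) %>% summarise(m = mean(c(x, y, z)))

names(without_id)  # "m"
names(with_id)     # "v" "m"
```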

Row-wise summary functions in base R

For better efficiency, we can use the vectorised row-wise summary functions in base R, such as rowMeans(). across() is required to pass multiple columns to them inside mutate().

# use rowMeans 
df %>% mutate(m = rowMeans(across(everything())))
## # A tibble: 3 x 4
##       x     y     z     m
##   <int> <int> <int> <dbl>
## 1     1     2     3     2
## 2     2     3     4     3
## 3     3     4     5     4
# equivalent here, because every column of df is numeric
mutate(df, m = rowMeans(df))
## # A tibble: 3 x 4
##       x     y     z     m
##   <int> <int> <int> <dbl>
## 1     1     2     3     2
## 2     2     3     4     3
## 3     3     4     5     4
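
When the data frame also holds non-numeric columns, rowMeans(df) no longer works, and the column selection inside across() becomes necessary. The sketch below assumes a hypothetical `df_mixed` with a character `id` column. It uses across() without functions, as in the example above; later dplyr releases offer pick() for this purpose, as far as I know.

```r
library(dplyr)

# Hypothetical data with a non-numeric column
df_mixed <- tibble(id = c("a", "b", "c"), x = 1:3, y = 2:4, z = 3:5)

res <- df_mixed %>%
  mutate(
    m = rowMeans(across(where(is.numeric))),  # mean of x, y, z per row
    s = rowSums(across(x:z))                  # sum of x, y, z per row
  )

res$m  # 2 3 4
res$s  # 6 9 12
```

No rowwise() is needed here, since rowMeans() and rowSums() already work across the columns of each row.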

Advanced usage

Run a function many times with different arguments

# example from dplyr vignette, rowwise
df <- tribble(
  ~ n, ~ min, ~ max,
    1,     0,     1,
    2,    10,   100,
    3,   100,  1000,
)

# list() is required here, since the result for each row in `mutate()` must have length 1
df %>% 
  rowwise() %>% 
  mutate(data = list(runif(n, min, max)))
## # A tibble: 3 x 4
## # Rowwise: 
##       n   min   max data     
##   <dbl> <dbl> <dbl> <list>   
## 1     1     0     1 <dbl [1]>
## 2     2    10   100 <dbl [2]>
## 3     3   100  1000 <dbl [3]>
# we can also use `purrr::pmap()`; this works because the column names match the argument names of runif()
df %>% mutate(data = purrr::pmap(., .f = runif))
## # A tibble: 3 x 4
##       n   min   max data     
##   <dbl> <dbl> <dbl> <list>   
## 1     1     0     1 <dbl [1]>
## 2     2    10   100 <dbl [2]>
## 3     3   100  1000 <dbl [3]>
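
The pmap() shortcut relies on the column names n, min and max matching the arguments of runif(). As a sketch of what happens otherwise, assume hypothetical column names `size`, `lo` and `hi`: the rowwise() version only needs the new names, while pmap() needs an unnamed list so that the arguments are matched by position.

```r
library(dplyr)

# Hypothetical column names that do not match runif()'s arguments
df_named <- tribble(
  ~size, ~lo, ~hi,
      1,   0,    1,
      2,  10,  100,
      3, 100, 1000
)

set.seed(1)  # only to make the random draws reproducible

# rowwise(): refer to the columns directly, row by row
by_row <- df_named %>%
  rowwise() %>%
  mutate(data = list(runif(size, lo, hi))) %>%
  ungroup()

# pmap(): an unnamed list is passed to runif() by position
by_pmap <- df_named %>%
  mutate(data = purrr::pmap(list(size, lo, hi), runif))

lengths(by_row$data)   # 1 2 3
lengths(by_pmap$data)  # 1 2 3
```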

A more complicated problem: varying the function being called

# example from dplyr vignette, rowwise
df <- tribble(
   ~rng,     ~params,
   "runif",  list(n = 10), 
   "rnorm",  list(n = 20),
   "rpois",  list(n = 10, lambda = 5),
)

df %>% 
  rowwise() %>% 
  mutate(data = list(do.call(rng, params)))
## # A tibble: 3 x 3
## # Rowwise: 
##   rng   params           data      
##   <chr> <list>           <list>    
## 1 runif <named list [1]> <dbl [10]>
## 2 rnorm <named list [1]> <dbl [20]>
## 3 rpois <named list [2]> <int [10]>
# the same with purrr::map2(), which is more verbose
df %>% mutate(data = purrr::map2(rng, params, ~ do.call(.x, .y)))
## # A tibble: 3 x 3
##   rng   params           data      
##   <chr> <list>           <list>    
## 1 runif <named list [1]> <dbl [10]>
## 2 rnorm <named list [1]> <dbl [20]>
## 3 rpois <named list [2]> <int [10]>
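
Once the draws are stored in a list-column, the data frame can stay row-wise to summarise each element. The sketch below rebuilds the tribble so it runs on its own; the column name `n_drawn` is only illustrative. It uses a second mutate() call, in which `data` refers to the vector stored in the current row.

```r
library(dplyr)

df <- tribble(
   ~rng,     ~params,
   "runif",  list(n = 10),
   "rnorm",  list(n = 20),
   "rpois",  list(n = 10, lambda = 5)
)

set.seed(42)  # only to make the random draws reproducible

res <- df %>%
  rowwise() %>%
  mutate(data = list(do.call(rng, params))) %>%
  # still row-wise: `data` is the vector for this row, not the whole list
  mutate(n_drawn = length(data)) %>%
  ungroup()

res$n_drawn  # 10 20 10
```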

For more details, see the rowwise vignette in dplyr, vignette("rowwise").


Choyang

Bioinformatics, R enthusiast. Thoughts on research, personal experience and other distractions.