Pipes and functions in R

MACS 30500
University of Chicago

October 10, 2016

Piping

Using the diamonds dataset, calculate the average price for each cut of “I” colored diamonds.

Intermediate steps

diamonds_1 <- filter(diamonds, color == "I")
diamonds_2 <- group_by(diamonds_1, cut)
diamonds_3 <- summarize(diamonds_2, price = mean(price))

Overwrite the original

diamonds_t <- diamonds
diamonds_t <- filter(diamonds_t, color == "I")
diamonds_t <- group_by(diamonds_t, cut)
diamonds_t <- summarize(diamonds_t, price = mean(price))

Function composition

summarize(
  group_by(
    filter(diamonds, color == "I"),
    cut
  ),
  price = mean(price)
)

Piping

diamonds %>%
  filter(color == "I") %>%
  group_by(cut) %>%
  summarize(price = mean(price))
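
The approaches above should all produce the same summary table. A minimal sketch to check this for two of them, assuming the dplyr and ggplot2 packages are installed (ggplot2 supplies the diamonds data); the object names nested and piped are illustrative only:

```r
library(dplyr)    # filter(), group_by(), summarize(), and %>%
library(ggplot2)  # provides the diamonds dataset

# function composition: read from the inside out
nested <- summarize(
  group_by(filter(diamonds, color == "I"), cut),
  price = mean(price)
)

# piping: read from top to bottom
piped <- diamonds %>%
  filter(color == "I") %>%
  group_by(cut) %>%
  summarize(price = mean(price))

# compare the two results as plain data frames
all.equal(as.data.frame(nested), as.data.frame(piped))
```

The pipe passes the object on its left as the first argument of the function on its right, so x %>% f(y) is another way of writing f(x, y).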

Rescale function

rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}
  • Name: rescale01
  • Arguments: x
  • Body: the code between { and }

Testing rescale01()

rescale01(c(0, 5, 10))
## [1] 0.0 0.5 1.0
rescale01(c(-10, 0, 10))
## [1] 0.0 0.5 1.0
rescale01(c(1, 2, 3, NA, 5))
## [1] 0.00 0.25 0.50   NA 1.00
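
Beyond the tests above, it can help to try an input that stresses the formula. A small sketch, repeating the definition so it runs on its own; the constant-vector case is an extra check added here for illustration:

```r
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# a constant vector has a range of zero, so every element becomes 0 / 0
rescale01(c(5, 5, 5))
## [1] NaN NaN NaN

# for non-constant input, the smallest value maps to 0 and the largest to 1
out <- rescale01(c(2, 8, 4))
range(out)
## [1] 0 1
```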

Analyze this

pythagorean <- function(a, b){
  hypotenuse <- sqrt(a^2 + b^2)
  return(hypotenuse)
}

How to use a function

# print the output of the function
pythagorean(a = 3, b = 4)
## [1] 5
# save the output as a new object
(tri_c <- pythagorean(a = 3, b = 4))
## [1] 5
# what happens to the hypotenuse from inside the function?
pythagorean(a = 3, b = 4)
## [1] 5
hypotenuse
## Error in eval(expr, envir, enclos): object 'hypotenuse' not found
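
The error arises because hypotenuse exists only inside the function while it runs. A short sketch of the same idea from the other direction, assuming we first create a separate object named hypotenuse outside the function:

```r
pythagorean <- function(a, b) {
  hypotenuse <- sqrt(a^2 + b^2)  # local to this function call
  return(hypotenuse)
}

hypotenuse <- 100                # a separate object outside the function
tri_c <- pythagorean(a = 3, b = 4)

tri_c
## [1] 5
hypotenuse                       # unchanged by the function call
## [1] 100
```

Only the returned value leaves the function, which is why it has to be saved to an object such as tri_c to be reused.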

Conditional execution

if (condition) {
  # code executed when condition is TRUE
} else {
  # code executed when condition is FALSE
}

Conditional execution

if (this) {
  # do that
} else if (that) {
  # do something else
} else {
  # otherwise, do a default action
}
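
A concrete instance of the template above, as a minimal sketch. The function name describe_sign() is made up for illustration, and it assumes x is a single non-missing number, since if() expects a condition of length one:

```r
# describe_sign() is a hypothetical example function
describe_sign <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}

describe_sign(3)
## [1] "positive"
describe_sign(-2)
## [1] "negative"
describe_sign(0)
## [1] "zero"
```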

Conditional execution and cut()

diamonds %>%
  select(carat) %>%
  mutate(carat_autobin = cut(carat, breaks = 5),
         carat_manbin = cut(carat,
                            breaks = c(0, 1, 2, 3, 6),
                            labels = c("Small", "Medium", "Large", "Huge")))
## # A tibble: 53,940 × 3
##    carat carat_autobin carat_manbin
##    <dbl>        <fctr>       <fctr>
## 1   0.23  (0.195,1.16]        Small
## 2   0.21  (0.195,1.16]        Small
## 3   0.23  (0.195,1.16]        Small
## 4   0.29  (0.195,1.16]        Small
## 5   0.31  (0.195,1.16]        Small
## 6   0.24  (0.195,1.16]        Small
## 7   0.24  (0.195,1.16]        Small
## 8   0.26  (0.195,1.16]        Small
## 9   0.22  (0.195,1.16]        Small
## 10  0.23  (0.195,1.16]        Small
## # ... with 53,930 more rows
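
For comparison, the manual bins can also be written as a chain of if/else statements. The sketch below uses a hypothetical helper, bin_carat(), and a small made-up vector of carat values; it assumes every value is greater than 0 and at most 6, the range covered by the breaks above:

```r
carat <- c(0.23, 1.5, 2.5, 4)  # illustrative values, not taken from diamonds

# if/else handles one value at a time
bin_carat <- function(x) {
  if (x <= 1) {
    "Small"
  } else if (x <= 2) {
    "Medium"
  } else if (x <= 3) {
    "Large"
  } else {
    "Huge"
  }
}
by_if <- sapply(carat, bin_carat)

# cut() is vectorized: it bins every element in one call
by_cut <- as.character(
  cut(carat,
      breaks = c(0, 1, 2, 3, 6),
      labels = c("Small", "Medium", "Large", "Huge"))
)

identical(by_if, by_cut)
## [1] TRUE
```

cut() closes each interval on the right by default, which is why the if/else version uses <= at each boundary.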

Calculate the sum of squares of two variables

(x <- c(2, 4, 6))
## [1] 2 4 6
(y <- c(1, 3, 5))
## [1] 1 3 5
sum_of_squares(x, y)
## [1]  5 25 61
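
One possible definition that reproduces the output above, offered as a sketch and not the only solution. It relies on R arithmetic being vectorized, so x and y are squared and added element by element:

```r
sum_of_squares <- function(x, y) {
  # square each vector elementwise, then add the matching elements
  x^2 + y^2
}

x <- c(2, 4, 6)
y <- c(1, 3, 5)
sum_of_squares(x, y)
## [1]  5 25 61
```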

Fizzbuzz

fizzbuzz() takes a single number as input. If the number is divisible by both three and five, it returns “fizzbuzz”. If it is divisible only by three, it returns “fizz”; if only by five, it returns “buzz”. Otherwise, it returns the number itself.

Modular division

5 / 3
## [1] 1.666667
5 %% 3
## [1] 2
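
Because %% returns the remainder, a number is divisible by another exactly when the remainder is zero. A brief sketch of that test:

```r
# remainder of 0 means evenly divisible
15 %% 3 == 0
## [1] TRUE
15 %% 5 == 0
## [1] TRUE
4 %% 3 == 0
## [1] FALSE
```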

Output

fizzbuzz(3)
## [1] "fizz"
fizzbuzz(5)
## [1] "buzz"
fizzbuzz(15)
## [1] "fizzbuzz"
fizzbuzz(4)
## [1] 4
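
One way to write fizzbuzz() that matches the output above, as a sketch that assumes the input is a single number. The order of the checks is a design choice: the combined case comes first, since otherwise 15 would stop at the "fizz" branch.

```r
fizzbuzz <- function(x) {
  # check divisibility by both three and five before the single cases
  if (x %% 3 == 0 && x %% 5 == 0) {
    "fizzbuzz"
  } else if (x %% 3 == 0) {
    "fizz"
  } else if (x %% 5 == 0) {
    "buzz"
  } else {
    x  # not divisible by three or five: return the number unchanged
  }
}

fizzbuzz(15)
## [1] "fizzbuzz"
fizzbuzz(4)
## [1] 4
```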