Functions

SDS 192: Introduction to Data Science

Lindsay Poirier
Statistical & Data Sciences, Smith College

Fall 2022

For Today

  • Quiz 2 Posted!
  • Don’t forget about the Study Plan Assessment
  • Writing Functions
  • Code Along

Why write functions?

# The same pipeline copied three times; only the state abbreviation changes
pit |>
  filter(State == "AK" & Year == 2015) |>
  summarize(Total = sum(Count))
pit |>
  filter(State == "AL" & Year == 2015) |>
  summarize(Total = sum(Count))
pit |>
  filter(State == "AR" & Year == 2015) |>
  summarize(Total = sum(Count))

Why write functions?

  • Reduces the amount of code to write
  • Lowers the chances of errors in code
  • Supports reproducibility

Functions

  • Statements organized to perform a specific task
  • Take arguments as inputs
    • Arguments can be required or optional
  • Return a value or set of values as outputs
    • If return() is not called, the function returns the value of the last expression it evaluates

User-defined Functions

function_name <- function(arg1, arg2) {
  
  # Some code goes here referencing arg1 and arg2; assign its result to x
  x <- NULL  # placeholder value
  
  return(x)
}
calculate_difference <- function(arg1, arg2) {
  
  x <- arg1 - arg2
  
  return(x)
}
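  • A minimal sketch of the implicit-return rule: the variant below drops return(), so the value of the last evaluated expression is returned. calculate_difference_implicit is a hypothetical name used only for this illustration.

```r
# Hypothetical variant of calculate_difference without an explicit return()
calculate_difference_implicit <- function(arg1, arg2) {
  # The last evaluated expression becomes the return value
  arg1 - arg2
}

calculate_difference_implicit(7, 3)
```

  • Both styles give the same result here; an explicit return() mainly makes the output easier to spot in longer functions.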

Calling Functions

  • Call a user-defined function the same way you’d call a function built into R
function_name(value_for_arg1, value_for_arg2)
calculate_difference(7, 3)
[1] 4

User-defined Functions

  • state_name is a required argument here because it has no default value.
calculate_state_total <- function(state_name) {
  
  x <- pit |>
    filter(State == state_name & Year == 2015) |>
    summarize(Total = sum(Count))
  
  return(x)
}
calculate_state_total("AL")
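  • A self-contained sketch for trying the function out. It assumes pit has the State, Year, and Count columns used above; the rows below are made up for illustration and are not the real data.

```r
library(dplyr)

# Made-up rows for illustration only (assumption: the real pit has these columns)
pit <- data.frame(
  State = c("AK", "AK", "AL", "AL", "AR"),
  Year  = c(2015, 2015, 2015, 2014, 2015),
  Count = c(10, 20, 5, 7, 3)
)

calculate_state_total <- function(state_name) {
  x <- pit |>
    filter(State == state_name & Year == 2015) |>
    summarize(Total = sum(Count))
  return(x)
}

# One call per state replaces the three copied pipelines
ak_total <- calculate_state_total("AK")
al_total <- calculate_state_total("AL")
ar_total <- calculate_state_total("AR")
```

  • Only rows matching both the state and the year 2015 are summed, so the 2014 row for AL is left out of al_total.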

Making Arguments Optional

  • Because we provide a default value for year, it is optional in the function call.
calculate_state_total <- function(state_name, year = 2020) {
  
  x <- pit |>
    filter(State == state_name & Year == year) |>
    summarize(Total = sum(Count))
  
  return(x)
}
calculate_state_total("AL")
calculate_state_total("AL", 2019)
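  • A sketch of how the default behaves, using made-up rows rather than the real pit data. It also shows that leaving out state_name, which has no default, raises an error; tryCatch() is used here only to capture that error without stopping the script.

```r
library(dplyr)

# Made-up rows for illustration only
pit <- data.frame(
  State = c("AL", "AL", "AL"),
  Year  = c(2019, 2020, 2020),
  Count = c(4, 6, 1)
)

calculate_state_total <- function(state_name, year = 2020) {
  x <- pit |>
    filter(State == state_name & Year == year) |>
    summarize(Total = sum(Count))
  return(x)
}

default_total  <- calculate_state_total("AL")        # year falls back to 2020
override_total <- calculate_state_total("AL", 2019)  # year set to 2019

# state_name has no default, so leaving it out is an error
missing_call <- tryCatch(
  calculate_state_total(),
  error = function(e) "error"
)
```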

Naming Arguments

  • Naming arguments makes it clear which value goes with which argument. Unnamed arguments are matched by position, so their order matters!
calculate_state_total <- function(state_name, year = 2020) {
  
  x <- pit |>
    filter(State == state_name & Year == year) |>
    summarize(Total = sum(Count))
  
  return(x)
}
calculate_state_total(state_name = "AL", year = 2019)
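  • A small sketch of why order matters, reusing calculate_difference from the earlier slide so that it runs without any data. The variable names positional, swapped, and named are made up for this illustration.

```r
calculate_difference <- function(arg1, arg2) {
  x <- arg1 - arg2
  return(x)
}

# Unnamed arguments are matched by position
positional <- calculate_difference(7, 3)   # arg1 = 7, arg2 = 3, gives 4
swapped    <- calculate_difference(3, 7)   # arg1 = 3, arg2 = 7, gives -4

# Named arguments can be given in any order
named <- calculate_difference(arg2 = 3, arg1 = 7)   # gives 4
```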