ggplot()
SDS 192: Introduction to Data Science
Lindsay Poirier
Statistical & Data Sciences, Smith College
Fall 2022
This dataset comes from Pioneer Valley Data and documents estimates of population characteristics for each municipality in the Pioneer Valley.

ggplot2
- ggplot2 is included in the Tidyverse, which you installed in SDS 100.
- Load ggplot2 in your environment.

The ggplot() function
ggplot() takes two arguments:
- the data to plot (data =)
- the variable mappings, wrapped in aes() (short for aesthetics), such as the axes (x = and y =)

So far we have told R what variables to plot, but we didn't indicate how to plot them. To do that, we add a geom function to the ggplot call. Examples:
- geom_bar()
- geom_point()

The + sign
Each function is added to the ggplot call with the + sign:
ggplot(data = hampshire_census_data,
       aes(x = COMMUNITY,
           y = CEN_EARLYED)) +
  geom_col() +
  coord_flip() + # Flipping the x and y coordinates here makes the labels more legible.
  theme_minimal() +
  # labs() follows the aesthetics, not the flipped display, so x labels COMMUNITY.
  labs(title = "Hampshire County Early Education Enrollment Rates, 2018",
       x = "Municipality in Hampshire County, MA",
       y = "Enrollment Rate for 3-4 yr old")

The aes() function
We add visual cues to the plot in the aes() call.
# Add visual cue for size and attribute for transparency
ggplot(data = pioneer_valley_census_data,
       aes(x = COUNTY, y = CEN_WORKERS, size = CEN_HOUSEHOLDS)) +
  geom_point(alpha = 0.2) +
  coord_flip() +
  labs(title = "Number of Workers Age 16+ in Pioneer Valley, MA Municipalities, 2018",
       x = "County",
       y = "Workers Age 16+",
       size = "Households")

No, you don't need to memorize all of this. There are cheatsheets. The ggplot2 cheatsheet is linked here.
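
To see how the pieces above fit together (the two arguments to ggplot(), the + sign, and the difference between a visual cue inside aes() and an attribute outside it), the plot can be built one step at a time. This is a minimal sketch, assuming ggplot2 is installed. toy_census_data and its values are hypothetical placeholders, not figures from the Pioneer Valley dataset; only the column names are borrowed from the example above.

```r
library(ggplot2)

# Hypothetical placeholder values, NOT real census estimates.
# Only the column names mirror the example above.
toy_census_data <- data.frame(
  COUNTY = c("Hampshire", "Hampshire", "Franklin", "Hampden"),
  CEN_WORKERS = c(100, 250, 80, 400),
  CEN_HOUSEHOLDS = c(60, 150, 50, 260)
)

# Step 1: data and aesthetic mappings only.
# No geom has been added, so the plot object has no layers yet.
base_plot <- ggplot(data = toy_census_data,
                    aes(x = COUNTY, y = CEN_WORKERS, size = CEN_HOUSEHOLDS))
length(base_plot$layers)

# Step 2: the + sign adds a geom layer.
# alpha is set outside aes(), so it is a fixed attribute applied to every
# point, not a visual cue driven by a variable.
point_plot <- base_plot + geom_point(alpha = 0.2)
length(point_plot$layers)

# The mapping records which aesthetics are driven by variables.
names(point_plot$mapping)

# The fixed attribute is stored with the layer, separately from the mapping.
point_plot$layers[[1]]$aes_params$alpha
```

Printing base_plot draws empty axes, while printing point_plot draws the points. Size varies with CEN_HOUSEHOLDS because it is mapped in aes(), and every point shares the same transparency because alpha is set directly in geom_point().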