ggplot()
SDS 192: Introduction to Data Science
Lindsay Poirier
Statistical & Data Sciences, Smith College
Fall 2022
This dataset comes from Pioneer Valley Data and documents estimates of population characteristics for each municipality in the Pioneer Valley.
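ggplot2
ggplot2 is included in the Tidyverse, which you installed in SDS 100. Load ggplot2 in your environment before using it.

ggplot() function
ggplot() takes two arguments:
the data to plot (data =)
aes() (short for aesthetics), which maps variables to the axes (x = and y =)

With ggplot() alone, we told R what variables to plot, but we didn’t indicate how to plot them. To choose the kind of plot, we add a geom function to the ggplot() call. Examples:
geom_bar()
geom_point()
Each added function is joined to the call with a + sign.

ggplot(data = hampshire_census_data,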
       aes(x = COMMUNITY,
           y = CEN_EARLYED)) +
  geom_col() +
  coord_flip() + # Flipping the x and y coordinates here makes the labels more legible.
  theme_minimal() +
  # labs() names aesthetics, not screen positions: x still refers to COMMUNITY after coord_flip().
  labs(title = "Hampshire County Early Education Enrollment Rates, 2018",
       x = "Municipality in Hampshire County, MA",
       y = "Enrollment Rate for 3-4 yr old")
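
The point that ggplot() alone says what to plot but not how can be seen in a minimal sketch. It assumes the mpg example dataset that ships with ggplot2 (not the census data used in these slides), and the object names p_blank and p_points are made up for illustration.

```r
library(ggplot2)  # also attached by library(tidyverse)

# Assumption: mpg is a built-in ggplot2 example dataset, used here only
# so that the sketch runs without the course data.

# Step 1: data + aesthetic mapping only. This draws the axes but no marks,
# because no geom has been added yet.
p_blank <- ggplot(data = mpg, aes(x = displ, y = hwy))

# Step 2: add a geom layer with + to say how to draw the mapped variables.
p_points <- p_blank + geom_point()

# Printing a plot object renders it.
print(p_blank)
print(p_points)
```

The first plot shows an empty canvas with labeled axes; the second shows the same canvas with one point per row of the data.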

aes() function
We add visual cues to the plot in the aes() call.
# Map a visual cue (size) inside aes() and set a fixed attribute (transparency) with alpha
ggplot(data = pioneer_valley_census_data,
       aes(x = COUNTY, y = CEN_WORKERS, size = CEN_HOUSEHOLDS)) +
  geom_point(alpha = 0.2) +
  coord_flip() +
  labs(title = "Number of Workers Age 16+ in Pioneer Valley, MA Municipalities, 2018",
       x = "County",
       y = "Workers Age 16+",
       size = "Households")
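
The difference between a visual cue mapped inside aes() and an attribute set outside it can be sketched as follows. This is an illustration under assumptions: it uses the mpg example dataset that ships with ggplot2 rather than the census data, and the names p_mapped and p_fixed are hypothetical.

```r
library(ggplot2)

# Mapped cue: color sits inside aes(), so it varies with the class variable
# and ggplot2 adds a legend for it.
p_mapped <- ggplot(data = mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point()

# Fixed attribute: color and alpha sit outside aes(), so every point gets
# the same value and no legend is drawn for them.
p_fixed <- ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "steelblue", alpha = 0.2)

print(p_mapped)
print(p_fixed)
```

A rule of thumb consistent with the example above: if the appearance should depend on a column in the data, put it in aes(); if it should be the same for every mark, pass it directly to the geom.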

No. There are cheatsheets. The ggplot2 cheatsheet is linked here.