library(readr)
nyc_recent_noise <- read_csv("https://data.cityofnewyork.us/resource/erm2-nwe9.csv?complaint_type=Noise%20%2D%20Commercial&$limit=200")
head(nyc_recent_noise)

SDS 192: Introduction to Data Science
Lindsay Poirier
Statistical & Data Sciences, Smith College
Fall 2022
200: Success!
403: Forbidden
404: Not Found
500: Internal Server Error
502: Bad Gateway
Figure: REST API - Author: Seobility - License: CC BY-SA 4.0
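As a rough sketch of how these status codes can be read: the first digit gives the class of the response, where 2xx means the request succeeded, 4xx means the problem lies with the request, and 5xx means the problem lies with the server. The helper names `describe_status()` and `status_class()` are hypothetical and introduced only for illustration; the lookup covers only the codes listed above.

```r
# Hypothetical helper: translate a status code into the labels listed above.
describe_status <- function(code) {
  switch(as.character(code),
    "200" = "Success!",
    "403" = "Forbidden",
    "404" = "Not Found",
    "500" = "Internal Server Error",
    "502" = "Bad Gateway",
    "Not covered in this list"  # default for any other code
  )
}

# Hypothetical helper: classify a status code by its first digit.
status_class <- function(code) {
  if (code >= 200 && code < 300) {
    "success"
  } else if (code >= 400 && code < 500) {
    "client error"
  } else if (code >= 500 && code < 600) {
    "server error"
  } else {
    "other"
  }
}

describe_status(404)
status_class(502)
```

If the httr package is installed, `httr::status_code(httr::GET(url))` returns the numeric code for a request, which could then be passed to either helper.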
The base URL is the API endpoint:
https://data.cityofnewyork.us/resource/erm2-nwe9.csv
Query parameters follow a ? and are separated by &. $limit= limits the number of rows downloaded to a certain number.
https://data.cityofnewyork.us/resource/erm2-nwe9.csv?unique_key=10693408
https://data.cityofnewyork.us/resource/erm2-nwe9.json?complaint_type=Obstruction&$limit=100
https://dev.socrata.com/foundry/data.cityofnewyork.us/erm2-nwe9
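The URLs above all follow one pattern: the endpoint, then a ?, then name=value pairs joined by &. A minimal sketch of assembling such a URL in base R follows. The function name `build_query_url()` is hypothetical, and values are passed as character strings so that R does not print large numbers in scientific notation.

```r
# Hypothetical helper: append query parameters to an API endpoint.
build_query_url <- function(endpoint, params) {
  # Percent-encode each value so spaces and special characters are URL-safe
  values <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  # Join name=value pairs with "&" and attach them after "?"
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0(endpoint, "?", query)
}

endpoint <- "https://data.cityofnewyork.us/resource/erm2-nwe9.csv"

# A single record, selected by its unique key
one_record <- build_query_url(endpoint, list(unique_key = "10693408"))

# Parameter names starting with "$" need backticks inside list()
noise_200 <- build_query_url(
  endpoint,
  list(complaint_type = "Noise - Commercial", `$limit` = "200")
)
```

The resulting strings can be passed to `read_csv()` in the same way as the hand-written URL at the top of these slides. `URLencode()` leaves the hyphen as-is, which is equivalent to writing %2D.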
URLs cannot contain spaces, and many other special characters (including any non-ASCII character) must be escaped, so we replace those characters with percent codes that are always recognized:
(space): %20
!: %21
": %22
%: %25
': %27
-: %2D
There are many resources online for identifying these.
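Rather than looking these codes up by hand, base R can do the conversion in both directions. The sketch below uses `URLencode()` and `URLdecode()` from the utils package, which ships with R; the variable names are illustrative only.

```r
# Encode a value before placing it in a URL
# (reserved = TRUE also encodes characters such as ! and ')
encoded <- utils::URLencode("Noise - Commercial", reserved = TRUE)

# Decode the value used in the noise complaint URL at the top of these slides
decoded <- utils::URLdecode("Noise%20%2D%20Commercial")

# Look up the code for a single character
bang <- utils::URLencode("!", reserved = TRUE)

encoded
decoded
bang
```

Note that `URLencode()` does not encode the hyphen, because a hyphen is already allowed in URLs; writing it as %2D is optional and decodes to the same character.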
R
read_csv()

dplyr vs. SQL
select(): SELECT
filter(): WHERE
group_by(): GROUP BY
arrange(): ORDER BY
head(): LIMIT

SQL in APIs
SQL can be written in the URLs constructed for API calls

SoQL
SELECT unique_key, created_date, incident_address
WHERE descriptor = 'Pothole'
LIMIT 100
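One way to send the query above through a URL is sketched here. It assumes the endpoint accepts a full SoQL statement through the `$query` parameter, as described in the Socrata documentation; the individual clauses can also be passed as separate parameters such as `$select`, `$where`, and `$limit`. The variable names are illustrative.

```r
endpoint <- "https://data.cityofnewyork.us/resource/erm2-nwe9.csv"

# The SoQL statement, written as readable text
soql <- paste(
  "SELECT unique_key, created_date, incident_address",
  "WHERE descriptor = 'Pothole'",
  "LIMIT 100"
)

# Percent-encode the statement (spaces, commas, =, and ') and attach it to the endpoint
soql_url <- paste0(endpoint, "?$query=", utils::URLencode(soql, reserved = TRUE))

# Uncomment to download the rows (requires readr and an internet connection):
# potholes <- readr::read_csv(soql_url)
```

Writing the query as plain text first and encoding it in one step keeps the SQL readable and avoids typing codes such as %20 by hand.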
WHERE complaint_type = 'Traffic'
SELECT descriptor, count(*)
GROUP BY descriptor
ORDER BY count DESC
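For comparison, the last query (count rows per descriptor, largest count first) might look like this in dplyr. The data frame below is toy data invented for illustration and does not contain real 311 records; the column name `count` is chosen to mirror the SQL.

```r
library(dplyr)

# Toy data, invented for illustration only
complaints <- data.frame(
  descriptor = c("Pothole", "Pothole", "Loud Music/Party",
                 "Pothole", "Loud Music/Party", "Banging/Pounding")
)

descriptor_counts <- complaints %>%
  group_by(descriptor) %>%        # GROUP BY descriptor
  summarize(count = n()) %>%      # SELECT descriptor, count(*)
  arrange(desc(count))            # ORDER BY count DESC

descriptor_counts
```

The main difference is where the work happens: the SQL version asks the server to aggregate before anything is downloaded, while the dplyr version aggregates rows that are already in R.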