Day One: Introductions

SDS 192: Introduction to Data Science

Lindsay Poirier
Statistical & Data Sciences, Smith College

Spring 2022

What is data science?: Common view

  • interdisciplinary field combining computer science, mathematics/statistics, and domain expertise to extract meaningful information from unstructured data points

Hckum, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

What is data science?: My view

  • interdisciplinary field combining computer science, mathematics/statistics, and domain expertise to extract meaningful information from unstructured data points
  • also involves art, design, hermeneutics, communication, and ability to grapple with ethical dilemmas

Case Study 1: ACLU Fights Discriminatory Housing

  • American Civil Liberties Union employs data scientists to produce insights regarding discriminatory laws and practices
  • Findings are presented in courts, legislatures, and public reports
  • In this study, they use public data to show that excluding people with criminal records from housing can be viewed as a violation of the US Fair Housing Act.

Case Study 2: EPA Tracks Environmental Injustice

  • Environmental Protection Agency hires data scientists to produce insights regarding environmental health risks
  • Findings inform environmental policies, funding allocations, and legal actions against states and industries
  • This tool visualizes environmental and demographic indicators to highlight communities experiencing environmental injustices.

Case Study 3: Geena Davis Institute Studies Gender Biases in Films

  • Geena Davis Institute collaborated with University of Southern California’s Signal Analysis and Interpretation Laboratory (SAIL)
  • Developed a machine learning tool to measure representation of diverse groups in films by analyzing screen time and speaking time

Topics covered in this course

  • data visualization
  • data wrangling
  • programming with data
  • mapping
  • data retrieval
  • data science infrastructures and workflows
  • data science ethics

Who is the professor? Why is an anthropologist teaching data science?

  • Please call me Lindsay (preferred), Professor Poirier, or Dr. Poirier
  • Assistant Professor of SDS and cultural anthropologist
  • Previously Assistant Professor of Science and Technology Studies at UC Davis
  • Lab Manager at BetaNYC
  • M.S./Ph.D. in Science and Technology Studies from Rensselaer Polytechnic Institute
  • B.S. in Information Technology and Web Science from Rensselaer Polytechnic Institute
  • Hobbies: dancing, crafting, cooking, and re-watching the same TV series over and over again
  • I have a very spunky dog named Madison.

Exercise

Demonstration: Among the non-STEM majors in the class, find the major whose students drink the most cups of coffee per day on average. Repeat for STEM majors.
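
The demonstration boils down to three steps: filter by STEM status, group by major, and compare group averages. A minimal sketch of that logic follows, written in Python with only the standard library rather than the R used in class. The rows in `responses` and the helper `top_coffee_major` are hypothetical and made up for illustration; they are not the class data or the method used in the live demonstration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey responses: (major, is_stem, cups_of_coffee_per_day).
# These rows are invented for illustration only.
responses = [
    ("Anthropology", False, 2),
    ("Anthropology", False, 3),
    ("English", False, 1),
    ("English", False, 2),
    ("Biology", True, 1),
    ("Biology", True, 2),
    ("Computer Science", True, 4),
    ("Computer Science", True, 3),
]


def top_coffee_major(rows, stem):
    """Return (major, mean cups) for the major with the highest
    average coffee intake among STEM (stem=True) or non-STEM rows."""
    # Step 1 and 2: keep only the requested group, then group cups by major.
    cups_by_major = defaultdict(list)
    for major, is_stem, cups in rows:
        if is_stem == stem:
            cups_by_major[major].append(cups)
    # Step 3: average each major and pick the largest average.
    means = {major: mean(cups) for major, cups in cups_by_major.items()}
    best = max(means, key=means.get)
    return best, means[best]


print(top_coffee_major(responses, stem=False))
print(top_coffee_major(responses, stem=True))
```

With real survey data, ties between majors and majors with only one respondent would deserve a closer look before drawing any conclusion from the averages.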

Coding can be intimidating!

  • Coding is like learning a new language. When you are first learning it, it all feels completely unfamiliar. I will work to support you in building the vocabulary and syntax to code in R.
  • Coding can be frustrating. I regularly lose hours of my day trying to find bugs in my code. I will work to give you resources and skills to navigate coding frustrations.
  • Coding social environments have historically been exclusionary. I will work to reduce barriers to coding in whatever ways I can.

Prepping for this Class

  • Navigating Course Website
  • Standards Grading
  • Perusall
  • Slack

For Friday

  • Install Slack Desktop and set notifications
  • Complete Syllabus Quiz
  • Fill out the first-day-of-class questionnaire
  • Let me know ASAP if you will be using a Chromebook
  • Bring charged tech to class on Friday