
lect01, Tue 03/31

Introductions and Course Overview

Lecture 1

Starts about 10 min into the video

Course Info

Course websites

Textbook / references


No exams.




In-class group work

You will be split into groups of about 3 to introduce yourselves and discuss the question that was posed.

Timestamp in the video: 32:36

Programming Languages for Data Science

Who is this class for?

Potential List of Topics


Taking a computational / conceptual approach instead of a rigorous mathematical treatment

What is a data scientist?

T-shaped vs. Π-shaped

What is data science?

Domain expertise helps to know which tools to use and when.

Pokemon-card-style collection of data science explanation visualizations

Learning from Data


$H_0$ (null hypothesis): “All swans are white.”


$H_1$ (alternative hypothesis): “Not all swans are white.”

$H_0$ (null hypothesis): “The ivory-billed woodpecker is extinct.”

Can I be sure? Maybe I have been missing it.
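
One way to put a number on this doubt, as a minimal sketch: suppose the bird is still present and each survey independently has some fixed chance of detecting it. The independence assumption, the function name `prob_no_sightings`, and the detection probabilities below are all hypothetical, chosen only for illustration.

```python
def prob_no_sightings(detect_prob, n_surveys):
    """Chance of zero sightings in n independent surveys if the species is present."""
    return (1.0 - detect_prob) ** n_surveys


# Hypothetical per-survey detection probabilities, not field estimates.
for detect_prob in (0.01, 0.05, 0.20):
    for n_surveys in (10, 100):
        miss = prob_no_sightings(detect_prob, n_surveys)
        print(f"p={detect_prob:.2f}, surveys={n_surveys}: P(no sightings) = {miss:.3f}")
```

When this probability is large, a run of empty surveys says little about whether the bird is extinct; when it is small, the same empty surveys are much stronger evidence.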

Induction and Evidence

Inference: “Black swans are rare.” Fraction of black swans in the world?
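
The question about the fraction of black swans can be sketched as a simple estimation problem: observe a random sample of swans, then report the observed share together with a rough measure of its uncertainty. This is only an illustration. The "true" fraction and the sample size below are made-up values used to simulate data, and the standard error uses the usual normal approximation for a proportion.

```python
import math
import random


def estimate_fraction(sample):
    """Return (point estimate, approximate standard error) for the share of True values."""
    n = len(sample)
    p_hat = sum(sample) / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se


random.seed(0)
TRUE_FRACTION = 0.02  # hypothetical value, used only to simulate observations
sample = [random.random() < TRUE_FRACTION for _ in range(5000)]  # True = black swan

p_hat, se = estimate_fraction(sample)
print(f"estimated fraction of black swans: {p_hat:.4f} (standard error about {se:.4f})")
```

In practice the true fraction is unknown; the sample is the only thing we get to see, and the estimate is only as good as the sampling was random.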

The role of models

Statistical Inference

Given facts about the world, what might I see?

Learn the facts about the world

“Inverse probability”
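
The two directions above can be sketched with a binomial model, assuming each of `n` observed swans is independently black with probability `p`. The forward direction fixes `p` and asks what counts we might see; the inverse direction fixes the observed count and asks which values of `p` are plausible. The model, the flat prior over a grid of candidate values, and all numbers here are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def grid_posterior(k, n, grid):
    """Posterior weights over candidate p values, assuming a flat prior on the grid."""
    likelihoods = [binomial_pmf(k, n, p) for p in grid]
    total = sum(likelihoods)
    return [w / total for w in likelihoods]


n = 20

# Forward: given a fact about the world (p = 0.1), what might I see?
forward = [binomial_pmf(k, n, 0.1) for k in range(n + 1)]

# Inverse: given what I saw (2 black swans out of 20), what might p be?
grid = [i / 100 for i in range(101)]
posterior = grid_posterior(2, n, grid)
best_p = grid[max(range(len(grid)), key=posterior.__getitem__)]

print(f"P(see 2 black swans | p = 0.1) = {forward[2]:.3f}")
print(f"most plausible p on the grid after seeing 2 of {n}: {best_p:.2f}")
```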

DS 100 philosophy

4 steps:

  1. Raw data is not information (pandas)
  2. Information is not knowledge (EDA)
    • visualization (Altair)
  3. Knowledge is not understanding (domain expertise)
  4. Understanding is not wisdom (Ethical data science, consequences, data privacy)

Wisdom and Data Science

Questions to ask yourself

Mark Twain’s quote about statistics.


UC Berkeley Gender Bias case (1973)

What possible truths are consistent with this information? (Timestamp in the video: about 1:10:11)

“Simpson’s paradox”
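
A small numerical sketch of how the paradox can arise. The counts below are invented for illustration and are not the 1973 Berkeley admissions figures; the department names and the helper `pooled_rate` are likewise hypothetical.

```python
# Hypothetical counts per department: group -> (applicants, admitted)
applications = {
    "Dept A": {"men": (80, 48), "women": (20, 14)},
    "Dept B": {"men": (20, 4), "women": (80, 20)},
}


def rate(applied, admitted):
    """Admission rate as a fraction."""
    return admitted / applied


def pooled_rate(group):
    """Admission rate for one group after pooling all departments."""
    applied = sum(dept[group][0] for dept in applications.values())
    admitted = sum(dept[group][1] for dept in applications.values())
    return rate(applied, admitted)


for name, dept in applications.items():
    men, women = rate(*dept["men"]), rate(*dept["women"])
    print(f"{name}: men {men:.0%}, women {women:.0%}")

print(f"Pooled: men {pooled_rate('men'):.0%}, women {pooled_rate('women'):.0%}")
```

With these made-up counts, women have the higher admission rate within each department (70% vs 60%, and 25% vs 20%) but the lower rate once the departments are pooled (34% vs 52%), because most women applied to the department that admits fewer applicants overall.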