lect01, Tue 03/31

Introductions and Course Overview

Lecture 1

Starts about 10 min into the video

Course Info

Course websites

Textbook / references

Grades

No exams.

Homework

Labs

Participation

In-class group work

You will be split into groups of about three to introduce yourselves and discuss the question that was posed.

Timestamp in the video: 32:36

Programming Languages for Data Science

Who is this class for?

Potential List of Topics

Taking a computational / conceptual approach instead of a rigorous mathematical treatment

What is a data scientist?

T vs -shaped

What is data science?

Domain expertise helps to know which tools to use and when.

Pokémon-card-style collection of visualizations that explain data science

Learning from Data

Falsification

$H_0$ (null hypothesis): “All swans are white.”

vs.

$H_1$ (alternative hypothesis): “Not all swans are white.”

$H_0$ (null hypothesis): “The ivory-billed woodpecker is extinct.”

Can I be sure? Maybe I have been missing it.
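
The "maybe I have been missing it" worry can be made concrete with a minimal sketch. It assumes each survey independently detects a surviving bird with the same fixed probability. The function name, the 5% detection probability, and the survey counts are all hypothetical, chosen only for illustration.

```python
def prob_all_surveys_miss(detect_prob: float, n_surveys: int) -> float:
    """Chance that n independent surveys all miss a species that is still present.

    Assumption (not from the lecture): every survey detects the species
    with the same probability, independently of the others.
    """
    if not 0.0 <= detect_prob <= 1.0:
        raise ValueError("detect_prob must be between 0 and 1")
    if n_surveys < 0:
        raise ValueError("n_surveys must be non-negative")
    return (1.0 - detect_prob) ** n_surveys


# Hypothetical numbers: a 5% chance of spotting the bird on any one survey.
for n in (10, 50, 100):
    print(n, round(prob_all_surveys_miss(0.05, n), 4))
```

Under these assumptions the probability shrinks as surveys accumulate but never reaches exactly zero for a finite number of surveys. Empty surveys can make "extinct" more plausible without proving it, while a single confirmed sighting would falsify it.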

Induction and Evidence

Inference: “Black swans are rare.” Fraction of black swans in the world?
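
As a rough illustration of estimating the fraction of black swans from a sample, here is a minimal sketch using a normal-approximation (Wald) interval. The sample counts are hypothetical, the function name is an assumption, and this interval is known to behave poorly for very rare events, so treat it as a first pass rather than a recommended method.

```python
import math


def estimate_fraction(successes: int, n: int, z: float = 1.96):
    """Point estimate and normal-approximation (Wald) interval for a proportion.

    Returns (p_hat, low, high), with the interval clipped to [0, 1].
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical sample: 12 black swans among 1,000 swans observed.
p_hat, low, high = estimate_fraction(12, 1000)
print(f"estimate={p_hat:.3f}, approx. 95% interval=({low:.3f}, {high:.3f})")
```

The interval is only meaningful if the observed swans resemble a random sample of the world's swans, which is a modeling assumption rather than something the data can confirm.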

The role of models

Statistical Inference

Given facts about the world, what might I see?

Learn the facts about the world

“Inverse probability”
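
The two directions can be contrasted in a short sketch. The forward direction fixes a fact about the world (a probability `p`) and asks what counts might be observed. The inverse direction fixes the observed count and asks which values of `p` are plausible. The binomial model, the flat prior, the grid resolution, and the numbers used are all assumptions made for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of seeing exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


# Forward: given p = 0.1, how likely is each possible count out of 20?
forward = [binomial_pmf(k, 20, 0.1) for k in range(21)]


def grid_posterior(k: int, n: int, grid_size: int = 101):
    """Inverse: posterior over p on an evenly spaced grid, assuming a flat prior."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    weights = [binomial_pmf(k, n, p) for p in grid]
    total = sum(weights)
    return grid, [w / total for w in weights]


# Inverse: having seen 2 successes in 20 trials, which p is most plausible?
grid, post = grid_posterior(k=2, n=20)
best = grid[post.index(max(post))]
print("most plausible p on the grid:", best)
```

With a flat prior, the most plausible grid value coincides with the observed fraction; a different prior would shift it.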

DS 100 philosophy

4 steps:

  1. Raw data is not information (pandas)
  2. Information is not knowledge (EDA)
    • visualization (Altair)
  3. Knowledge is not understanding (domain expertise)
  4. Understanding is not wisdom (Ethical data science, consequences, data privacy)

Wisdom and Data Science

Questions to ask yourself

Mark Twain’s quote about statistics.

Discussion

UC Berkeley Gender Bias case (1973)

What possible truths are consistent with this information? (Timestamp in the video: 1:10:11?)

“Simpson’s paradox”
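
A minimal pandas sketch of how the paradox can arise. The counts below are hypothetical and are NOT the actual 1973 Berkeley figures; they are constructed so that women have the higher admission rate within each department, yet the lower rate overall, because most women apply to the more selective department.

```python
import pandas as pd

# Hypothetical counts for illustration only (not the real Berkeley data).
df = pd.DataFrame(
    {
        "dept": ["A", "A", "B", "B"],
        "gender": ["men", "women", "men", "women"],
        "applicants": [800, 100, 200, 900],
        "admitted": [480, 70, 20, 135],
    }
)

# Admission rate within each department, one column per gender.
by_dept = df.assign(rate=df["admitted"] / df["applicants"]).pivot(
    index="dept", columns="gender", values="rate"
)

# Admission rate after pooling both departments.
totals = df.groupby("gender")[["applicants", "admitted"]].sum()
overall = totals["admitted"] / totals["applicants"]

print(by_dept)
print(overall)
```

The pooled comparison and the per-department comparison point in opposite directions, so the aggregate numbers alone cannot settle which "truth" generated them.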