Data Science Applications and Analysis (UCSB, S20)

Link to Lecture Slides

Link to Lecture Recording

Lecture Notes:

lecture date | notes | ready? | description | reading
2020-03-31 | lect01 | true | Introductions and Course Overview |
2020-04-02 | lect02 | true | Data Life Cycle |
2020-04-07 | lect03 | true | The Data Science Lifecycle and Sampling |
2020-04-09 | lect04 | false | Pandas |
2020-04-14 | lect05 | true | Pandas and Question formulation |

num ready? description assigned due


In this course, we will explore the data science lifecycle: question formulation, data collection & cleaning, exploratory data analysis & visualization, statistical inference and prediction, and decision-making.

Instructors: Professors Kate Kharitonova (CS) and Alex Franks (PSTAT)

Prerequisites: PSTAT 120A, Math 4A, and knowledge of Python (at minimum, the equivalent of CS 8, INT 5, PSTAT 10).

Catalog description: Overview and use of data science tools in Python for data retrieval, analysis, visualization, reproducible research and automated report generation. Case studies will illustrate practical use of these tools. This new course will focus on concepts that are relevant for data science by using some of the popular software tools in this area. Doing data science is more than using isolated methods. Creatively using a collection of concepts and domain knowledge is emphasized to clean, transform, analyze, and present data. Concepts in data ethics and privacy will also be discussed. Case studies will illustrate real usage scenarios.

Programming experience: This course is designed for students who have a solid conceptual understanding of programming primitives (e.g., flow control, functions, 1D and 2D arrays, data types) and are comfortable in Python or at least one other programming or scripting language (C/C++, R, etc.).

Software tools: Many software tools are used for data science. Tools we will use for this course include (but are not limited to)

Learning by doing will require reading software documentation, experimenting by trial and error, and lots of debugging. We are looking for self-motivated students with diverse interests in data science.

Where does this data science course fit in with the existing courses?
