This course introduces Bayesian reasoning for data science within an inferential and predictive framework, with an eventual emphasis on regression analysis. We will learn how to formulate and implement inference using the prior-to-posterior paradigm.
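As a rough illustration of the prior-to-posterior paradigm, consider a minimal sketch with a conjugate Beta-Binomial model. The symbols $\theta$, $a$, $b$, $n$, and $y$, as well as the numbers used, are assumptions chosen for this example and are not taken from the course notes.

$$
\begin{aligned}
\theta &\sim \text{Beta}(a, b) && \text{(prior)} \\
y \mid \theta &\sim \text{Binomial}(n, \theta) && \text{(likelihood)} \\
\theta \mid y &\sim \text{Beta}(a + y,\; b + n - y) && \text{(posterior)}
\end{aligned}
$$

For instance, with a flat prior ($a = b = 1$) and $y = 7$ successes in $n = 10$ trials, the posterior would be $\text{Beta}(8, 4)$, whose mean is $8 / 12 \approx 0.667$.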
By the end of the course, students are expected to:
- Use Bayesian reasoning when modelling data.
- Apply Bayesian statistics to regression models.
- Compare and contrast Bayesian and frequentist methods, and evaluate their relative strengths.
- Use appropriate statistical libraries and packages for performing Bayesian inference.
This course occurs during Block 5 in the 2024/25 school year. Course notes can be accessed here. Typically, you should review these notes before each lecture. There are also optional textbook readings we suggest reviewing before each lecture (if possible).
See the lecture learning objectives for a lecture-by-lecture breakdown.
This is an assignment-based course. The following deliverables will determine your course grade:
| Assessment | Weight |
|---|---|
| Lab Assignment 1 | 12% |
| Lab Assignment 2 | 12% |
| Lab Assignment 3 | 12% |
| Lab Assignment 4 | 12% |
| Quiz 1 | 25% |
| Quiz 2 | 25% |
| Lecture Attendance | 2% |
LLMs, such as ChatGPT, can be helpful tools if we use them responsibly. In this course, students are permitted to use these tools to gather more information, review concepts, or brainstorm, and students must cite these tools if they use them for an assignment. However, students are not permitted to complete an assignment by copying and pasting AI-generated responses.
In this course, we will be using Stan as our inference engine along with the R package rstan. If you did not install the software last term, follow the installation instructions here.
If you have installation troubles, please seek our help as soon as possible! You can also use the #installation channel on Slack.
- ThinkBayes (Python)
- Doing Bayesian Data Analysis in brms and the tidyverse (R)
- Probabilistic Programming and Bayesian Methods for Hackers (Python)
- Introduction to Empirical Bayes: Examples from Baseball Statistics (R)
- Statistical Rethinking: A Bayesian Course with Examples in R and Stan (R)
- Quora: For a non-expert, what is the difference between Bayesian and frequentist approaches?
- Probability concepts explained: Bayesian inference for parameter estimation
- MLE and MAP video from Mike Gelbart's CPSC 340.
- Web apps for visualizing probability distributions: one, another
- How Statisticians Found Air France Flight 447 Two Years After It Crashed Into Atlantic
This course is taught in R (following the tidyverse style guide) and Stan, and it assumes a reasonable mathematical, statistical, and programming background. We strongly recommend reviewing the following courses:
- DSCI 551: Descriptive Statistics and Probability for Data Science, for basic statistical and probabilistic concepts, and familiarity with the mathematical notation.
- DSCI 552: Statistical Inference and Computation I, for statistical inference concepts with a frequentist approach.
- DSCI 561: Regression I, for ordinary least-squares (OLS).
- DSCI 562: Regression II, for generalized linear models (GLMs).
- DSCI 531: Data Visualization I, for plotting tools using the package ggplot2.
See the general MDS policies.
The course is built upon previous years' materials developed by previous instructors.
© 2025 G. Alexi Rodríguez-Arelis, Hedayat Zarkoob, Michael Gelbart, and Trevor Campbell
Software is licensed under the MIT License; non-software content is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License. See the license file for more information.