Bayesian Measurement Models
MIT | 17.S952 | Fall 2023 | Th 9–11 | Prof. Devin Caughey
Last updated: October 29, 2023
Overview
This course covers quantitative measurement from a Bayesian perspective. It focuses on the specification of measurement models linking observed data (i.e., manifest indicators) to unobserved constructs (i.e., latent variables) of interest. For estimation of these models, we will rely primarily on the Bayesian software environment Stan, as called from R, though we will occasionally touch on other R-based methods. The goal is to get students comfortable specifying and estimating “bespoke” measurement models tailored for particular applications. The course applies this basic framework to a wide range of problems and topics, including hierarchical models, factor analysis, item response theory, latent class analysis, ecological inference, network data, and text analysis. Each is covered only in enough depth to provide a sense of what a Bayesian approach to the problem might look like. A solid command of generalized linear models and the theory of likelihood and Bayesian inference (i.e., the material in 17.804) is preferred, but the only prerequisites are 17.800 and 17.802 or their equivalent.
Contact
| Instructor | Email address | Office | Office hours |
|---|---|---|---|
| Devin Caughey | caughey@mit.edu | E53-463 | Thursdays 11–12 |
Course website: https://canvas.mit.edu/courses/21390
Materials
Texts
Readings are an important component of the course. The lectures assume detailed familiarity with the assigned texts, so before each session make sure to give them a close read. Please read them in the order in which they appear on the syllabus.
Required
We will read large parts of the following texts. All other readings will be posted on the course website or can be freely accessed online.
Suggested
We will read selections from the following texts, but each is a valuable resource that at some point you may want to read in its entirety.
Software
You will make extensive use of the following programs and packages. Some have a lengthy and somewhat cumbersome installation process, so please get started on installing them as soon as you can.
- R (https://cran.r-project.org)
- Stan (https://mc-stan.org)
- cmdstanr (https://mc-stan.org/cmdstanr)
- rstan (https://mc-stan.org/rstan/)
- brms (https://github.com/paul-buerkner/brms)
Assignments
Grades in this class are based on three components:
- Article presentation and general class participation (10%): In addition to participating productively in class discussions throughout the term, each student is expected to give one presentation summarizing an applied paper and relating it to the topics of that session.
- Homework exercises (45%): Students are expected to complete 9 weekly homework exercises, each worth 5% of their course grade. The primary focus of these assignments will be using R and Stan to implement measurement models and then interpreting and assessing the results.
- Research paper (45%): The capstone assignment of this course is a research project that employs methods covered in the class. Co-authoring is permitted but not required. The project involves several components due at the following times:
  - Session 5: Submit project idea(s), having identified potential data sources and perhaps performed some exploratory analysis (5% of course grade).
  - Session 9: With any coauthors, submit a brief description of the proposed project, including a descriptive analysis of the dataset (summary statistics, plots, etc.) and an explanation of the concepts you propose to measure and the methods you anticipate using to do so (5% of course grade).
  - Session 12 or 13: Give a short conference-style presentation of your project to the class (10% of course grade).
  - One week after Session 13: Submit the final paper, revised to incorporate feedback from presentations (25% of course grade).
Schedule
1. Thursday, September 7: Measurement
Topics
- representational vs. pragmatic measurement
- measurement models
- measurement error, validity, and reliability
- consequences of mismeasurement
- fair measurement
Required readings
Additional resources
- Hand (2016)
2. Thursday, September 14: Bayesian inference
TODO: Install software
Topics
- Fundamentals of Bayesian statistics
- Stan and brms
- The workflow of Bayesian inference
- Bayesian regression
- Regression as measurement
Required readings
Additional resources
- Jackman (2009), chap. 1–2
3. Thursday, September 21: Latent variables and response models
TODO: Submit problem set #1 (Stan and brms basics)
Topics
- Generalized linear models
- Latent variables
- Bradley–Terry models
Required readings
Additional resources
4. Thursday, September 28: Hierarchical models
TODO: Submit problem set #2 (Bradley–Terry models)
Topics
- Hierarchical priors
- Hierarchical/multilevel GLMs
- Multilevel regression and poststratification (MRP)
Required readings
Additional resources
5. Thursday, October 5: Scale measurement with metrical indicators
TODO: Submit problem set #3 (MRP)
Topics
- Unsupervised measurement
- GLLAMM general factor model
- Bayesian factor analysis
- Principal components
- Identification with one factor
Required readings
Additional resources
6. Thursday, October 12: Scale measurement with categorical indicators
TODO: Submit problem set #4 (factor analysis)
Topics
- Item response theory
- Spatial models of choice
- Mixed factor models
- Identification in multiple dimensions
Required readings
Additional resources
7. Thursday, October 19: Structural models
TODO: Submit problem set #5 (item response theory)
Topics
- Hierarchical latent variable models
- Dynamic linear models
- Forecasting
- Estimating effects on latent variables
Required readings
Additional resources
8. Thursday, October 26: Model specification (discussion session)
TODO: Submit problem set #6 (dynamic models)
Topics
- Differential item functioning
- Joint scaling
- Dimension selection
- Latent variables as covariates
- Bayesian causal inference
Required readings
Additional resources
- King et al. (2004)
9. Thursday, November 2: Class measurement
TODO: Submit descriptive analysis of research project
Topics
- Clustering
- Gaussian mixture models
- Latent class models
- Hidden Markov models
Required readings
- Lauderdale (2022), chap. 13 and 14
- https://mc-stan.org/users/documentation/case-studies/Latent_class_case_study.html
- Fowler et al. (2023)
- Broockman and Lauderdale (2023)
Additional resources
- https://mc-stan.org/users/documentation/case-studies/bball-hmm.html
- Park (2010)
- Bertsou and Caramani (2022)
10. Thursday, November 9: Missing and mismeasured data
TODO: Submit problem set #7 (mixture models)
Topics
- Errors in variables
- Propagating uncertainty
- Multiple (over)imputation
- Joint modeling
Required readings
Additional resources
- Knox, Lucas, and Cho (2022)
11. Thursday, November 16: Ecological inference
TODO: Submit problem set #8 (measurement error)
Topics
- The ecological fallacy
- Ecological inference models
- Dynamic and hierarchical EI
- Combining aggregate and individual-level data
Required readings
Additional resources
- Glynn and Wakefield (2010)
Thursday, November 23: NO CLASS (Thanksgiving)
12. Thursday, November 30: Network analysis
Topics
- Network concepts
- Exponential random graph models
- Latent space/factor models
- Additive and multiplicative effects (AMEN) models
Required readings
Additional resources
13. Thursday, December 7: Text analysis
TODO: Submit problem set #9 (AMEN models)
Topics
- Text as data
- Naive Bayes models
- Latent Dirichlet allocation
- Structural topic models
- Word embeddings
Required readings
Additional resources
- Silge and Robinson (2017)