Research
Synthetic data generation, opinion prediction, annotator disagreement, and model evaluation. Code and datasets are open source on GitHub and Hugging Face.
Three intertwined questions sit underneath most of what we publish, and most of what we teach.
Can LLMs generate respondent panels that pass standard quality checks? When they do not, where do they fail, and can we measure it before we use them?
How well do open-weights models recover real survey responses across languages, age cohorts, and political contexts, and what drives the gaps?
How do we build eval sets that survive contamination, treat annotator disagreement as a measurement rather than a problem, and report confidence honestly?
Replication code ships with the paper, not six months after. Model checkpoints are released under permissive licences whenever we control the training data.
Code and evaluation tooling for our research projects, including harnesses for running LLMs at survey scale. All repositories are public.
Open on GitHub →
Model checkpoints and datasets from our research on synthetic data generation and opinion prediction.
Open on Hugging Face →