Course
LLMs for Social Science
Language models are reshaping how social scientists collect, annotate, and analyze text. But treating them as black-box APIs leaves enormous potential on the table and introduces risks you cannot diagnose without understanding what happens inside.
This course takes you from the mathematical foundations of word embeddings and transformers through to building autonomous research agents, giving you the conceptual depth and hands-on skills to use LLMs critically and creatively in your own work.
What You Will Learn
Module 1
Foundations
From bag-of-words to embeddings and tokenization, then contextual representations, language modeling, the Transformer, scaling laws, and how this stack shows up in social science workflows—the conceptual spine for the rest of the course.
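To make the starting point concrete, here is a minimal sketch of a bag-of-words representation using scikit-learn's CountVectorizer; the example sentences are invented, and the course notebooks may use different tooling.

```python
# Minimal bag-of-words sketch: documents become rows of word counts.
# Example sentences are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "the parliament debated the budget",
    "the budget debate divided parliament",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)           # sparse document-term matrix

print(vectorizer.get_feature_names_out())    # the learned vocabulary
print(X.toarray())                           # one count vector per document
```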
Module 2
From Models to Tools
Post-training alignment (SFT, RLHF, DPO), prompting from zero-shot to chain-of-thought, reasoning techniques, and how to evaluate models and choose one that fits your task.
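As a small taste of zero-shot prompting, the sketch below asks a chat model to label a single sentence. It assumes the OpenAI Python SDK and an API key in the environment; the model name, label set, and prompt wording are placeholders rather than course defaults, and any chat-completion API works the same way.

```python
# Zero-shot annotation sketch: ask a chat model to label one sentence.
# Assumes the OpenAI Python SDK and OPENAI_API_KEY in the environment;
# the model name, labels, and prompt wording are placeholders.
from openai import OpenAI

client = OpenAI()

sentence = "The minister promised sweeping tax cuts before the election."
prompt = (
    "Classify the sentence's topic as one of: economy, health, environment, other.\n"
    f"Sentence: {sentence}\n"
    "Answer with the label only."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",                      # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0,                            # keep labels as stable as possible
)
print(response.choices[0].message.content)
```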
Module 3
Deploying for Research
Fine-tuning (including LoRA-style adaptation and encoder heads), moving from notebooks to APIs and self-hosted inference, and rigorous validation—so a classifier works as a measurement instrument you can defend in publication.
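As an illustration of LoRA-style adaptation, the sketch below wraps an encoder classifier so that only small adapter matrices are trained. It assumes the Hugging Face transformers and peft libraries; the model name, label count, rank, and target modules are placeholders you would adapt to your own task.

```python
# LoRA-style adaptation sketch: train small adapters instead of the full encoder.
# Assumes the transformers and peft libraries; model name, label count,
# rank, and target modules are placeholders.
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=3              # e.g. a three-way stance label
)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                      # adapter rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query", "value"],        # attention projections to adapt
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()            # a small fraction of the full model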
Module 4
Extraction, Summarization, and RAG
Faithful extraction and summarization (hallucination, omission, and distortion), then retrieval-augmented generation: chunking, embeddings, retrieval, and grounded answers with provenance checks over your own corpora.
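The retrieval step at the heart of RAG can be sketched in a few lines: embed the chunks, embed the question, and keep the closest chunks as grounded context with a traceable source. The sketch assumes the sentence-transformers library; the chunk texts and model name are placeholders.

```python
# Minimal retrieval sketch for RAG: embed chunks, embed the question,
# return the closest chunk to ground an answer. Assumes sentence-transformers;
# chunk texts and the model name are placeholders.
from sentence_transformers import SentenceTransformer, util

chunks = [
    "Turnout in the 2019 election rose among first-time voters.",
    "The committee report criticised the ministry's procurement process.",
    "Interview 12 describes distrust of local housing authorities.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")   # small embedding model
chunk_vecs = model.encode(chunks, convert_to_tensor=True)

question = "What did respondents say about housing officials?"
query_vec = model.encode(question, convert_to_tensor=True)

scores = util.cos_sim(query_vec, chunk_vecs)[0]   # cosine similarity per chunk
best = scores.argmax().item()
print(chunks[best], float(scores[best]))          # the chunk to cite as provenance
```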
Module 5
Agentic Workflows
Building autonomous research agents with tool use, ReAct patterns, and multi-step orchestration: the frontier of what LLMs can do for your research pipeline.
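As a sketch of the control flow, a ReAct loop alternates model reasoning with tool calls until the model produces a final answer. Below, `call_model` is a hypothetical stand-in for any chat-completion request, and the single tool is a toy word counter; the Thought / Action / Observation / Final Answer text protocol follows the ReAct pattern.

```python
# ReAct-style loop sketch: the model alternates reasoning with tool calls
# until it emits a final answer. `call_model` is a hypothetical stand-in
# for a real chat-completion request; the single tool is a toy word counter.
def word_count(text: str) -> str:
    """Toy tool: count words in a document snippet."""
    return str(len(text.split()))

TOOLS = {"word_count": word_count}

def call_model(transcript: str) -> str:
    """Hypothetical LLM call; replace with a real chat-completion request."""
    raise NotImplementedError

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = call_model(transcript)          # model writes Thought / Action / Final Answer
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        if "Action:" in step:                  # e.g. "Action: word_count[some text]"
            action = step.split("Action:", 1)[1].strip()
            name, _, arg = action.partition("[")
            observation = TOOLS[name.strip()](arg.rstrip("]"))
            transcript += f"Observation: {observation}\n"
    return "No final answer within the step budget."
```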
Prerequisites
Beginner-to-intermediate Python. A Google account for Colab. No prior deep learning or NLP experience required; we build from fundamentals.
Preliminary notebooks →
Materials
All exercises are open-source Jupyter notebooks. Clone the repo, open in Colab, and follow along.
GitHub Repository →
Begin the Course
Start with Module 1: the mathematical and conceptual foundations that everything else builds on.
Begin →
2026 Dates & Locations
Oxford
March 23–27, 2026
5-day intensive. Department of Politics and International Relations (DPIR), University of Oxford.
ESSCA Paris
April 7, 2026
1-day workshop.
EUI Florence
April 20–21, 2026
2-day workshop. European University Institute.