Multilevel regression with poststratification (MRP) #
Multilevel regression with poststratification (MRP) is a statistical technique widely used in modern election forecasting models. It takes demographic data from a source accurate enough to be treated as the truth about the target population (usually a census) and uses it to reweight survey results, for example to turn a national-level poll into state- or constituency-level estimates.
Political polls are the running example throughout.
Multilevel regression #
I wrote a post about this here, and elections are indeed a very natural use case for the multilevel regression idea, even before trying poststratification. In that post the response variable was the cost of a certain medical treatment, modelled using the patient’s age and, as a categorical variable, the doctor who treated them. In the election setting, a national poll records key demographic variables alongside each respondent’s voting intention. This can lead to insights like “80% of Black American voters plan to vote for Biden, but only 45% of white American voters do”.
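To give a feel for what the multilevel part buys you, here is a minimal sketch of partial pooling: each group's raw poll proportion is pulled toward the overall rate, and small groups are pulled harder than large ones. This is a simple shrinkage formula standing in for a fitted multilevel regression, not the real thing. The function name `partial_pool`, the fixed `prior_strength` and all the poll counts are made up for illustration; a real multilevel model would estimate the amount of shrinkage from the data.

```python
def partial_pool(yes, n, overall_rate, prior_strength=20.0):
    """Shrink a group's raw proportion toward the overall rate.

    prior_strength is a hypothetical tuning constant: it acts like that many
    extra "pseudo-respondents" who answer at the overall rate.
    """
    return (yes + prior_strength * overall_rate) / (n + prior_strength)


# Hypothetical poll counts: (respondents backing the candidate, group size).
groups = {
    "large_group": (450, 1000),
    "small_group": (8, 10),
}

total_yes = sum(yes for yes, _ in groups.values())
total_n = sum(n for _, n in groups.values())
overall = total_yes / total_n

for name, (yes, n) in groups.items():
    raw = yes / n
    pooled = partial_pool(yes, n, overall)
    print(f"{name}: raw={raw:.3f} pooled={pooled:.3f}")
```

The large group's estimate barely moves, while the small group's 8-out-of-10 result is pulled a long way back toward the overall rate. That is the behaviour you want when some demographic cells contain only a handful of respondents.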
Poststratification #
Poststratification is the process of using census data to “stratify” the national polls down to a state or constituency level. For example, you may know that a certain American state’s population is 40% Black and 30% white, and use this (and potentially other features) to infer the state’s overall Biden / Trump margin.
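The arithmetic behind this step is just a population-weighted average of the group-level estimates. Here is a minimal sketch using the figures from the text (80% and 45% Biden support; a state that is 40% Black and 30% white). The remaining 30% "other" group and its 60% support figure are invented purely so the shares add up, and the function name `poststratify` is mine.

```python
def poststratify(cell_support, cell_share):
    """Weight each demographic cell's estimated support by its population share."""
    total_share = sum(cell_share.values())
    weighted = sum(cell_support[cell] * share for cell, share in cell_share.items())
    return weighted / total_share


# Estimated Biden support per group, from the national poll.
support = {
    "black": 0.80,
    "white": 0.45,
    "other": 0.60,  # hypothetical figure, not from the text
}

# The state's demographic make-up, from the census.
state_share = {
    "black": 0.40,
    "white": 0.30,
    "other": 0.30,  # hypothetical remainder so the shares sum to 1
}

state_estimate = poststratify(support, state_share)
print(f"Estimated Biden support in the state: {state_estimate:.1%}")
```

With these numbers the state-level estimate comes out at about 63.5%. In a real MRP model the cells would be much finer (age by sex by education by ethnicity, and so on), and the per-cell support would come from the multilevel regression rather than from raw poll averages.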
Links and resources #
- A really detailed description of the model British market research firm Survation used to very accurately predict the results of the 2019 UK parliamentary elections.
- Very good survey paper with a running example of a US Presidential election.
- An example using Australian national polls, census data and poststratification to build state-level estimates.