## Multilevel regression #

This post is about what people usually mean when they say “hierarchical regression”. The term is thrown around a lot, and it’s one of those confusing terms in statistics that means different things to different people.

They don’t usually mean what statisticians refer to as hierarchical regression, that is, entering predictors into a regression in successive blocks and comparing the nested models.

What people usually mean is hierarchical models, or what statisticians call multilevel models.

This is the more interesting thing, I think.

### Rephrasing of (very well-phrased below but need a slightly more mathematical explanation and fewer words) #

The classic example is data from children nested within schools. The dependent variable could be something like math scores, and the predictors a whole host of things measured about the child and the school. Child-level predictors could be things like GPA, grade, and gender. School-level predictors could be things like total enrollment, private vs. public status, and mean SES. Because multiple children are measured from the same school, their measurements are not independent. Hierarchical modeling takes that into account.
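To make the non-independence concrete, here is a minimal sketch that simulates children nested within schools under a random-intercept model, $y_{ij} = \mu + u_j + \epsilon_{ij}$, and estimates the intraclass correlation (ICC), the share of variance that sits at the school level. Everything in it is an assumption made for illustration: the number of schools, the class size, the two standard deviations and the grand mean of 70 are invented, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and variances, chosen only for illustration.
n_schools, n_children = 200, 30
sd_school, sd_child = 5.0, 10.0

# Random-intercept model: score = grand mean + school effect + child noise.
school_effect = rng.normal(0.0, sd_school, size=n_schools)
child_noise = rng.normal(0.0, sd_child, size=(n_schools, n_children))
scores = 70.0 + school_effect[:, None] + child_noise

# One-way ANOVA estimate of the ICC for a balanced design.
school_means = scores.mean(axis=1)
ms_between = n_children * school_means.var(ddof=1)
ms_within = scores.var(axis=1, ddof=1).mean()
var_between = max((ms_between - ms_within) / n_children, 0.0)
icc = var_between / (var_between + ms_within)

# ICC implied by the simulation settings.
true_icc = sd_school**2 / (sd_school**2 + sd_child**2)
print(f"estimated ICC = {icc:.3f}, implied ICC = {true_icc:.3f}")
```

An ICC well above zero means two children from the same school are more alike than two children picked at random, which is exactly the dependence that a regression treating every child as an independent observation ignores.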

### Fully pooled #

Using features from all levels on the individual observation “requires dramatically different ranges for the explanatory variables to produce reliable coefficients” (https://books.google.co.il/books?id=%5FY1SAAAAQBAJ&pg=PA15&lpg=PA15&dq=multilevel+model+elections&source=bl&ots=vF8maGschO&sig=ACfU3U0ZD9lF2J1W02xBaXBW5Hv1UoGoTA&hl=en&sa=X&ved=2ahUKEwigxOeF1sDqAhUisaQKHVXXA50Q6AEwEXoECAoQAQ#v=onepage&q=multilevel%20model%20elections&f=false).

#### individual level #

• age
• gender
• income quintile
• rural/small town/suburban/city

#### district level #

• number of seats elected in the district (PR)

#### election level #

• number of parties
• number of seats

## Copy about elections I don’t think I’ll use #

An example a little closer to my interests is a vastly simplified version of the 538 model for American presidential elections. More traditional “fundamentals” models take several country-level features, like GDP growth, unemployment, social polarisation and the presence or absence of civil unrest, to predict the national popular vote. A multilevel regression can instead take advantage of the similarities between states to “smooth over” the low volume of polling in smaller states, making a poll-based model potentially more accurate, particularly in the United States, where lots of high-quality polls are conducted regularly.

### Formal setup

To drastically simplify the task of modelling an election, suppose we want to model the percentage $y$ of registered American voters who will vote for Donald Trump in the upcoming 2020 election in each of the fifty states. We have a small list of important features for each state:

• average household income ($\iota$)
• % of white people ($\omega$)
• % of Black people ($\lambda$)
• % of Hispanic people ($\eta$)
• % of religious people ($\rho$)
• % of registered voters who voted for Donald Trump in 2016 ($\tau$)

Then we can fit a multiple linear regression to obtain coefficients $\beta_0,\beta_1,\beta_2,\beta_3,\beta_4,\beta_5,\beta_6$ such that for each state,

$y = \beta_0 + \beta_1\iota + \beta_2\omega + \beta_3\lambda + \beta_4\eta + \beta_5\rho + \beta_6\tau + \epsilon$

where $\epsilon$ represents some normally distributed error (of course, we are making a number of assumptions about the data here).
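As a rough sketch of what fitting this regression looks like in practice, the snippet below estimates $\beta_0,\dots,\beta_6$ by ordinary least squares with NumPy. The data are entirely synthetic: the six feature columns are standardised random stand-ins for $\iota,\omega,\lambda,\eta,\rho,\tau$, and the "true" coefficients are made-up numbers used only to generate $y$. None of it is real state data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_states = 50

# Synthetic, standardised stand-ins for the six state features
# (income, % white, % Black, % Hispanic, % religious, 2016 Trump vote).
X = rng.normal(size=(n_states, 6))

# Hypothetical coefficients beta_0..beta_6, used only to simulate y.
beta_true = np.array([45.0, 1.5, 3.0, -2.0, -1.0, 2.5, 6.0])

# Design matrix with a leading column of ones for the intercept beta_0.
A = np.column_stack([np.ones(n_states), X])
y = A @ beta_true + rng.normal(0.0, 1.0, size=n_states)

# Ordinary least squares: choose beta to minimise ||y - A beta||^2.
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
residuals = y - A @ beta_hat

for name, b in zip(["b0", "b1", "b2", "b3", "b4", "b5", "b6"], beta_hat):
    print(f"{name} = {b:.2f}")
```

With only fifty observations and seven coefficients the estimates are noisy, and this fit treats every state as an independent draw with no grouping structure at all.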

Problems with this example:

• there isn’t any intuitive reason to expect the regions to have an impact like the physicians do in the example PDF
• we aren’t using polls, so we are essentially back to the kind of “fundamentals” model that I am deriding
• need to think a bit more here, read a few more examples

There is a common grouping of American states into four regions: the Northeast, the Midwest, the South and the West.
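
One way this regional grouping might enter the model, purely as a sketch, is a varying intercept: each state $s$ belongs to a region $r(s)$, and each region shifts the baseline by its own amount $u_{r(s)}$. The subscript $s$, the region effects $u_r$ and the variance $\sigma_u^2$ are notation introduced here for illustration. The other symbols are the ones from the regression above.

$y_s = \beta_0 + u_{r(s)} + \beta_1\iota_s + \beta_2\omega_s + \beta_3\lambda_s + \beta_4\eta_s + \beta_5\rho_s + \beta_6\tau_s + \epsilon_s, \qquad u_r \sim N(0, \sigma_u^2)$

If $\sigma_u^2$ is estimated to be near zero, the regions add nothing and the model collapses back to the plain regression, which is one way to check whether the grouping earns its place.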