Quantitative gerrymandering
Posted with : Data science
According to FiveThirtyEight, Democrats need to win the popular vote by a 5.6% margin in order to take 50% of the seats in the House of Representatives this November. This margin is partly due to the advantage Republicans have because they already control the House, partly due to a natural advantage Republicans have for geographic reasons, and partly due to gerrymandering. How much of it is accidental and how much is due to gerrymandering? What techniques do we have to identify and mitigate gerrymandering? These questions were part of the focus of last week’s Quantitative Redistricting workshop at Duke University, organized by SAMSI. I had the good fortune to attend this workshop, and I am writing a summary of some of the interesting ideas presented.
Definition of gerrymandering.
Something I learned during Daniel Magleby’s talk is that there exists a legal definition of gerrymandering (signed off by Justice Scalia):
“A name given to the process of dividing a state or other territory into the authorized civil or political divisions, but with such a geographical arrangements to accomplish a sinister or unlawful purpose, as, for instance, to secure a majority for a given political party in districts where the result would be otherwise if they were divided according to obvious natural lines…”
The key elements of this definition are
- sinister purpose,
- partisan bias,
- that wouldn’t occur naturally.
An immediate question is what would happen naturally, in the absence of partisan bias? This is the fundamental question we will partially address.
Map ensembles for detecting gerrymandering
Depending on the state, admissible maps should have districts satisfying a list of desirable properties including
- same population (one person one vote),
- contiguity and compactness,
- preservation of communities of interest,
- compliance with the Voting Rights Act.
If one could consider all the maps satisfying properties (1)-(4), one could say what admissible maps typically look like in terms of different metrics, such as election outcomes or the distribution of voters' political affiliation per district. Unfortunately, the set of all admissible maps is too large to enumerate. In order to find an approximate solution, researchers construct an ensemble of maps (say 25,000 maps) that behaves as a representative sample of the (huge and inscrutable) set of all admissible maps.
The research groups of Greg Herschlag and Jonathan Mattingly, Kosuke Imai, and Jowei Chen use different strategies to produce such map ensembles, and use them to spot “atypical” (gerrymandered?) maps.
These are examples from a map ensemble for North Carolina Congressional Districts, from Jonathan Mattingly’s talk:
Disclaimer: The ensemble maps are used only for comparison; the researchers do not suggest enacting any of these maps.
Once they have the distribution of typical maps, they can analyze the properties of the current map in this context. Mattingly’s group produces very nice histograms showing, for a specific election, how many seats each party gets under each map in their ensemble. The following plot, from a paper by Mattingly’s group, shows that under typical maps Democrats would have won between 4 and 6 of North Carolina’s congressional seats, based on the 2016 election votes alone. However, the plan enacted in 2016 would have given them only 3 seats, which is very atypical.
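To make the ensemble comparison concrete, here is a minimal sketch on a toy state. Everything in it is an assumption for illustration: the state is a line of 12 precincts, districts are contiguous runs of 3 to 5 precincts, the vote counts are made up, and the "ensemble" is simply every admissible plan (feasible only because the toy is tiny; real ensembles are sampled). It is not the method used by any of the groups above.

```python
from collections import Counter
from itertools import product


def contiguous_plans(n_precincts, n_districts, min_size, max_size):
    """Yield every split of a line of precincts into contiguous districts.

    A plan is a tuple of district sizes that sums to n_precincts.
    """
    for sizes in product(range(min_size, max_size + 1), repeat=n_districts):
        if sum(sizes) == n_precincts:
            yield sizes


def dem_seats(sizes, dem_votes, rep_votes):
    """Count the districts in which Democrats out-poll Republicans."""
    seats, start = 0, 0
    for size in sizes:
        d = sum(dem_votes[start:start + size])
        r = sum(rep_votes[start:start + size])
        seats += int(d > r)
        start += size
    return seats


def seat_histogram(plans, dem_votes, rep_votes):
    """Map 'number of Democratic seats' to 'number of plans producing it'."""
    return Counter(dem_seats(p, dem_votes, rep_votes) for p in plans)


def lower_tail_share(hist, enacted_seats):
    """Share of ensemble plans giving Democrats at most enacted_seats."""
    total = sum(hist.values())
    return sum(c for s, c in hist.items() if s <= enacted_seats) / total


# Made-up precinct-level votes for one election (hypothetical data).
dem_votes = [60, 55, 40, 45, 70, 30, 35, 65, 50, 45, 55, 40]
rep_votes = [40, 45, 60, 55, 30, 70, 65, 35, 50, 55, 45, 60]

plans = list(contiguous_plans(12, 3, min_size=3, max_size=5))
hist = seat_histogram(plans, dem_votes, rep_votes)

enacted = (5, 4, 3)  # hypothetical "enacted" plan, given as district sizes
share = lower_tail_share(hist, dem_seats(enacted, dem_votes, rep_votes))
print(sorted(hist.items()), share)
```

A small `share` would mean that few plans in the ensemble treat Democrats as badly as the enacted plan, which is the kind of atypicality the histograms are meant to expose.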
Detecting gerrymandering without ensembles
Wes Pegden and collaborators developed a statistical test that shows a map is an outlier without the need to sample the set of admissible maps. The idea is to look only at the map’s neighbors using MCMC. Intuitively, if a map is typical and it is modified by a sequence of random, reversible changes, it is very unlikely that the changes move it consistently in one direction (one expects some ups and downs since the changes are random and reversible).
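Here is a rough sketch of how such a test can be run. As I understand the published result (Chikina, Frieze and Pegden), if the starting map ranks in the most extreme ε fraction of the maps visited by a reversible chain started from it, one can report a p-value of √(2ε). The toy chain below is my own assumption for illustration: precincts sit on a line, a state is a tuple of cut positions, and a step moves one random cut by one precinct, rejecting moves that make a district too small or too large. The votes and the starting plan are made up.

```python
import math
import random


def outlier_p_value(start, step, score, n_steps, rng):
    """Walk n_steps from `start`; return (epsilon, sqrt(2 * epsilon)).

    epsilon is the share of visited states, start included, whose score is
    at most the score of the starting state.
    """
    s0 = score(start)
    state = start
    as_low = 1  # the starting state counts itself
    for _ in range(n_steps):
        state = step(state, rng)
        if score(state) <= s0:
            as_low += 1
    eps = as_low / (n_steps + 1)
    return eps, min(1.0, math.sqrt(2 * eps))


def make_step(n_precincts, min_size, max_size):
    """Build a symmetric-proposal step, so the chain is reversible."""
    def sizes(cuts):
        edges = (0, *cuts, n_precincts)
        return [b - a for a, b in zip(edges, edges[1:])]

    def step(cuts, rng):
        i = rng.randrange(len(cuts))
        delta = rng.choice((-1, 1))
        proposal = cuts[:i] + (cuts[i] + delta,) + cuts[i + 1:]
        if all(min_size <= s <= max_size for s in sizes(proposal)):
            return proposal
        return cuts  # reject: stay where we are

    return step


def make_score(dem, rep):
    """Score a plan by the number of districts Democrats win."""
    n = len(dem)

    def score(cuts):
        edges = (0, *cuts, n)
        return sum(int(sum(dem[a:b]) > sum(rep[a:b]))
                   for a, b in zip(edges, edges[1:]))

    return score


# Made-up precinct-level votes and a hypothetical starting plan.
dem = [60, 55, 40, 45, 70, 30, 35, 65, 50, 45, 55, 40]
rep = [40, 45, 60, 55, 30, 70, 65, 35, 50, 55, 45, 60]
step = make_step(len(dem), min_size=3, max_size=5)
score = make_score(dem, rep)

eps, p = outlier_p_value((5, 9), step, score, 10_000, random.Random(0))
print(eps, p)
```

Because the score here takes only a few integer values, ties are counted against the starting map, which makes the reported p-value conservative. A real application would use a contiguity-preserving chain on an actual precinct graph.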
The signature of gerrymandering
A common misconception is that gerrymandering can be identified by looking for bizarre-looking districts, like Goofy kicking Donald Duck, or the 1st and 12th congressional districts of the 2012 North Carolina Congressional plan (that were ruled unconstitutional by SCOTUS in Common Cause v. Rucho).
However, bizarre-looking districts can be justifiable, and compact-looking districts can be strongly biased, as in the 2016 North Carolina Congressional plan.
Gerrymandering is based on the principle of cracking and packing (when the favored party loses, it loses by a large margin, and when it wins, it wins by a safe but not overwhelming majority), which produces a disparity in the number of wasted votes. In fact, some disparity in the margins of victory between parties happens naturally due to the inefficient spatial distribution of voters. Jonathan Rodden’s talk analyzed the consequences of this asymmetric spatial distribution.
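One common way to put a number on the disparity in wasted votes is the efficiency gap. It was not necessarily the measure used in the talks, so take this as an illustrative sketch under the usual simplification: a losing party wastes all its votes in a district, and a winning party wastes every vote beyond half of the two-party total. The district vote counts are made up.

```python
def wasted_votes(dem, rep):
    """Return (wasted_dem, wasted_rep) for one two-party district."""
    if dem == rep:
        raise ValueError("tied districts are not handled in this sketch")
    needed = (dem + rep) / 2  # simplified winning threshold
    if dem > rep:
        return dem - needed, rep
    return dem, rep - needed


def efficiency_gap(districts):
    """(wasted Dem votes - wasted Rep votes) / total votes.

    Positive values mean Democrats waste more votes than Republicans.
    """
    wasted_dem = wasted_rep = total = 0
    for dem, rep in districts:
        wd, wr = wasted_votes(dem, rep)
        wasted_dem += wd
        wasted_rep += wr
        total += dem + rep
    return (wasted_dem - wasted_rep) / total


# Hypothetical plan: Democrats are cracked in four districts, packed in one.
districts = [(45, 55), (45, 55), (45, 55), (45, 55), (80, 20)]
gap = efficiency_gap(districts)
print(round(gap, 2))  # 0.34
```

In this toy plan Democrats hold 52% of the votes but win one seat out of five: they waste 210 votes against 40 for Republicans, out of 500 cast.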
This property can be nicely visualized in the following plot (also by Mattingly’s group).
On the x-axis are the North Carolina congressional districts, ordered from the smallest to the largest share of Democratic voters. On the y-axis is the fraction of Democratic voters. There is a clear jump showing that Democrats had been unnaturally packed in districts 1, 4 and 12. The map proposed by the judges does not exhibit this property.
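The data behind this kind of plot is easy to compute. The sketch below, on made-up district totals, orders districts by Democratic vote share and reports the largest jump between consecutive districts. The actual plots overlay the enacted plan on the spread of the ensemble, which this sketch leaves out.

```python
def ordered_dem_shares(dem, rep):
    """Democratic vote share per district, sorted from least to most."""
    return sorted(d / (d + r) for d, r in zip(dem, rep))


def largest_jump(shares):
    """Return (index, size) of the biggest gap between consecutive shares."""
    gaps = [b - a for a, b in zip(shares, shares[1:])]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    return i, gaps[i]


# Hypothetical district totals: four cracked districts and one packed one.
dem = [45, 45, 80, 45, 45]
rep = [55, 55, 20, 55, 55]

shares = ordered_dem_shares(dem, rep)
index, jump = largest_jump(shares)
print(shares, index, jump)
```

A large jump between the last competitive district and the first safe one is the signature described above: the favored party's districts sit comfortably above 50%, and the other party's voters are concentrated in a few lopsided districts.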
Some maps exhibit more cracking than packing. How to identify them separately was the subject of Magleby’s talk.
Unbiased maps
So far we are able to spot biased maps. But can we produce good maps? Several ideas for producing good maps were presented, using techniques based on k-means clustering and power diagrams (Philip Klein), optimal transport (Néstor Guillen), 2-party protocols (Gerdus Benade) and even machine learning (Soledad Villar).
However, the redistricting problem is a political problem with social consequences. It’s not just a mathematical problem (even though it has inspired a lot of cool math, see Dustin Mixon’s talk).
Understanding the problem from a mathematical point of view gives us tools to analyze current maps and to compare them to typical or geographically optimal maps. The underlying optimization problem can be hard, but the hardest question is not a mathematical one: what is the objective we should be optimizing for?
One of my favorite talks of the workshop was the one by Will Adler, not only because he has very cool open source software to draw and visualize districts, but also because he is working for an initiative to move the redistricting task from the legislature to an independent commission in Virginia.