In this presentation I'll discuss two numerical approaches to solving political boundaries while striving to avoid gerrymandering.
This is mainly an exploration, and since the final results can be finnicky, this analysis should be taken with a grain of salt...
Slides, code, and images are available here if you want to follow along.
or
We're talking about two different numerical methods that obtain "close-enough" solutions.
Closed form solutions aren't always possible to obtain, so we can get one that's numerically close to solving our hard problem.
# stdlib imports
import sys # exits and calls
import os # path manipulation
import argparse # argument handling
import math # exponent - faster than numpy for this
import random # built-in random func
import re # dynamically pull out algo names
import queue # used for SCC labelling
# typing
from typing import Union
# third-party imports
import numpy as np # heavy lifting
import matplotlib.pyplot as plt # visualization
import matplotlib.animation as animation # animation
from moviepy import editor # More gif
from tqdm import tqdm # Progress bars are nice
from halo import Halo # Spinner
How can this be applied to political boundaries?
Assumptions:
D R D R D R R R
D D R D R R R R
D D D R R R R R
D D R R R R D R
R R D D D R R R
R D D D D D R R
R R R D D D D D
D D D D D D R D
Which can be plotted for readability.
Our first big problem is how we find neighbors of a single point. For any (y, x)
pair we can express its neighbors using the following algorithm.
How do we determine valid solutions?
The basic algorithm is as follows.
(y, x)
pair.For both these algorithms we talk about their "value", which in this case is determined with a fitness function.
A fitness function is a particular type of objective function that is used to summarise, as a single figure of merit, how close a given design solution is to achieving the set aims. (wikipedia)
TL;DR a single number that basically tells us how "good" of a solution we have.
Taking a step back from the code and considering the real world, let's think about what we'd ideally like to emphasize in a political districting system.
R
or all D
.R
, we want 50% R
majority districtsWe'd want to avoid gerrymandering
We want all districts to be around the same population size.
Translated to our code these priorities become
R
to D
majority districts matches the ratio of R
to D
in the general population.total_size / num_districts
.This algorithm is very straightforward.
The fill algorithm is also straightforward.
i
, from the list of available districts.i
.Recall:
Simulated Annealing relies on "mutating" solutions via the following algorithm.
Which can be visualized as follows.
The entire process looks like this:
Which has the following final solution.
Recall:
Which can be visualized as follows.
This process looks like this
With corresponding final solution
This is straightforward! After installing the required libraries (check the repository) just run
$ python3.6 ./politicalboundaries.py $FILE_TO_RUN
If you want to dig a little deeper, use the -h
flag to see what it can do, but
here's a short list as well.
.mp4
)README.md
) assets.I want to do more for this project but I'm limited in the time I have. I do have a couple of ideas for next steps however.
def simulated_annealing(system, numdistricts, precision, animate, makegif):
"""
Perform simulated annealing on our system with a series of progressively
improving solutions.
"""
solution = get_good_start(system, numdistricts)
history = [solution] # Keep track of our history
k = 0.25 # Larger k => more chance of randomly accepting
Tvals = np.arange(1, 1e-12, -1.0 / precision)
print(f'Running Simulated Annealing with k={k:0.03f}, alpha={1.0 / precision:0.05f}')
for i, T in tqdm(enumerate(Tvals), total=len(Tvals)):
new_solution = solution.copy() # copy our current solution
new_solution.mutate() # Mutate the copy
# TODO: Speed this up by keeping current value
dv = new_solution.value - solution.value # Look at delta of values
# If it's better, or random chance, we accept it
if dv > 0 or random.random() < math.exp(dv / (k * T)):
solution = new_solution
history.append(new_solution)
solution.count = len(Tvals)
solution.algo = 'Simulated Annealing'
print(solution)
print(solution.summary())
if animate:
animate_history(system.filename, system.matrix,
history, solution.numdistricts,
makegif)
def genetic_algorithm(system, numdistricts, precision, animate, makegif):
"""
Use a genetic algorithm to find a good solution to our district problem
"""
# Start with random initial solution space (3)
solutions = [Solution(system, numdistricts) for _ in range(3)]
for s in solutions:
s.generate_random_solution() # Initialize our solutions
top_history = [] # Keep history of our top solution from each "frame"
for i in tqdm(range(precision)):
new_solutions = []
for _ in range(10): # Create 10 children per frame
s1, s2 = np.random.choice(solutions, size=2)
# Randomly combine two parents
new_solutions.append(s1.combine(s2))
# Combine everything, giving 13 total solutions
full_solutions = new_solutions + solutions
# Keep the top 3 for next generation
solutions = [_[0] for _ in
sorted([(s, s.value) for s in full_solutions],
key=lambda tup: -tup[1])[:3]]
# Only record top from generation, and only if it's changed
if len(top_history) == 0 or solutions[0] != top_history[-1]:
top_history.append(solutions[0])
solution = top_history[-1]
solution.count = precision
solution.algo = 'Genetic Algorithm'
print(solution)
print(solution.summary())
if animate:
animate_history(system.filename, system.matrix,
top_history, solution.numdistricts,
makegif)
k
impact solution selection?¶▼ Mask : class
+__init__ : function
-__str__ : function
▼+get_labels : function
+unlabelled : function
+is_valid : function
+location : function
+make_valid : function
+overlap : function
+parse_list : function
+parse_locations : function
+size : function
▼ Solution : class
-__eq__ : function
-__getitem__ : function
+__init__ : function
-__ne__ : function
-__str__ : function
+combine : function
+copy : function
+fill : function
+generate_random_solution : function
+get_district_neighbors : function
+get_filtered_district_neighbors : function
+get_full_openspots : function
+get_neighbors : function
+get_random_openspot : function
+get_solution : function
+height : function
+is_valid : function
+majorities : function
+majority : function
+mutate : function
+show : function
+summary : function
+value : function
+width : function
▼ System : class
-__getitem__ : function
+__init__ : function
+_name_arr : function
+_read_file : function
+empty_state : function
+height : function
+stats : function
+width : function
▼+animate_history : function
+update_plot : function
▼+generate_report_assets : function
+update : function
+genetic_algorithm : function
+get_args : function
+get_good_start : function
+main : function
+simulated_annealing : function
▼ variables
FIGSIZE
OUTDIR