‘On the Books’: Machine Learning Jim Crow
In 1950, the scholar-activist Pauli Murray published an important book titled States’ Laws on Race and Color. Sponsored by the Women’s Division of Christian Service of the Methodist Church, this 746-page volume was a compilation of race-based laws from every state in America. The book included not just segregation laws but any law pertaining to race, including those that protected civil rights. The statutes compiled ranged from segregation laws in Alabama to Minnesota’s 1857 law banning “slavery” or “involuntary servitude in the state otherwise than the punishment of crime.”1
....
Murray was correct in noting the expansive and invasive nature of Jim Crow laws. Their sheer scope, as Murray acknowledged, led to omissions in States’ Laws on Race and Color. It was simply impossible for a researcher in the 1940s to read every single law passed by each state, as “would be necessary,” Murray wrote, “to prevent such omissions.”
But today’s machines can do just that, reading hundreds of thousands of laws to identify those pertaining to race. Seventy years and many civil rights victories later, a team of librarians and historians at the University of North Carolina at Chapel Hill has devised a way to complete a portion of Murray’s task.
On the Books: Jim Crow and Algorithms of Resistance is the first and most complete collection of all Jim Crow laws from a single American state. As much as historians have learned about Jim Crow, it has previously been impossible to fully catalogue all Jim Crow laws from a single state because there were just so many. This new collection is the product of a machine learning exercise. Over the past two years, a team of researchers has used optical character recognition to create a text corpus of all North Carolina laws enacted by the state legislature between 1866 and 1967, parsed out the individual laws, and used an algorithm to identify the laws likely to be Jim Crow laws. The algorithm was trained on laws that experts had labeled as either Jim Crow or not Jim Crow (including some of the laws identified by Pauli Murray). Over 900 race-based statutes were identified as potential Jim Crow laws. This is believed to be the most complete collection of Jim Crow laws from a single state ever assembled.
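The classification step described above, training on expert-labeled laws and then flagging likely candidates, can be sketched in a few lines of Python. This is only an illustration: the article does not say which algorithm the team used, so a simple naive Bayes word-count classifier stands in here. The class name `NaiveBayes`, the label names, and the four short training texts are invented placeholders, not project data or code.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training texts
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counts[tok] += 1
                self.vocab.add(tok)
        return self

    def log_scores(self, text):
        total_docs = sum(self.doc_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]
        scores = {}
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical, hand-written training examples standing in for expert labels.
train_texts = [
    "separate schools shall be maintained for the white and colored races",
    "no person of the colored race shall be seated in the white section of any railway car",
    "the county shall levy a tax for the repair of public roads and bridges",
    "the town commissioners may issue bonds to build a water works",
]
train_labels = ["jim_crow", "jim_crow", "other", "other"]

model = NaiveBayes().fit(train_texts, train_labels)

# Unlabeled statutes are scored; anything flagged would go to a human reviewer.
candidate = "the board shall provide separate waiting rooms for the white and colored races"
prediction = model.predict(candidate)
```

In a workflow like the one described, the model's output would be treated as a list of potential Jim Crow laws for experts to review, not as a final determination.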
Inspired by Murray, this project offers a new and compelling way to study the history of Jim Crow. As Murray knew, Jim Crow laws were extraordinarily expansive, dictating minute details of public and private life that went well beyond the most obvious forms of racial segregation in public spaces. Jim Crow itself was a poisonous, invasive social order of racial apartheid, carefully crafted over decades by white supremacist legislators.