A blast from the AI past: Perceptrons

I had not been all that interested in learning about perceptrons, even though the perceptron is known as an ancestor of present-day machine learning.

That changed when I read an account that said the big names in AI in the 1960s were convinced that symbolic AI was the road to glory, and that their misplaced confidence smothered the development of the first systems that learned by adjusting their own internal weights rather than following hand-written rules.

Symbolic AI, also known as “good old-fashioned AI,” or GOFAI, is built with strictly programmed rules. The main applications you can produce with it are expert systems.

The original perceptron was conceived and programmed by Frank Rosenblatt, who earned his Ph.D. in 1956. According to a New York Times article published on July 8, 1958, the U.S. Office of Naval Research touted a huge IBM computer running his code as “capable of receiving, recognizing and identifying its surroundings without any human training or control.” That was hype, but the perceptron actually did receive visual information from the environment and learn from it, in much the same way as today’s ML systems do.

“At the time, he didn’t know how to train networks with multiple layers. But in hindsight, his algorithm is still fundamental to how we’re training deep networks today.”

Thorsten Joachims, professor, computer science, quoted in the Cornell Chronicle

After leading AI researchers Marvin Minsky and Seymour Papert, both of MIT, published their 1969 book Perceptrons, which essentially said perceptrons were a dead end, all the attention (and pretty much all the funding) went to symbolic AI projects and research. Symbolic AI was the real dead end, but it took 50 years for that truth to be fully accepted.

Frank Rosenblatt died in a boating accident on his 43rd birthday, according to his obituary in The New York Times. It was 1971. Had he lived, he might have trained dozens of AI researchers who could have gone on to change the field much sooner.

An excellent article about Rosenblatt’s work is Professor’s perceptron paved the way for AI — 60 years too soon, published by Cornell University in 2019.

Using machine learning to uncover racist laws

A common use of machine learning is to train a model to identify a particular kind of document, or a particular characteristic in a document — and then sort a gigantic set of documents. This produces a much-reduced subset of all documents that match the desired criteria. There might be some false positives in the subset, but it still gives researchers or journalists a big jump forward by eliminating thousands of unwanted documents.

This kind of sorting goes well beyond a simple search for keywords.
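To make the idea concrete, here is a minimal sketch of that train-then-filter workflow, using a tiny naive Bayes classifier written with only the Python standard library. Everything in it is an assumption for illustration: the training sentences are made up, the function names are hypothetical, and naive Bayes is just a simple stand-in, not the method any particular project uses.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?()\"'")
        if word:
            tokens.append(word)
    return tokens

def train(labeled_docs):
    """Count words per label (True = match, False = other)."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab

def log_odds(model, text):
    """Naive Bayes log-odds; a positive score means 'more likely a match'."""
    word_counts, doc_counts, vocab = model
    score = math.log(doc_counts[True] / doc_counts[False])
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in training add no evidence
        # Add-one (Laplace) smoothing avoids zero probabilities.
        p_match = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_other = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_match / p_other)
    return score

def filter_documents(model, docs, threshold=0.0):
    """Keep only the documents the model scores above the threshold."""
    return [doc for doc in docs if log_odds(model, doc) > threshold]

# Made-up training examples: True marks the kind of document we want to find.
TRAINING = [
    ("the county shall levy a tax on property", True),
    ("a levy on land shall be collected by the sheriff", True),
    ("the tax collected shall be paid to the treasurer", True),
    ("the town may build a bridge over the river", False),
    ("the board shall appoint a keeper of the bridge", False),
    ("a road shall be built from the town to the river", False),
]

# Made-up unlabeled documents to sort.
UNLABELED = [
    "the sheriff shall levy on land and property",
    "the town shall build a road and a bridge",
]

model = train(TRAINING)
matches = filter_documents(model, UNLABELED)
print(matches)
```

With this toy data, only the first unlabeled sentence is kept, even though it never contains the word “tax.” The model picked it up from related vocabulary such as “levy” and “sheriff,” which is the difference from a plain keyword search. A real project would need far more labeled examples and, most likely, a stronger model.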

Above: Screenshot from On the Books at lib.unc.edu

A great example has emerged from the University of North Carolina at Chapel Hill. On the Books: Jim Crow and Algorithms of Resistance is a project that includes a public plain-text collection of North Carolina laws (1866–1967) likely to be Jim Crow laws.

There is a public GitHub repo of the code used in this project. It includes a full walkthrough of the project’s workflow — data acquisition and cleaning, OCR, unsupervised and supervised classification, etc.

The base document set (the main corpus) consists of 96 volumes containing 53,515 chapters and 297,790 sections.

The project’s title pays homage to Safiya Noble’s 2018 book Algorithms of Oppression: How Search Engines Reinforce Racism.

“State-based racial segregation laws were incredibly inconvenient, irregular, and, most importantly, unconstitutional.”

—William Sturkey, Ph.D.

A historical perspective on this data collection was provided by William Sturkey, a history professor at UNC, in “On the Books”: Machine Learning Jim Crow (September 2020). He says On the Books is “the first and most complete collection of all Jim Crow laws from a single American state.” He points to the difficulty of cataloging and studying all Jim Crow laws from any state “because there were just so many.”

