Exploring artificial intelligence and machine learning
Author: Mindy McAdams
The system described in this wonderful New Yorker article from March 2021 is NOT a neural network, and that’s one of the things that make it fascinating. I’ve written before about ImageNet and how neural networks, trained on humongous datasets of labeled digital images, are able to very accurately say what is in a photograph that the system has never “seen” before.
This is different.
This system, developed by a small company in Japan, does not require hundreds or thousands of images of each object it needs to identify precisely because it doesn’t use a neural network. The technologies it uses can be called good old-fashioned AI (GOFAI). Essentially it consists of a collection of manually constructed algorithms.
The system also “learns,” but not in the typical black-box sense of today’s machine learning systems. It is widely used in the checkout systems of Japanese bakeries, which offer a bewilderingly large assortment of pastries and small bread items, many of which look quite similar to one another. BakeryScan was released in 2013; it was 15 years in development.
More recently, the bakery system has been adapted to recognize specific types of cancer cells. The new system is able to “look at an entire microscope slide and identify the cells that might be cancerous” (source: The New Yorker article).
Rather than summarizing the article further, I’m just going to urge you to read it. It’s very much worth your time.
Cassie Kozyrkov, who wrote this article, is head of decision intelligence at Google. It starts out with what looks like a standard explanation of an image-recognition system — which she deprecatingly refers to as the “the cat/not-cat task.” But don’t be fooled — Kozyrkov communicates with clear, sharp precision, and very quickly she asks us to consider circumstances in which we would want a tiger to be considered a cat and those in which we would want it to be not-cat.
This leads to a discussion of ground truth. This is “an ideal expected result” — but for whom? Well, for the people who originally built the system. Kozyrkov notes that ground truth is NOT an objective, perfect truth like something studied in a philosophy class (Truth with a capital T). It’s whether a tiger is a cat in your reality or not-cat in mine.
I am reminded of one of my favorite lines in the rock opera Jesus Christ Superstar: “But what is truth? Is truth unchanging law? We both have truths. Are mine the same as yours?”
“When such a dataset is used to train ML/AI systems, systems based on it will inherit and amplify the implicit values of the people who decided what the ideal system behavior looked like to them.”
Say you are using an existing labeled dataset — not one you yourself have created — which is often the case. The labels attached to the data items are the ground truth for that dataset. If it’s a dataset of images, and some labels applied to photos of people are racist, then that’s the ground truth in that dataset. If it’s a dataset for sentiment analysis, and a lot of toxic comments are labeled “not toxic,” then that’s the ground truth you’re adopting.
It’s essential for developers to test systems extensively to uncover these flaws in ground truth.
“You wouldn’t want to fall victim to a myopic fraud detection system with sloppy definitions of what financial fraud looks like, especially if such a system is allowed to falsely accuse people without giving them an easy way to prove their innocence.”
— Cassie Kozyrkov
In a video embedded in the same article, Kozyrkov pithily proclaims: “There are only actually two real lines there. Here’s what they are: This objective. That data set.” (At 9:16.) Of course there’s a ton more code than that (she’s talking about the programming of the system that creates the model), but in terms of what you want the system to be able to do, that’s it in a nutshell: How have you framed your objective? And what’s in your dataset? More important, in many cases, is what’s NOT in your dataset.
She says this is where the core danger in AI lies, because in traditional programming “it might take 10,000 lines of code, a hundred thousand lines of code maybe, and some human being has to worry about every single one of those lines, agonize over it.” With supervised machine learning, you’ve only got the objective and the (gigantic) dataset, and the question is, Have enough people with expertise really agonized over each of those things?
My other favorite bits from the video:
“A system that is built and designed for one purpose may not work for a different purpose.” (6:17)
“Remember that the objective is subjective.” (6:31)
“And if you take those two parts really seriously, that is how you are going to build a safe and effective and kind AI system.” (20:16)
Sometimes I read something that is like a voice out of my own head:
“Artificial intelligence is a buzzword increasingly being used by companies around the world that seek to project themselves at the forefront of cutting-edge research … As the word loses its meaning, it is important for investors to understand what artificial intelligence is and what companies stand to gain from breakthroughs in the new technology.”
That comes from an article titled “10 Best Artificial Intelligence Stocks to Buy for 2021” (link above) but it’s more than just a list of stock tips. It points out that “technology firms with social media services” (e.g. Facebook) are hot because they have the massive datasets that power machine learning about consumers. Companies that make super-fast computer hardware — particularly graphical processing units (GPUs) that crunch through that data — are also good bets (although I’ve heard about growing hardware shortages due to the pandemic).
The article’s author refers to hedge-fund investments as an indicator, which might make me leery about investing my own hard-earned cash, but the list of companies still interested me. Along with hardware manufacturers such as Micron Technology and Nvidia; Amazon, which is valuable for more than only its growing AI expertise; and Alphabet Inc., the parent of Google and DeepMind — the list also includes:
Adobe, which is “integrating data-based learning into most of its software through Adobe Sensei, a tool that uses artificial intelligence to improve user experiences across a wide range of Adobe products.”
Facebook — this is Yahoo! FInance’s No. 1 pick, and with its deep pockets, Facebook is certainly able to acquire some of the best research minds in AI today. Its efforts are grouped under the Facebook AI label, and the breadth of its work is visible on this page.
Microsoft, which “has a separate artificial intelligence unit called Microsoft AI that helps users, organizations, and governments across the world with machine learning, data analytics, robotics, and internet of things products.” Just this week, Microsoft to announced a $16 billion cash deal to buy Nuance, which develops AI software including speech-recognition products (Dragon is one). Microsoft pointed to Nuance’s position in the healthcare market as a primary reason for the acquisition.
Pinterest, because it is using AI to sort and categorize the millions of images shared by its users and also to “tailor the experiences” of users. Note, news organizations such as The New York Times are also using AI to determine how content is presented to users.
Salesforce.com, which “provides customer relationship management services and other enterprise solutions on market automation, data analytics, and application development.” The company markets its AI products under the Einstein brand — see AI use cases from the company. Salesforce acquired Slack Technologies last year.
Notably absent from the list is Apple (although maybe not a great investment, due to its high valuation), which is no newcomer to incorporating AI into its products. Critics might pooh-pooh Apple’s AI clout, but machine learning has been integral to the iPhone, iPad, and Apple Watch for years. Ars Technica published an excellent article about this in mid-2020.
Another absence is the assorted promising startups — particularly those in the climate arena and those founded by alumni of DeepMind, which to me is the most fantastic incubator of AI talent (see AlphaFold) outside the top universities. Just this week, Google put money into one of those startups — founded by a former research engineer at DeepMind, and “focused on reducing greenhouse gas emissions.”
I got my first look at spaCy, a Python library for natural language processing, near the end of 2019. I wanted to learn it but had too many other things to do. Fast-forward to now, almost 14 months into the pandemic, and I recently stumbled across spaCy’s own tutorial for learning to use the library.
The interactive tutorial includes videos, slides, and code exercises, and there is a GitHub repo. It is available in English, Deutsch, Español, Français, Português, 日本語, and 中文. Today I completed chapter 2. If you already know Python at, say, an intermediate level, check it out!
In chapter 1 (there are four chapters), I got a handle on part-of-speech tags, syntactic dependencies, and named entities. I learned that we can search on these, and also on words (tokens) related to combinations that we define. I’ve known about large-scale document searches (where a huge collection of documents is searched programmatically, usually to extract the most meaningful docs for some purpose — like a journalism investigation), and now I was getting a much better idea of how such searches can be designed.
SpaCy provides “pre-trained model packages,” meaning someone else has already done the hard work of machine learning/training to generate word vectors. There are packages of various sizes and in various languages. Loading a model provides various features (the bigger the model, the more features).
I think I was hooked as soon as I saw this and realized you could ask for all the MONEY entities, or all the ORG entities, in a document and evaluate them:
Then (still in chapter 1) I learned that I can easily define my own entities if the model doesn’t recognize the ones I need to find. I learned that if I don’t know what GPE is, I can enter spacy.explain("GPE") and spaCy will return 'Countries, cities, states' — sweet!
Then I learned about rule-based matching, and I thought: “Regular expressions, buh-bye!”
Chapter 1 didn’t really get deeply into lemmatization, but it offered this:
That was just chapter 1! Chapter 2 went further into creating your own named entities and using parts of speech as part of your search criteria. For example, if you want to find all instances where a particular entity (say, a city) is followed by a verb — any verb — you can do that. Or any part of speech. You can construct a complex pattern, mixing specific words, parts of speech, and selected types of entities. The pattern can include as many tokens as you want. (If you’re familiar with regex — all the regex things are available.)
You can determine whether phrases or sentences are similar to each other (although imperfectly).
I’m not entirely sure how I would use these, but I’m sure they’re good for something:
.root — the token that decides the category of the phrase
.head — the syntactic “parent” that governs the phrase
There is an exercise in which I matched country names and their root head token (span.root.head), which gave me a bit of a clue as to how useful that might be in some circumstances.
Also in chapter 2, I learned how to use an imported JSON file to add 240 country names as GPE entities — obviously, the imported terms could be any kind of entity.
So, I’m feeling very excited about spaCy! Halfway through the tutorial!
Machine learning systems for image recognition aren’t always perfect — and neither are AI systems marketed for medical use, whether they use image recognition or not. But here’s an example of image recognition used in a medical context where the system appears to have succeeded at something significant — and it’s something humans can’t do, or at least can’t do well.
“Researchers used the AI tool Subtype and Stage Inference (SuStaIn) to scan the MRI brain scans of 6,322 patients with MS, letting SuStaIn train itself unsupervised. The AI identified 3 previously unknown patterns …” (Pharmacy Times). The model was then tested on MRIs from “a separate independent cohort of 3,068 patients” and successfully identified the three new MS subtypes in them.
Subtype and Stage Inference (SuStaIn) was introduced in this 2018 paper. It is an “unsupervised machine-learning technique that identifies population subgroups with common patterns of disease progression” using MRI images. The original researchers were studying dementia.
Why does it matter? Identifying the subtype of the disease multiple sclerosis (MS) enables doctors to pursue different treatments for them, which might lead to better results for patients.
“While further clinical studies are needed, there was a clear difference, by subtype, in patients’ response to different treatments and in accumulation of disability over time. This is an important step towards predicting individual responses to therapies,” said Dr. Arman Eshaghi, the lead researcher (EurekAlert).
In the latest JournalismAI newsletter, a list of recommendations called “Reporting on AI Effectively” shares wisdom from several journalists who are reporting about a range of artificial intelligence and machine learning topics. The advice is grouped under these headings:
Build a solid foundation
Beat the hype
Complicate the narrative
Be compassionate, but embrace critical thinking
Karen Hao, senior AI editor at MIT Technology Review — whose articles I read all the time! — points out that to really educate yourself about AI, you’re going to need to read some of the research papers in the field. She also recommends YouTube as a resource for learning about AI — and I have to agree. I’ve never used YouTube so much to learn about a topic before I began studying AI.
The post also offers good advice about questions a reporter should ask about AI research and new developments int the field.
The Biden Administration is working hard in a wide range of areas, so maybe it’s no surprise that HHS released this report, titled Artificial Intelligence (AI) Strategy (PDF), this month.
“HHS recognizes that Artificial Intelligence (AI) will be a critical enabler of its mission in the future,” it says on the first page of the 7-page document. “HHS will leverage AI to solve previously unsolvable problems,” in part by “scaling trustworthy AI adoption across the Department.”
So HHS is going to be buying some AI products. I wonder what they are (will be), and who makes (or will make) them.
“HHS will leverage AI capabilities to solve complex mission challenges and generate AI-enabled insights to inform efficient programmatic and business decisions” — while to some extent this is typical current business jargon, I’d like to know:
Which complex mission challenges? What AI capabilities will be applied, and how?
Which programmatic and business decisions? How will AI-enabled insights be applied?
These are the kinds of questions journalists will need to ask when these AI claims are bandied about. Name the system(s), name the supplier(s), give us the science. Link to the relevant research papers.
I think a major concern would be use of any technologies coming from Amazon, Facebook, or Google — but I am no less concerned about government using so-called solutions peddled by business-serving firms such as Deloitte.
The following executive orders (both from the previous administration) are cited in the HHS document:
The department will set up a new HHS AI Council to identify priorities and “identify and foster relationships with public and private entities aligned to priority AI initiatives.” The council will also establish a Community of Practice consisting of AI practitioners (page 5).
Four key focus areas:
An AI-ready workforce and AI culture (includes “broad, department-wide awareness of the potential of AI”)
AI research and development in health and human services (includes grants)
“Democratize foundational AI tools and resources” — I like that, although implementation is where the rubber meets the road. This sentence indicates good aspirations: “Readily accessible tools, data assets, resources, and best practices will be critical to minimizing duplicative AI efforts, increasing reproducibility, and ensuring successful enterprise-wide AI adoption.”
“Promote ethical, trustworthy AI use and development.” Again, a fine statement, but let’s see how they manage to put this into practice.
I had not been all that interested to learn about perceptrons, even though the perceptron is known as an ancestor of present-day machine learning.
That changed when I read an account that said the big names in AI in the 1960s were convinced that symbolic AI was the road to glory — and their misplaced confidence smothered the development of the first systems that learned and modified their own code.
Symbolic AI is built with strictly programmed rules. Also known as “good old-fashioned AI,” or GOFAI, the main applications you can produce with symbolic AI are expert systems.
The original perceptron was conceived and programmed by Frank Rosenblatt, who earned his Ph.D. in 1956. A huge IBM computer running his code was touted by the U.S. Office of Naval Research in 1958 as “capable of receiving, recognizing and identifying its surroundings without any human training or control,” according to a New York Times article published on July 8, 1958. That was hype, but the perceptron actually did receive visual information from the environment and learn from it, in much the same way as today’s ML systems do.
“At the time, he didn’t know how to train networks with multiple layers. But in hindsight, his algorithm is still fundamental to how we’re training deep networks today.”
After leading AI researchers Marvin Minsky and Seymour Papert, both of MIT, published a book in 1969 that essentially said perceptrons were a dead end, all the attention — and pretty much all the funding — went to symbolic AI projects and research. Symbolic AI was the real dead end, but it took 50 years for that truth to be fully accepted.
Frank Rosenblatt died in a boating accident on his 43rd birthday, according to his obituary in The New York Times. It was 1971. Had he lived, he might have trained dozens of AI researchers who could have gone on to change the field much sooner.
Descriptions of machine learning are often centered on training a model. Not having a background in math or statistics, I was puzzled by this the first time I encountered it. What is the model?
This 10-minute video first describes how you select labeled data for training. You examine the features in the data, so you know what’s available to you (such as color and alcohol content of beers and wines). Then the next step is choosing the model that you will train.
In the video, Yufeng Guo chooses a small linear model without much explanation as to why. For those of us with an impoverished math background, this choice is completely mysterious. (Guo does point out that some models are better suited for image data, while others might be better suited for text data, and so on.) But wait, there’s help. You can read various short or long explanations about the kinds of models available.
It’s important for the outsider to grasp that this is all code. The model is an algorithm, or a set of algorithms (not a graph). But this is not the final model. This is a model you will train, using the data.
What are you doing while training? You are — or rather, the system is — adjusting numbers known as weights and biases. At the outset, these numbers are randomly selected. They have no meaning and no reason for being the numbers they are. As the data go into the algorithm, the weights and biases are used with the data to produce a result, a prediction. Early predictions are bad. Wine is called beer, and beer is called wine.
The output (the prediction) is compared to the “correct answer” (it is wine, or it is beer). The weights and biases are adjusted by the system. The predictions get better as the training data are run again and again and again. Running all the data through the system once is called an epoch; the weights and biases are not adjusted until after all the data have run through once. Then the adjustment. Then run the data again. Epoch 2: adjust, repeat. Many epochs are required before the predictions become good.
After the predictions are good for the training data, it’s time to evaluate the model using data that were set aside and not used for training. These “test data” (or “evaluation data”) have never run through the system before.
The results from the evaluation using the test data can be used to further fine-tune the system, which is done by the programmers, not by the code. This is called adjusting the hyperparameters and affects the learning process (e.g., how fast it runs; how the weights are initialized). These adjustments have been called “a ‘black art’ that requires expert experience, unwritten rules of thumb, or sometimes brute-force search” (Snoek et al., 2012).
And now, what you have is a trained model. This model is ready to be used on data similar to the data it was trained on. Say it’s a model for machine vision that’s part of a robot assembling cars in a factory — it’s ready to go into all the robots in all the car factories. It will see what it has been trained to see and send its prediction along to another system that turns the screw or welds the door or — whatever.
And it’s still just — code. It can be copied and sent to another computer, uploaded and downloaded, and further modified.
In thinking about how to teach non–computer science students about AI, I’ve been considering what fundamental concepts they need to understand. I was thinking about models and how to explain them. My searches led me to this 8-minute BBC video: What exactly is an algorithm?
I’ve explained algorithms to journalism students in the past — usually I default to the “a set of instructions” definition and leave it at that. What I admire about this upbeat, lively video is not just that it goes well beyond that simple explanation but also that it brings in experts to talk about how various and wide-ranging algorithms are.
The young presenter, Jon Stroud, starts out with no clue what algorithms are. He begins with some web searching and finds Victoria Nash, of the Oxford Internet Institute, who provides the “it’s like a recipe” definition. Then he gets up off his butt and visits the Oxford Internet Institute, where Bernie Hogan, senior research fellow, gives Stroud a tour of the server room and a fuller explanation.
“Algorithms calculate based on a bunch of features, the sort of things that will put something at the top of the list and then something at the bottom of the list.”
—Bernie Hogan, Oxford Internet Institute
He meets up with Isabel Maccabee at Northcoders, a U.K. coding school, and participates in a fun little drone-flying competition with an algorithm.
“The person writing the code could have written an error, and that’s where problems can arise, but the computer doesn’t make mistakes. It just does what it’s supposed to do.”
—Isabel Maccabee, Northcoders
Stroud also visits Allison Gardner, of Women Leading in AI, to talk about deskilling and the threats and benefits of computers in general.
This video provides an enjoyable introduction with plenty of ideas for follow-up discussion. It provides a nice grounding that includes the fact that not everything powerful about computer technology is AI!