Symbolic AI: Good old-fashioned AI

The distinction between symbolic (explicit, rule-based) artificial intelligence and subsymbolic (e.g. neural networks that learn) artificial intelligence was somewhat challenging to convey to non–computer science students. At first I wasn’t sure how much we needed to dwell on it, but as the semester went on and we got deeper into the differences among types of neural networks, it was very useful to keep reminding the students that many of the things neural nets are doing today would simply be impossible with symbolic AI.

The difficulty lies in the shallow math/science background of many communications students. They might have studied logic problems/puzzles, but their memory of how those problems work might be very dim. Most of my students have not learned anything about computer programming, so they don’t come to me with an understanding of how instructions are written in a program.

This post by Ben Dickson at his TechTalks blog offers a very nice summary of symbolic AI, which is sometimes referred to as good old-fashioned AI (or GOFAI, pronounced GO-fie). This is the AI from the early years of AI, and early attempts to explore subsymbolic AI were ridiculed by the stalwart champions of the old school.

The requirements of symbolic AI are that someone — or several someones — needs to be able to specify all the rules necessary to solve the problem. This isn’t always possible, and even when it is, the result might be too verbose to be practical. As many people have said, things that are easy for humans are hard for computers — like recognizing an oddly shaped chair as a chair, or distinguishing a large upholstered chair from a small couch. Things we do almost without thinking are very hard to encode into rules a computer can follow.

“Symbolic artificial intelligence is very convenient for settings where the rules are very clear cut, and you can easily obtain input and transform it into symbols.”

—Ben Dickson

Subsymbolic AI does not use symbols, or rules that need symbols. It stems from attempts to write software operations that mimic the human brain. Not copy the way the brain works — we still don’t know enough about how the brain works to do that. Mimic is the word usually used because a subsymbolic AI system is going to take in data and form connections on its own, and that’s what our brains do as we live and grow and have experiences.

Dickson uses an image-recognition example: How would you program specific rules to tell a symbolic system to recognize a cat in a photo? You can’t write rules like “Has four legs,” or “Has pointy ears,” because it’s a photo. Your rules would need to be about pixels and edges and clusters of contrasting shades. Your rules would also need to account for infinite variations in photos of cats.

“You can’t define rules for the messy data that exists in the real world.”

—Ben Dickson

Thus “messy” problems such as image recognition are ideally handled by neural networks — subsymbolic AI.

Problems that can be drawn as a flow chart, with every variable accounted for, are well suited to symbolic AI. But scale is always an issue. Dickson mentions expert systems, a classic application of symbolic AI, and notes that “they require a huge amount of effort by domain experts and software engineers and only work in very narrow use cases.” On top of that, the knowledge base is likely to require continual updating.

An early, much-praised expert system (called MYCIN) was designed to help doctors determine treatment for patients with blood diseases. In spite of years of investment, it remained a research project — an experimental system. It was not sold to hospitals or clinics. It was not used in day-to-day practice by any doctors diagnosing patients in a clinical setting.

“I have never done a calculation of the number of man-years of labor that went into the project, so I can’t tell you for sure how much time was involved … it is such a major chore to build up a real-world expert system.”

—Edward H. Shortliffe, principal developer of the MYCIN expert system (source)

Even though expert systems are impractical for the most part, there are other useful applications for symbolic AI. Dickson mentions “efforts to combine neural networks and symbolic AI” near the end of his post. He points out that symbolic systems are not “opaque” the way neural nets are — you can backtrack through a decision or prediction and see how it was made.


Creative Commons License
AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.


How to educate the public about AI

Two new items related to educating the general public about artificial intelligence:

The A–Z guide comes from the Oxford Internet Institute and Google. It’s slick, pretty, and animated. It consists of exactly 26 short items, one for each letter of the alphabet: artificial intelligence, bias, climate, datasets, ethics, fakes, etc. The aim is to provide answers in a not-overwhelming way.

I love the idea, but I’m not in love with the execution. For example, the neural networks piece tells us that neural nets “attempt to mimic the structure of the brain,” but they “cannot ‘think’ like humans.” That’s great — clear and accurate. We could quibble about “attempt to mimic the structure,” but we can also let that slide. But then:

“AI design teams can assign each piece of a network to recognizing one of many characteristics. The sections of the network then work as one to build an understanding of the relationships and correlations between those elements — working out how they typically fit together and influence each other.”

To me, that seems misleading. It sounds as if the layers of the neural net are directed by specifically programmed instructions, but all my reading has indicated that the layers determine on their own which features they are detecting. (I’m thinking specifically about image recognition and supervised learning here.) This is important because it contributes to the “black box” problem of machine learning systems.

I also dislike phrases such as “build an understanding,” because that implies more intentionality than these networks actually have.

Giving people short, understandable explanations of specific aspects of AI is a wonderful idea, but the explanations need to be both straightforward and true.

The second education item I linked above comes from MIT’s news office. It describes a “new cross-disciplinary research initiative … to promote the understanding and use of AI across all segments of society.”

“People need to be AI-literate to understand the responsible use of AI and create things with it at individual, community, and societal levels.”

—Cynthia Breazeal, MIT professor, director of Responsible AI for Social Empowerment and Education (RAISE)

This sentiment is becoming more widely voiced as claims for the benefits of AI increase in the media. The idea behind RAISE is good and admirable — yes, people in all walks of life should have some understanding of AI, at least as much as they have an understanding of what makes airplanes fly and what makes computers able to store and retrieve our vacation photos.

Oh, wait.

In the United States, the average person’s understanding of any process involving physics or electronics might not be very good. Many students with stellar high-school grades don’t have a solid grasp of how their laptops or phones work at a basic level. I’m not talking about the students who attend MIT, but I am talking about those who can manage high SAT scores and gain admission to top public universities.

The RAISE initiative has identified four strategic areas for research, education, and outreach:

  • Diversity and inclusion in AI
  • AI literacy in pre-K–12 education
  • AI workforce training
  • AI-supported learning

But let’s go back to the A–Z guide and look at the segment about binary code, Zeros & Ones. It tells us that 0’s and 1’s are “the foundational language of computers.” It tells us that a particular long sequence of 0’s and 1’s means “Hello” to a computer. In one sense, that is true — but it really explains nothing to a layman. A computer system doesn’t know what “Hello” is (or means) any more than a rock does.

To accomplish AI literacy, we need to accomplish computer literacy. We need to teach and explain — clearly and accurately — to students at all levels what computers can and cannot do, how they are programmed, and how AI is different from, say, writing a game program that plays tic-tac-toe as well as any human can. I can write and run a winning tic-tac-toe program on an average laptop if I know which algorithms to use in my code — but there’s nothing remotely like intelligence in that program.

We need to add caveats every time we say something like “the computer learns,” or “the system understands.”

It will be fantastic if RAISE (and other outreach programs) can raise the level of computer literacy among Americans. It’s an important goal in this era of AI hype and euphoric claims, because it will be so much easier for people to be duped, exploited, mistreated, sidelined, marginalized, and/or denied jobs, loans, mortgages, healthcare, or admission to universities if they don’t understand what AI is and how it works.


Creative Commons License
AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.


AI building blocks: What are models?

Descriptions of machine learning are often centered on training a model. Not having a background in math or statistics, I was puzzled by this the first time I encountered it. What is the model?

This 10-minute video first describes how you select labeled data for training. You examine the features in the data, so you know what’s available to you (such as color and alcohol content of beers and wines). Then the next step is choosing the model that you will train.

In the video, Yufeng Guo chooses a small linear model without much explanation as to why. For those of us with an impoverished math background, this choice is completely mysterious. (Guo does point out that some models are better suited for image data, while others might be better suited for text data, and so on.) But wait, there’s help. You can read various short or long explanations about the kinds of models available.

It’s important for the outsider to grasp that this is all code. The model is an algorithm, or a set of algorithms (not a graph). But this is not the final model. This is a model you will train, using the data.

What are you doing while training? You are — or rather, the system is — adjusting numbers known as weights and biases. At the outset, these numbers are randomly selected. They have no meaning and no reason for being the numbers they are. As the data go into the algorithm, the weights and biases are used with the data to produce a result, a prediction. Early predictions are bad. Wine is called beer, and beer is called wine.

The output (the prediction) is compared to the “correct answer” (it is wine, or it is beer). The weights and biases are adjusted by the system. The predictions get better as the training data are run again and again and again. Running all the data through the system once is called an epoch; the weights and biases are not adjusted until after all the data have run through once. Then the adjustment. Then run the data again. Epoch 2: adjust, repeat. Many epochs are required before the predictions become good.

After the predictions are good for the training data, it’s time to evaluate the model using data that were set aside and not used for training. These “test data” (or “evaluation data”) have never run through the system before.

The results from the evaluation using the test data can be used to further fine-tune the system, which is done by the programmers, not by the code. This is called adjusting the hyperparameters and affects the learning process (e.g., how fast it runs; how the weights are initialized). These adjustments have been called “a ‘black art’ that requires expert experience, unwritten rules of thumb, or sometimes brute-force search” (Snoek et al., 2012).

And now, what you have is a trained model. This model is ready to be used on data similar to the data it was trained on. Say it’s a model for machine vision that’s part of a robot assembling cars in a factory — it’s ready to go into all the robots in all the car factories. It will see what it has been trained to see and send its prediction along to another system that turns the screw or welds the door or — whatever.

And it’s still just — code. It can be copied and sent to another computer, uploaded and downloaded, and further modified.

Creative Commons License
AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.


AI building blocks: What are algorithms?

In thinking about how to teach non–computer science students about AI, I’ve been considering what fundamental concepts they need to understand. I was thinking about models and how to explain them. My searches led me to this 8-minute BBC video: What exactly is an algorithm?

I’ve explained algorithms to journalism students in the past — usually I default to the “a set of instructions” definition and leave it at that. What I admire about this upbeat, lively video is not just that it goes well beyond that simple explanation but also that it brings in experts to talk about how various and wide-ranging algorithms are.

The young presenter, Jon Stroud, starts out with no clue what algorithms are. He begins with some web searching and finds Victoria Nash, of the Oxford Internet Institute, who provides the “it’s like a recipe” definition. Then he gets up off his butt and visits the Oxford Internet Institute, where Bernie Hogan, senior research fellow, gives Stroud a tour of the server room and a fuller explanation.

“Algorithms calculate based on a bunch of features, the sort of things that will put something at the top of the list and then something at the bottom of the list.”

—Bernie Hogan, Oxford Internet Institute

He meets up with Isabel Maccabee at Northcoders, a U.K. coding school, and participates in a fun little drone-flying competition with an algorithm.

“The person writing the code could have written an error, and that’s where problems can arise, but the computer doesn’t make mistakes. It just does what it’s supposed to do.”

—Isabel Maccabee, Northcoders

Stroud also visits Allison Gardner, of Women Leading in AI, to talk about deskilling and the threats and benefits of computers in general.

This video provides an enjoyable introduction with plenty of ideas for follow-up discussion. It provides a nice grounding that includes the fact that not everything powerful about computer technology is AI!

Creative Commons License
AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.


How to start learning about algorithms

After writing yesterday’s post, I was thinking about how much students should know about algorithms if they are to have a basic understanding of how AI works. Is it enough to tell them an algorithm is a set of instructions?

So I turned, as I often do, to Khan Academy — a free online learning site that often helps me through my lack of a mathematics background. I found a set of three short lessons, starting with a video.

Screenshot from Khan Academy video

In the introductory video, “What is an algorithm and why should you care?”, we see various practical uses of algorithms, followed by the statement above, and a brief description of how route finding works — what Google Maps does when it gives you directions. Route finding is often used as an example of accepting a “good enough” output for the sake of speed (that is, efficiency).

Watching the animation, we comprehend that the computer is following a set of instructions to determine a good route for a delivery truck with 25 stops to make. We see the process of the algorithm at work, rather than seeing formulas and equations.

I love that the video also shows us, with animation, how the efficiency of an algorithm is calculated.

The second lesson, “A guessing game,” demonstrates binary search (an algorithm) by allowing you to discover it interactively. Wonderful!

The third lesson, “Route-finding,” is much more reading intensive. It explains the algorithm in terms of solving a maze. Without knowing the exact path to solve the maze, the algorithm can “know” which choice for its next step takes it closer to the goal (the center of the maze). I don’t consider this lesson very helpful, but that’s because I saw a much better explanation of maze-solving algorithms here:

Start video at 54:35 for demo of the greedy best-first search algorithm

I am continually amazed and humbled by the variety of ways in which people teach these concepts. More important, I realize how some ways of explaining a concept are not at all effective — for me, at least — and another way of explaining makes it clear as crystal.

So, how much should students know about algorithms, if they are to have a general understanding of AI? I think a good start would be to watch and discuss the introductory Khan Academy video, and also to see a further visual (probably animated) representation of another kind of algorithm at work.

Creative Commons License
AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.


What is a neural network and how does it work?

The most wonderful thing about YouTube is you can use it to learn just about anything.

One of the 10,000 annoying things about YouTube is finding a good, satisfying version of the lesson you want to learn can take hours of searching. This is especially true of videos about technical aspects of machine learning. Of course there are one- and two-hour recordings of course lectures by computer science professors. But I’ve been seeking out shorter videos with more animations and illustrations of concepts.

Understanding what a neural network is and how it processes data is necessary to demystifying machine learning. Data goes in, results come out — but in between is a “black box” consisting of code and hardware. It sort of works like a human brain, and yet, it really doesn’t.

So here at last is a painless, math-free video that walks us through a neural network. The particular example shown uses the MNIST dataset, which consists of 70,000 images of handwritten digits, 0–9. So the task being performed is the recognition of those digits. (This kind of system can be used to sort mail using postal codes, for example.)

What you’ll see is how the first layer (a vertical line of circles on the left side) represents the input. If each of the MNIST images is 28 pixels wide by 28 pixels high, then that first layer has to represent 784 pixels and each of their color values — which is a number. (One image is the input — only one at a time.)

The final vertical layer, all the way to right side, is the output of the neural network. In this example, the output tells us which digit was in the input — 0, 1, 2, etc. To see the value in this, go back to the mail-sorting idea. If a system can read postal codes, it recognizes several numbers and then transmits them to another system that “knows” which postal code goes to which geographical location. My letter gets sorted into the Florida bin and yours into the bin for your home.

In between the input and the output are the vertical “hidden” layers, and that’s where the real work gets done. In the video you’ll see that the number of circles — often called neurons, but they can also be called just units — in a hidden layer might well be less than the number of units in the input layer. The number of units in the output layer can also differ from the numbers in other layers.

When the video describes edge detection, you might recall an earlier post here.

Beautifully, during an animation, our teacher Grant Sanderson explains and shows that the weights exist not in or on the units (the “neurons”) but in fact in or on the connections between the units.

Okay, I lied a little. There is some math shown here. The weight assigned to the connection is multiplied by the value of the unit to the left. The results are all summed, for all left-side units, and that sum is assigned to the unit to the right (meaning the right side of that one connection).

The video bogs down just a bit between the Sigmoid squishification function and applying the bias, but all you really need to grasp is that the value of the right-side unit shows whether or not that little region of the image (in this case, it’s an image) has a significant difference. The math is there to determine if the color, the amount of color, is significant enough to count. And how much it should count.

I know — math, right?

But seriously, watch the video. It’s excellent.

“And that’s a lot to think about! With this hidden layer of 16 neurons, that’s a total of 784 times 16 weights, along with 16 biases. And all of that is just the connections from the first layer to the second.”

—Grant Sanderson, But what is a neural network? (video)

Sanderson doesn’t burden us with the details of the additional layers. Once you’ve seen the animations for that first step — from the input layer through the connections to the first hidden layer — you’ll have a real appreciation for what’s happening under the hood in a neural network.

In the final 6 minutes of this 19-minute video, you’ll also learn how the “learning” takes place in machine learning when a neural net is involved. All those weights and bias values? They are not determined by humans.

“Digging into what the weights and biases are doing is a good way to challenge your assumptions and really expose the full space of possible solutions.”

—Grant Sanderson, But what is a neural network? (video)

I confess it does get rather mathy at the end, but hang on through the parts that are beyond your personal math background and listen to what Sanderson is telling us. You can get a lot out of it even if the equation itself is like hieroglyphics to you.

The video content ends at 16:26, followed by the usual “subscribe to my channel” message. More info about Sanderson and his excellent videos is on his website, 3Blue1Brown.

Creative Commons License
AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.