Using Super Mario to understand neural networks

I doubt I will ever program a neural network, but I’m trying to understand how they work — and how they are trained — well enough to make assumptions about how the systems work. What I want to be able to do is raise questions when I hear about a new-to-me AI system. I don’t want to take it on faith that a system is safe and likely to function well.

Ultimately I want to help my journalism and communications students understand this too.

Last week I discussed here a video about how neural networks work. Some time before I found that video, I had watched this one a couple of times. It’s from 2015 and it’s only 6 minutes long. It’s been viewed on YouTube more than 9 million times. In fact, it’s pretty close to 10 million views!

Video game designer Seth Bling demonstrates a fully trained neural network that plays Mario expertly. Then he shows us how the system looks at the start, when the Mario character just stands in one place and dies every time. This is the untrained neural network, when it “knows” nothing.

Unlike the example in my earlier post — where the input to the neural network was an image of a handwritten number, and the output was the number (thereby “reading” the image) — here the input is the game state, which changes by the split second. The game state is a simplified digital representation of the Mario character, the surfaces he can run on or jump to, and any obstacles or rewards that are present. The output is which button should be pressed — holding down right continuously makes Mario run toward the right without stopping.

So the output layer in this neural network is the set of all possible actions Mario can take. For a human playing the game, these would be the buttons on the game controller.
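To make the input-to-output idea concrete, here is a minimal Python sketch of a single-layer network that turns a tiny game state into button presses. It is not Bling’s code (his is in Lua). The button labels, the coding of the inputs (1 for a surface, -1 for an obstacle, 0 for empty), and the hand-set weights are all assumptions chosen for illustration.

```python
# Hypothetical button labels; a real controller has its own set.
BUTTONS = ["A", "B", "Up", "Down", "Left", "Right"]

def press_buttons(game_state, weights):
    """Return every button whose weighted sum of inputs is positive."""
    pressed = []
    for button, row in zip(BUTTONS, weights):
        total = sum(w * x for w, x in zip(row, game_state))
        if total > 0:
            pressed.append(button)
    return pressed

# Assumed coding of six squares near Mario: 1 = surface, -1 = obstacle, 0 = empty.
game_state = [0, 0, 1, 1, -1, 0]

# One row of weights per button. These are set by hand here;
# training is what would normally find them.
weights = [
    [0, 0, 0, 0, -1, 0],  # A (jump): fires when square 4 holds an obstacle
    [0, 0, 0, 0, 0, 0],   # B
    [0, 0, 0, 0, 0, 0],   # Up
    [0, 0, 0, 0, 0, 0],   # Down
    [0, 0, 0, 0, 0, 0],   # Left
    [0, 0, 1, 0, 0, 0],   # Right: fires when square 2 is a surface
]

print(press_buttons(game_state, weights))
```

With these made-up numbers, the sketch presses A and Right together: jump over the obstacle while continuing to run.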

In the training, Mario has a “fitness level,” which is a number. When Mario is dying all the time, that number stays around 2. When Mario reaches the end of the level without dying (but without scoring extra points), his fitness is 528. So by “looking at” the fitness level, the neural net assesses success. If the number has increased, then keep doing the same thing.

“The more lines and neurons you have, the more nuanced the decisions can be.”

—Seth Bling

Of course there are more actions than only moving right. Training the neural net to make Mario jump and perform more actions required many generations of neural nets, and only the best-performing ones were selected for the next generation. After 34 generations, the fitness level reached 4,000.
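The idea of keeping only the best performers from each generation can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the method used in the video: the fitness function is a toy (it rewards closeness to an arbitrary target rather than distance traveled in a game), and every name and number below is an assumption chosen for illustration.

```python
import random

def fitness(genome):
    """Toy stand-in for 'how far Mario got': higher is better."""
    target = [0.5, -0.2, 0.8]  # arbitrary values, for illustration only
    return -sum((g - t) ** 2 for g, t in zip(genome, target))

def mutate(genome, rng, scale=0.1):
    """Copy a genome, nudging each number by a small random amount."""
    return [g + rng.gauss(0, scale) for g in genome]

def evolve(generations=34, pop_size=20, keep=5, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(3)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Rank the whole population by fitness, best first.
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        # Only the top performers survive; the rest are replaced
        # by mutated copies of the survivors.
        survivors = population[:keep]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - keep)]
        population = survivors + children
    return best_per_generation

history = evolve()
print(history[0], history[-1])
```

Because the survivors are carried over unchanged, the best fitness can never go down from one generation to the next. It can only stay level or improve.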

One thing I especially like about this video is the simultaneous visual of real Mario running in the real game level, along with a representation of the neural net showing its pathways in green and red. There is no code and no math in this video, and so while watching it, you are only thinking about how the connections come to be made and reinforced.

The method used is called NeuroEvolution of Augmenting Topologies (NEAT), which I’ve read almost nothing about — but apparently it enables the neural net to grow itself, essentially. Which is kind of mind blowing.

Bling shared his code here; it’s written in the Lua language.

Creative Commons License
AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.


What is called ‘AI’ but really isn’t

Because “artificial intelligence” and “AI” have become such potent buzzwords in business — and so many firms are trying to sell some kind of “AI” system or software or strategy to every business possible — we should all take a step back and evaluate whether there is actual AI operating in some of these systems.

That won’t always be easy to discern. If a company claims there is “AI” in its product, they are not going to divulge exactly how it works. If they want to convince you, their literature or their engineers will likely throw out a tangled net of terms that, while accurate, might not help anyone but another engineer understand what’s inside the black box.

I was thinking about this recently as I worked on assignments for an online computer science course in AI. One of the early projects was to program a tic-tac-toe game in which a human can play against “an AI.” Just like most humans, the AI can force a tie in every tic-tac-toe game unless the human makes a mistake, and then the human will lose. I wrote the code that enables the AI to play — that was the assignment. But I didn’t invent the code from nothing. I was taught in the course to use an algorithm called minimax. Further, I was encouraged to make my program faster by using another algorithm called alpha-beta pruning.

Illustration of alpha-beta pruning (Wikipedia, by Jez9999, GNU license)

There is no machine learning involved in those two algorithms. They are simply a time-tested way for a program to carry out a certain kind of look-ahead in a two-player game (not only tic-tac-toe).

Don’t despair or tune out — look at the diagram and understand that the computer, through instructions in my code, is able to rapidly advance through every possible outcome in tic-tac-toe and see how to: (a) prevent a win for the opponent, and (b) win if a win is possible.

There is no magic here.

Tic-tac-toe with “AI” playing X, human playing O.

Another assignment in the same course has the students programming “an AI” that plays Minesweeper. This game is quite different from tic-tac-toe in that there is only one player, and there is hidden knowledge: The player doesn’t know where the mines are. One move at a time, the player builds knowledge about the game board.

Completed Minesweeper game, with AI playing all moves.

A human player doesn’t click on a mine, because she chooses squares that are next to a 0 (indicating no mines touch that square) and marks a mine square when it becomes obvious that a mine is hidden there.

The “AI” builds knowledge in a way that it is programmed to do (that is the assignment). In this case, there is no pre-existing algorithm, but there are principles of logic. I programmed “knowledge” that was stored in the program each time the AI clicked a square and a number was revealed. The knowledge is: (a) that number, and (b) the coordinates of all the surrounding squares. Thus the AI “knows” that, for example, among eight specified squares there are two mines.

If among eight specified squares there are zero mines, my code tells the AI to mark all eight of those squares as safe. My code also tells the AI that if there are any safe moves left to be made, then make a safe move. If not, make a random move. That is the only time when the AI can possibly set off a mine.
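The rules just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code from the assignment: the class name, the method names, and the choice to store each piece of knowledge as a (set of squares, mine count) pair are all hypothetical.

```python
import random

def neighbors(cell, height, width):
    """Coordinates of the (up to eight) squares surrounding a cell."""
    r, c = cell
    return {(i, j)
            for i in range(r - 1, r + 2)
            for j in range(c - 1, c + 2)
            if (i, j) != cell and 0 <= i < height and 0 <= j < width}

class MinesweeperAI:
    def __init__(self, height, width):
        self.height, self.width = height, width
        self.moves_made = set()
        self.safes = set()
        self.knowledge = []  # each entry: (surrounding squares, mine count)

    def add_knowledge(self, cell, count):
        """Record what was revealed when the AI clicked a square."""
        self.moves_made.add(cell)
        self.safes.add(cell)
        cells = neighbors(cell, self.height, self.width)
        self.knowledge.append((cells, count))
        if count == 0:
            # Zero mines among the neighbors: all of them are safe.
            self.safes |= cells

    def next_move(self, rng=random):
        """Make a known-safe move if one exists; otherwise guess."""
        safe_choices = self.safes - self.moves_made
        if safe_choices:
            return min(safe_choices)  # any safe square would do
        unknown = sorted((r, c)
                         for r in range(self.height)
                         for c in range(self.width)
                         if (r, c) not in self.moves_made)
        # A random guess is the only move that can set off a mine.
        return rng.choice(unknown) if unknown else None

ai = MinesweeperAI(3, 3)
ai.add_knowledge((0, 0), 0)   # the corner square revealed a 0
print(ai.next_move())
```

A fuller version would also combine pieces of knowledge to infer where mines must be. This sketch covers only the safe-square rule and the random fallback.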

Once again, there is no magic here.

In contrast to these two simple examples of a computer successfully playing a game, AlphaGo (which I wrote about previously) uses real AI and could not have beaten a human Go master otherwise. Some games can’t be programmed with only simple algorithms or logic. If the program is to win, it needs something akin to intuition.

Programming a computer to develop and use an approximation of human intuition is what we have in today’s machine learning with deep neural networks. It’s still not magic, but it’s a lot more complicated than the kind of strictly mapped-out processes I wrote for playing tic-tac-toe or Minesweeper.



AI programs that play games

One of the very best media items I’ve found is this feature-length documentary about the program that beat an international master at the game of Go in 2016. It’s excellent as a documentary film — well-paced, sparking curiosity, exciting in some parts, and never pedantic.

You don’t need to understand anything about the game (which is immensely popular in China, Japan, and Korea, but not widely played elsewhere). It’s explained visually so that you can appreciate what’s going on. The film is free to watch on YouTube.

As a resource for learning about AI — or, more specifically, about machine learning — the film excels at helping us understand the work of the team of humans that created and trained the AlphaGo program. We don’t see a lot of people sitting at computer keyboards, typing. There are clustered people pointing at a screen, talking enthusiastically, or saying, “What happened there? Why did it do that?”

Probably my favorite moment in the film is after Lee Se-dol, the human Go master, has played a move that is so great, it was later referred to as “the God move.” The AlphaGo team begins analyzing the program’s responses in real-time, watching the graphs of its probability calculations on a large screen in their command center. For all the talk of AI as a black box that makes decisions humans can’t comprehend, this scene demonstrates that AI can be made transparent and accountable.

There’s much, much more to love about this documentary. The director, Greg Kohs, had extraordinary access to the DeepMind team during the months leading up to the five-game match with Lee. In the end, Google financed a general-audience-friendly film. (Google acquired DeepMind in 2014.)

In an interview with CNET, Kohs said the film “had very modest beginnings.”

“A couple members of Google’s creative lab that I’d worked with before gave me a ring and said we’d have access behind the curtain with [DeepMind founder and CEO] Demis Hassabis and his team. So I jumped on board with the expectation we would just film what happens for archival purposes and then put it on a shelf on a hard drive and that would be the end of it.”

Greg Kohs

Another wonderful aspect of the film is its humanity. I’ve seen a fair number of “scare essays” that predict the end of everything as AI gains dominance over its creators — but here we hear a more nuanced and thought-provoking set of views and reactions.

First, there is Lee, possibly the best (human) Go player who has ever lived, in closeup, in the very moment of his realization that the machine has bested him. Then there are the other Go experts, who understand more than you or I what the machine has actually done. Finally, there are the team members of DeepMind, who built the machine. Of course they are happy, ecstatically happy — but they are humbled, and even awed, as well.

At the end of 2019, Lee Se-dol retired as a professional Go player, at age 36. He is the only human who has ever defeated AlphaGo in tournament play.

