For Friday AI Fun, let’s look at an oldie but goodie: Google’s Quick, Draw!
You are given a word, such as whale, or bandage, and then you need to draw that in 20 seconds or less.
Thanks to this game, Google has labeled data for 50 million drawings made by humans. The drawings “taught” the system what people draw to represent those words. Now the system uses that “knowledge” to tell you what you are drawing — really fast! Often it identifies your subject before you finish.
It is possible to stump the system, even though you’re trying to draw what it asked for. My drawing of a sleeping bag is apparently an outlier. My drawings of the Mona Lisa and a rhinoceros were good enough — although I doubt any human would have named them as such!
Google’s AI thought my sleeping bag might be a shoe, or a steak, or a wine bottle.
The system has “learned” to identify only 345 specific things. These are called its categories.
You can look at the data the system has stored — for example, here are a lot of drawings of beard.
If a computer can correctly identify an object (an apple, a tricycle) or an animal such as a zebra, can it produce a drawing of that object or animal? This is something most people can do, even if their drawing skills are minimal. After all, almost anyone can play Pictionary.
This 8-minute video shows us what happened when a programmer-artist reversed the process of an AI that recognizes objects and animals in digital images. I really admire the deft storytelling here.
Object recognition has improved amazingly in the past 10 years, but that does not mean these AI systems see the same way as a human does. In some cases, that might not matter at all. In other cases, it can mean the difference between life and death.
In yesterday’s post I mentioned the way a convolutional neural network (part of a machine learning system) processes an image through many stacked layers of detection units (sometimes called neurons), identifying edges and shapes that eventually lead to a conclusion that the image is likely to contain such-and-such an object, animal, or person. Today’s video shows a bit more about the training process that an AI goes through before it can perform these identifications.
Training is necessary in the type of machine learning called supervised learning. The training data (in this case, digital images of objects and animals) must be labeled in advance. That is, the system receives thousands of images labeled “tiger” before it is able to recognize a tiger in a random photo or video. If a system can identify 20 different animals, that system was trained on thousands of images of each animal.
If the system was never trained on tigers, it cannot recognize a tiger.
So today’s video gives us a nice glimpse into how and why that training works, and what its limitations are. What’s really fascinating to me, though, are the images produced by programmer-artist Tom White‘s system.
“I have created a drawing system that allows neural networks to produce abstract ink prints that reveal their visual concepts. Surprisingly, these prints are recognized not only by the neural networks that created them, but also universally across most AI systems which have been trained to recognize the same objects.”
In the video, you’ll see that humans cannot recognize what the AI drew. The rendering is too abstract, too unlike what we see and what we would draw ourselves. Note what White says, though, about other AI systems: they can recognize the object in these AI-produced drawings.
This is, I think, related to what is called adversarial AI, which I’ll discuss in a future post.