I was catching up today on a couple of new-ish developments in reinforcement learning/game-playing AI models.
Meta (which, we always need to note, is the parent company of Facebook) apparently has an entire team of researchers devoted to training an AI system to play Diplomacy, a war-strategy board game. Unlike in chess or Go, a player in Diplomacy must collaborate with others to succeed. Meta’s program, named Cicero, has passed the bar, as explained in a Gizmodo article from November 2022.
“Players are constantly interacting with each other and each round begins with a series of pre-round negotiations. Crucially, Diplomacy players may attempt to deceive others and may also think the AI is lying. Researchers said Diplomacy is particularly challenging because it requires building trust with others, ‘in an environment that encourages players to not trust anyone,’” according to the article.
We can see the implications for collaborations between humans and AI outside of playing games — but I’m not in love with the idea that the researchers are helping Cicero learn how to gain trust while intentionally working to deceive humans. Of course, Cicero incorporates a large language model (R2C2, further trained on the WebDiplomacy dataset) for NLP tasks; see figures 2 and 3 in the Science article linked below. “Each message in the dialogue training dataset was annotated” to indicate its intent; the dataset contained “12,901,662 messages exchanged between players.”
Cicero was not identified as an AI construct while playing in online games with unsuspecting humans. It “apparently ‘passed as a human player,’ in 40 games of Diplomacy with 82 unique players.” It “ranked in the top 10% of players who played more than one game.”
Meanwhile, DeepMind was busy conquering another strategy board game, Stratego, with a new AI model named DeepNash. Unlike Diplomacy, Stratego is a two-player game, and unlike chess and Go, the value of each of your opponent’s pieces is unknown to you — you see where each piece is, but its identifying symbol faces away from you, like cards held close to the vest. DeepNash was trained on self-play (5.5 billion games) and does not search the game tree. Playing against humans online, it ascended to the rank of third among all Stratego players on the platform — after 50 matches.
Apparently the key to winning at Stratego is finding a Nash equilibrium, which I read about at Investopedia, which says: “There is not a specific formula to calculate Nash equilibrium. It can be determined by modeling out different scenarios within a given game to determine the payoff of each strategy and which would be the optimal strategy to choose.”
I was reading an article in Scientific American, found by one of my students, and I came across this passage:
“A Tesla on autopilot recently drove directly toward a human worker carrying a stop sign in the middle of the road, slowing down only when the human driver intervened. The system could recognize humans on their own (which is how they appeared in the training data) and stop signs in their usual locations (as they appeared in the training images) but failed to slow down when confronted by the unfamiliar combination of the two, which put the stop sign in a new and unusual position.”
—Artificial General Intelligence Is Not as Imminent as You Might Think, July 1, 2022 (boldface mine)
The article helpfully linked to a YouTube video, in which we see and hear the situation. The driver is narrating as the car makes its decisions: “All right, we’re having to take over. It’s not slowing early enough. [Pause.] Yep, the car keeps trying to go each time … That’s really unfortunate. It sees the person, it sees the stop sign, but it’s almost not taking it seriously.”
This is not a big surprise if you understand the nature of training data and the long tail — a person walking across a street is common enough, and a stop sign is very common, but a person holding a stop sign and standing (not walking) in the middle of the lane occurs much less frequently than the other two. It’s not rare, but it’s not something we encounter every day while driving.
Here’s the thing: Later in the article, the author says: “You can’t deal with a person carrying a stop sign if you don’t really understand what a stop sign even is.” And at first, I’m like: Cool, cool. That’s good, that’s a nice observation.
And then I thought: Wait a minute. Wait just a minute. Of course an AI system understands nothing, nothing at all. It has been trained to recognize a stop sign. It has been trained to recognize a human (especially a human in the road). But what is really happening in the video? The car is stopping, briefly, and then starting up again. It does this more than once. The driver has to intervene, put a foot on the brake, to stop the car from going forward and hitting the person. The car is behaving the way it was programmed to behave at a stop sign — and not the way it was programmed to behave if a human is walking in front of the car.
The central point here is not that the car’s system doesn’t know what a stop sign is (which, it’s true, it doesn’t). The central point is that given a human holding a stop sign, the system behavior governing a regular, side-of-the-road stop sign has dominated, has come to the fore as the default behavior — and the system behavior that prevents a human from being run over is not in play.
This is no trolley problem. The car’s AI did not decide to kill the human. It did not weigh the options. In an unlikely case (an edge case), it defaulted to the common, everyday case: There is a stop sign. This is what I do when there is a stop sign.
I’m blogging this because this is great discussion material for students and others!
I’m not sure why I put aside the book The Alignment Problem: Machine Learning and Human Values, by Brian Christian (2020), with only two chapters left unread. Probably my work obligations piled up, and it languished for a few months on the “not finished” book stack. It certainly was not due to any fault in the book itself — it’s an excellent study of aspects of AI that are not commonly discussed in the general press. These issues are not obscure or unimportant (quite the opposite), and Christian’s style of storytelling is well suited to explaining them clearly. With my summer waning away, I finally went back and finished reading.
I read and enjoyed his earlier book Algorithms to Live By (co-authored with Tom Griffiths). That book also incorporated stories and anecdotes gleaned through one-on-one interviews. I’m impressed by the immense amount of time and effort that must have gone into this book — apart from all the reading and research that any proper nonfiction book requires, here the author also needed to attend numerous computer science and AI conferences, as well as schedule and complete interviews with dozens of researchers and other experts.
The result is a fascinating exploration of various facets of “the alignment problem,” which is the challenge of ensuring that AI systems are doing what we want them to do, doing what we think they are doing (which isn’t always easy to know), and doing things for the right reasons (that is, mirroring human values rather than, say, turning into HAL from 2001: A Space Odyssey).
The book has three sections, titled Prophecy, Agency, and Normativity, and each section has three chapters. (I don’t like “prophecy” as a stand-in for “prediction” or “probability,” but that’s just me.) The Prophecy section was the most redundant for me and yet still interesting to read.
1. Representation. We begin with Frank Rosenblatt and the perceptron, and how the promise was effectively sabotaged by Minsky and Papert (1969). Straight from there to AlexNet and how it crushed the ImageNet Challenge in 2012. Next, Google Photos labeling a photo of Black people as “gorillas” (2015–18). An excellent history of how the technology of photography misrepresents Black people. Bias, training data, and the research of Joy Buolamwini. From images we move on to language models/word embeddings, still exploring bias. New to me was the significance of reaction time in word-association tests on human subjects — and how “the distance between embeddings in word2vec … uncannily mirrors the human reaction-time data” (p. 45). By the end of this chapter we understand how biases become part of models derived from machine learning, and thus how some representations are grossly inaccurate.
2. Fairness. This chapter focuses largely on the COMPAS product (used in bail, parole, and sentencing decisions in the U.S. justice system), but it begins with the classification of parolees in Illinois in 1927. We see how statistical models were used to make decisions about people in the prison system long before any application of machine learning was possible. What emerges from the discussion of the 2016 ProPublica investigation of COMPAS is that when the base rates for two groups are different (here, the base rate of recidivism for offenders who are white and those who are Black), the risk estimates will skew according to that difference. That means the group with higher recidivism, historically, will be predicted to have a higher risk of recidivism now and in the future. Not fair, right? But if you calibrate the system to account for that, you’re bound to make it unfair in other ways. The COMPAS algorithm is “fair” in that it treats everyone the same according to their demographic’s base rate.
One segment in the chapter looks at data privacy and how removing attributes like race, age, and gender doesn’t actually protect the individual — because those and other attributes can still be derived from other variables (in our online behavior, for example). The term for this: redundant encodings. Summary quote: “Fairness through blindness doesn’t work” (p. 65). Christian notes how fairness, accountability, and transparency went from conference rejections in 2013 to a central research area in machine learning/AI by 2016. The result of all this is closer scrutiny of predictive models that affect people’s lives — scrutiny of what’s really being predicted, and also exactly what data the predictions are based on. In the case of COMPAS, the base rates are really for who gets re-arrested (who gets caught) rather than literally who commits new crimes.
3. Transparency. Beginning with an example from the practice of medicine, this chapter deals with the ability to see why an AI system is doing what it does. First Christian describes a rule-based decision system (good old-fashioned AI; that is, no machine learning involved): if the patient has these symptoms/prior conditions, do x. Then he describes a neural net that was trained on hospital data to recommend which pneumonia patients to admit for care. A researcher noted that the neural net had apparently learned a rule that said people with asthma should be sent home, not admitted. This illustrates how unexpected the pitfalls of training can be — in the past data, people with asthma had a high recovery rate from pneumonia, but that is precisely because they were admitted and received care, not because they were naturally more likely to recover.
The European Union’s GDPR law is discussed. It says EU citizens have the right to know why an algorithmic decision was made — if they were denied a bank loan, for example. This puts a burden on corporations and technology firms that they can’t always bear, because many machine learning systems don’t include any option to examine the components of a recommendation or prediction. It raises the question: If we can’t find out why the system made that recommendation, should we be using that system?
There’s a neat segment comparing human decision-making with decisions from machine algorithms: when using the same data, such as school test scores and class rank, the humans rely heavily on the data but are inconsistent in their recommendations. Research has shown again and again that when humans and machines base decisions on the same data (“codable input variables”), the human decisions are never superior to the machine’s, even in medicine (p. 93). A conclusion is that human experts know which features to look for (to make an assessment) but not how to “do the math.” (We’re too dependent on heuristics.) I was interested to learn that in at least one case, a model showed that data on patients’ medical histories yielded better predictions than data about their current symptoms (p. 101) — this was in a segment about selecting only relevant features and building a simple model, instead of a complex model using all of the possible data. Easier to have transparency in a simple model. However, not all problems allow for simple models.
Having the system generate more outputs is one way to increase transparency. For the pneumonia admission example, the predictions might include the likely cost of treatment and length of hospital stay, not only the likelihood of survival. Another technique for greater transparency is “deconvolution,” which allows researchers to view a visualization of what complex convolutional neural nets for image recognition are “paying attention to” in each of the hidden layers of the net. This can enable researchers to strip out certain layers that appear not to be adding much to the process. At the end of this chapter, Christian explores the idea of interpretability and using a separate computer system to extract the concepts (in a sense) that another system is relying on in making decisions (p. 115; see also this paper). The example given is stripes on a zebra: how important are the stripes to the system’s prediction that an image shows a zebra? On which layer does it account for the stripe pattern?
Now we move on to the the second section of the book, Agency.
4. Reinforcement. Beginning in work with animals and moving on to young humans (Skinnerism), reinforcement learning has a longer history than AI. I liked that Arthur Samuel’s checkers-playing program from the 1950s appears early in this chapter. Cybernetics, feedback, and entropy make an early appearance too. Soon we come to a U.S. Air Force–funded project and nearly 50 years of work by Andrew Barto and Richard Sutton. Mazes, games, scores, points, and the “reward hypothesis.” Christian acknowledges right away that not all decisions in real life have rewards. The connectedness of our choices, the way they change the state of play, the fact that it’s often impossible to know if the best choice was made at any juncture, but many non-optimal choices might still lead to the desired goal in the end. (So much messier than supervised learning with labeled data!) Two parts of the problem: the policy (what to do, when to do it) and the value function (rewards or punishment). Choosing an action means estimating the chances that it will lead to desired outcomes. Intermediate rewards are necessary — the system can’t have only one final payoff, such as winning the game at the end, or it will never learn to make good choices along the way (“learning a guess from a guess,” p. 140). Q-learning derived from Sutton and Barto’s work; it was first demonstrated in a backgammon program in the early 1990s that was “entirely ‘self-taught'” (p. 141) with self-play. Sutton and Barto called it temporal-difference (TD) learning — their algorithm would adjust the value function for future actions based on the result of each new action.
A segment on dopamine (in brains): fewer than 1 percent of our neurons can produce dopamine, but those neurons are connected to millions of others. Release of dopamine is pleasure! There was a mystery in early research: monkeys trained with a light or a bell to expect food would eventually experience a dopamine release at the cue and none at receiving the food itself. TD theory eventually unlocked the mystery: dopamine comes not from the reward itself but from the expectation of the reward. Christian describes this as a fluctuation in the value function: “suddenly the world seemed more promising than it had a moment ago” (p. 143; italics in original). The temporal-difference error arises when, for example, there is no food for the monkey — there is no reward (or a much smaller reward) where one was expected.
Christian goes on to say, “The effect on neuroscience has been transformative” (p.145) — TD theory is now applied in some studies of brain function. After a bit more about neuroscience and measuring (human) happiness, he closes the chapter with the question of how to structure rewards to get the results we want from an algorithmic system (dopamine not included). Kind of funny to think about that in the context of agency — the agent in (machine) reinforcement learning (the program) has no agency where the rewards are concerned.
5. Shaping. This continues the exploration of reinforcement learning. Shaping is a technique for getting the desired behavior using rewards, but specifically by rewarding approximations of the behavior. It originated with B. F. Skinner in a 1940s project involving pigeons. The animal (or machine learning system) is guided toward more and more accuracy via rewards for actions that get closer and closer to the exact behavior. It starts with trial and error, or flailing around and trying everything. The difficulty is when no reward comes, or rewards come too rarely (sparsity) — for example, only one button on a wall of 1,000 identical buttons is the right one to push. So first you give a reward for just pushing any button. Later you give rewards only for pushing buttons near the one correct button. Finally the only reward given is when that one special button is pushed. Thus the learner learns to push only that button, every time.
Christian gives the example of how video games subtly teach us how to play them, which I love, because I am continually impressed at how good some games are at training us through play, without instructions. There’s also the principle of training first with easier versions of the task — learn to catch a big, lightweight ball before trying to catch a baseball for the first time; learn to hit a slow pitch before you try a fast pitch. Christian calls this curriculum. He also refers to animal-training techniques developed by Marian Breland Bailey and her first husband, Keller Breland (their story is told in an open-access article from 2005). Determining the intermediate steps (what are the best early tasks?) is not trivial. Video games pose challenges on each level that are achievable but also, often, at the outer limit of what we’re able to do at that point in the game.
“What makes games so hypercompelling is how well shaped they are. The levels are a perfect curriculum.”
—Brian Christian (p. 175)
Apart from curriculum, you might only use the full task or problem, but build in lots of rewards at the start, like the button-pushing example above. This is the incentives technique. Gradually the incentives are changed to be more centered on the actual goal. Poor outcomes can result when the subject finds ways to get the reward without progressing toward the ultimate goal. “Rewarding A while hoping for B” can backfire (p. 164). One principle is to give the reward for the state of the game, or environment, rather than for the action performed. So pushing the same button repeatedly gets no additional rewards, or kicking a ball such that it lands farther from the goal is punished with a point deduction.
At the end of this chapter, Christian discusses evolution (where the “reward” is survival of the species), the “optimal reward problem” (the reward desired by the designer might not be the same as the reward assigned to the agent), and incentivization in real life, or gamification.
6. Curiosity. Origin story of the Arcade Learning Environment, which encoded hundreds of old Atari games into a single package that any researcher could use for training an AI system: This was a milestone because previously researchers had created their own games for training purposes, and there was no consistency. ALE, like the ImageNet dataset, allowed for comparisons among systems that had used the same dataset to learn. DeepMind put a convolutional neural net and Q-learning to the task, with excellent results on many of the Atari games (notably Breakout). A key accomplishment from using a ConvNet (or CNN) was that the neural net determined which features were important in each game (article, 2015). The game on which the DeepMind system was least successful, Montezuma’s Revenge, was the type where the player has to explore a large environment and solve puzzles to enter new rooms. The rewards are sparse, and the player dies often.
To solve a game like Montezuma’s Revenge, a player needs to be curious and intrinsically motivated. You aren’t just shooting things and racking up points. Being motivated by curiosity — a desire to find out what comes next, or how something works — is much harder to simulate in a machine than the desire to get a more tangible reward. What sparks curiosity? New situations (novelty) and surprise (the unexpected), among other things.
A system that performed much better on Montezuma’s Revenge was one with an added “density model” that contained all previously encountered views of the game environment. The model yields a prediction of how unfamiliar — or novel — the current view is (compared with all the past views); the agent is rewarded for finding novel views, thus incentivizing getting out of the same-old, same-old and into a new room or level in the game (p. 192).
Surprise can mean you encountered an unexpected result, not just a new location. It’s tricky with reinforcement learning because you want the agent to learn that a particular action is “good” (and gets a reward) so that the action will be repeated. But you also want the agent to discover new actions, or a new context for an action — so you also build in rewards for these discoveries. Using this rationale, a team from OpenAI developed the random network distillation (RND) bonus, which resulted in a system that actually completed Montezuma’s Revenge (paper, 2018).
It turns out that giving points for intrinsic rewards can yield better results (at least in video games) than points for the usual (extrinsic) stuff, like shooting bad guys and collecting gold nuggets.
In a segment about boredom and addiction, we learn that intrinsically motivated agents sometimes just give up when they are stuck, like humans. We also find out that novelty and surprise elicit dopamine release — of course, since they seem to promise something interesting coming up.
Normativity: Learning the norms
The final section, Normativity, held the most new material for me.
7. Imitation. Humans are great imitators, almost from birth. If we learn by imitating, why shouldn’t machines? The first hurdle to consider is over-imitation, which is including unnecessary actions or steps that were in the exemplar; human children recognize these as unnecessary but might attribute intentionality to them. Advantages of learning by imitation include efficiency, possible greater safety, and learning things that are hard to describe in words — showing instead of telling. (If you can’t describe all the steps, how could you program them in code?) Examples of early self-driving vehicles are discussed. A big challenge is learning how to recover from mistakes if the exemplar never made any mistakes. Another is new situations that were never demonstrated. A third is “cascading errors,” which can arise from the previous two challenges. A solution is to put the exemplar, or teacher, back in the loop. Step in, take the wheel when things start to go wrong, and the machine system learns what to do in those situations too.
There’s a lot to be considered in what is demonstrated, what is shown or enacted that we want the machine to imitate — the core of the alignment problem. A human performing a task such as driving a car might be considering possible outcomes of current actions, and act accordingly, but those considerations are opaque to any observer, including an AI system learning by imitation. Developers can choose to emphasize, or encode, either the expected reward(s) from an action or a value based on all possible rewards, whether rare or common. (Do we want the machine to assume people usually do not step off the curb into the path of a moving car?)
Then we come to self-play, which was part of Arthur Samuel’s checkers-playing program and later a key to the success of AlphaGo Zero. If the system (encoded with the rules of the game, forbidden moves, etc.) plays itself, not only can it improve more rapidly than by playing human opponents; it can also exceed the skill levels of its own programmers. Limited to imitation, the system might never progress beyond what it has been shown. Christian describes the functions of AlphaGo Zero’s “policy network” during self-play, adding that this machine learning process is called amplification.
Imitation and learning values/policy in the wild seem like the way to go when the task is too complex to explain in detail, to code out completely. Life is not a board game, however. The rules themselves are too complex, based in morality and ethics — human values.
“In the moral domain … it is less clear how to extend imitation, because no such external metric exists.”
—Brian Christian (p. 247)
8. Inference. Humans (even very young humans) can figure out that someone needs help. We can infer others’ goals. We can work together, collaborate, without having every step spelled out for us. Christian says researchers are looking at inference as a way to instill human values in machine systems, using inverse reinforcement learning (IRL). The system needs to infer the reward, from observing the demonstrated behavior, instead of learning the behavior because of getting a reward. Christian calls it “one of the seminal and critical projects in twenty-first century AI” (p. 255).
The system doesn’t need to name the reward (the goal), but it’s got to learn how to reach that goal without ever being told what the goal is, without receiving a designated reward. Examples concern Andrew Ng’s work with autonomous helicopters (large, expensive ones, but not large enough to carry a human) around 2008 (details and video), in which a system learned to perform a difficult trick move, the chaos: a pirouetting flip in which the axis changes throughout, such that the helicopter makes a sphere in the air. Very few human experts who fly these model choppers can successfully complete the maneuver. The key idea here is that the system could infer what the human operator wanted (the goal) even though there were repeated failures and few successes. Note, the system also learned to complete more basic maneuvers that had not been achieved by earlier ML systems. A good explanation from 2018: Learning from humans: what is inverse reinforcement learning?
Another example is kinesthetic teaching, in which a system infers the goal from observing the movements of a robot arm that is controlled by a human. This kind of observing doesn’t mean watching, with machine vision, but rather experiencing the movements. I was reminded of this video (start at 2:32), in which a small robot arm constructs a model of itself by “flailing” — trying out all the possible ways it can move itself. (Although that’s not the same as having a human move the arm through the prescribed motions, it is a way to enable the robot to learn its own capabilities.)
Without an expert on hand to perform the tasks we want the robot to learn, we might train the system using feedback from an observer. Christian describes a groundbreaking AI safety project from 2017 in which the system would send random video clips to its human “evaluators,” and one of two video clips would be tagged as better than the other — like saying, “Here, you’re on the right track.” Again, a key aspect of this reinforcement training is that no score exists. Unlike playing a video game, the system cannot rack up points. I think it’s important to mention that the “robots” are performing within a simulation, with gravity and so on in force, so the video clips are recordings of the simulated robot in the simulated environment. Using this method, a robot was successfully trained to perform a backflip (paper).
This is exciting — if an AI system can learn “best behavior” by flailing and receiving feedback from human observers, maybe it’s possible to train for different kinds of tasks for which it would be impossible to write out explicit rules.
Cooperative inverse reinforcement learning (CIRL) takes into account the machine working with the human. This is almost super-alignment, because it’s not about getting the AI system to have your goal for itself but rather to achieve the goal you want for yourself. Christian’s effective example is a person reaching for a thing that is out of reach (me, with the high shelves in the supermarket): a robot doesn’t need to want the thing you’re reaching for. It should recognize its goal as getting for you what you can’t get for yourself. To achieve this, we’re going to need to deliberately teach the systems. “The insights of pedagogy and parenting are being quickly taken up by computer scientists,” Christian says (p. 270). This kind of learning also requires more interaction, more back-and-forth between the humans and the machine.
Christian raises a concern regarding these paired systems, our possible robot or software helpmates of the future: Not everything we want is good for us — and not everything a corporation wants us to do is good for us. Alignment with our desires might not be in alignment with our best interests.
9. Uncertainty. This chapter begins with the frightening story of a near-disaster in 1983, when a Soviet lieutenant colonel made a very human judgment call and likely saved the world from nuclear annihilation. The point is that computer systems (such as … nuclear-warning systems) are not perfect, and human intuition has more than once averted catastrophe.
There’s a relationship between adversarial attacks on image-recognition systems and the ability of researchers to create digital images that (to humans) show only a jumble of random pixels but that an AI system “recognizes” with 99 percent certainty as an ostrich, or a stop sign. Both types of error happen because the ML training process produces an ability to recognize patterns of pixels. The “open category problem” refers to the training process for these systems: They are trained to “recognize” some number of things in digital images — say 1,000 things, or 10,000 — but there are millions of things in the world. An image-recognition system is going to give you its best guess, but it only “knows” the things it was trained to know.
Getting the system to admit it does not know what a thing is — this is a kind of frontier in today’s research. If your system is recognizing pre-cancerous moles and you give it a photo of a pizza, you want it to say, “That’s probably not a mole at all.”
Bayesian neural networks were explored and sort of abandoned in the 1980s and ’90s because they couldn’t scale. Instead of a fixed weight on a connection for each unit in the neural net (as in a non–Bayesian NN), there would be a range of values for each weight. Because of the range, you might get a different output for the same input after training was completed. If most outputs matched, reliability would be high. Varied outputs (disagreement) would be akin to the system saying, “I don’t know what that is.” Note, researchers can model Bayesian NNs by training separate conventional models, separately. They run the same input through each model and compare the outputs. A lack of matching outputs: “We don’t know what that is.” A group of models like this is an ensemble. But — you don’t really need separately trained models (if I’m understanding this segment correctly); all you need to do is disable some layers of the trained NN, get your output, and then disable different layers and run the data again. This technique, known as dropout, goes all the way back to AlexNet in 2012 (p. 285). Apparently it’s just as effective for reporting uncertainty as a bona fide Bayesian NN. Christian calls it a dropout-based uncertainty measure, and it’s been effective for recognizing unhealthy human retinas and for regulating the speed of autonomous vehicles (in case of uncertainty, slow down).
When uncertainty is present, we want systems to be cautious. Some decisions are more weighty than others; Christian discusses the interpretation of a “do not resuscitate” order. (If in doubt, resuscitate.) He characterizes the challenge as “measuring impact,” and again we’re looking at a very human kind of judgment call, based on human experiences, ethics, and so on. What would be the impact of a bad call? This segment made me think of the Prime Directive in Star Trek. (Starfleet personnel are forbidden to interfere with the natural development of alien civilizations. It’s already interference if you’ve landed on their planet!) There’s also the question of whether the result is reversible (irreversible == higher impact, but some irreversible actions are trivial, e.g. the apple is gone after you’ve eaten it). Keeping options open can be important — and how do we train the AI system to see the options and choose among them? I loved the references to Sokoban, a game I’ve played on many different platforms — but like so many other toy examples, it’s ridiculously simple compared to the real world. See AI Safety Gridworlds (2017).
Then we come to intervention. If anything goes wrong with an artificial intelligence system (running amok!), we must be able to intervene, right? (See nuclear near-disaster, above.) This is called corrigibility. But pulling the plug is not the answer. That’s why this is in the Uncertainty chapter — ideally, the system would shut itself down if necessary, and if “necessary” is in question, alert the humans. This goes into an almost chicken-and-egg situation: Will the machine let the human intervene? What if the human should not be permitted to intervene? What if the machine’s uncertainty is too low? Too high? What if the humans’ end goals are not entirely clear? (Protect the world at all costs, or protect the Soviet Union and to hell with everyone else?)
Now the difficulty in designing explicit reward functions becomes life-or-death. If the goal is protect the Soviet Union (and there’s no human intervention), we’re all dead. AI researchers are trying, essentially, to model the intuition of the human lieutenant colonel who reasoned that it was highly unlikely that the U.S. had fired five nuclear missiles at his country at that time. Uncertainty and the imperfection of the world and the infinite number of possible situations that might arise. Researchers working on inverse reward design (IRD) are allowing the system to second-guess the goals.
The final segment of the chapter, “Moral Uncertainty,” looks at the question, “What is the right thing to do when you don’t know the right thing to do?” Given the example of sin in religious belief systems, sometimes the rule is crystal clear, and sometimes it’s not. Turns out there’s a book (open access). Christian has gone off the rails into philosophy here, although it’s certainly interesting. I liked the final two pages where the philosopher Nick Bostrom came up, offering reasons why this seemingly esoteric stuff is not mere navel-gazing but actually important.
The book’s Conclusion is unusual in that it is a kind of extension of several of the individual chapters — not a summary so much as “Here’s what to look out for, in the future.” My takeaways are that more and more researchers are focusing on AI ethics and safety, which must be a good thing; the world is continuously changing, so AI models will always need updating; this book was published in 2020, and how can I keep up with what has happened in this field since then?
I think it’s tremendously important for more people to have more understanding of what’s going on with AI development — not just products and threats and dangers, but what questions the researchers are asking and how they are trying to find answers.
Three factors that go into creating a new machine learning model:
Asking the right question: 25 percent
Data exploration and cleaning, feature engineering, feature selection: 50 percent
Training and evaluating the model: 25 percent
That’s according to Ellen Ambrose, director of AI at Protenus, a healthcare startup based in Baltimore, Maryland, and founded in 2014. She has a Ph.D. in neuroscience from Johns Hopkins University.
The article says: “Once a team has identified the right questions and has determined that the available data can answer those questions, the model needs to be configured.” I think this needs some reexamination: Once the team thinks they have the right question(s), and they think it’s likely that the available data can answer the question(s). Or even this: They think it’s likely that the available data can answer the question(s) adequately and with a high degree of accuracy.
Just as interface design has to be part of product development from the very beginning, a critical approach to the consequences of AI models must be in effect at every stage of the process. (The article doesn’t say this. This is me.) Your question might be harmful in ways you have not yet realized or acknowledged. Your data might contain imbalances or inadequacies that will not be apparent until you run wild data through your model.
The article lists three types of machine learning systems from which you might choose:
Support vector machine (SVM)
Gradient boosted forest*
* Wikipedia says: “Gradient boosting is a machine learning technique used in regression and classification tasks, among others. It gives a prediction model in the form of an ensemble of weak prediction models, which are typically decision trees. When a decision tree is the weak learner, the resulting algorithm is called gradient-boosted trees; it usually outperforms random forest” (source). This is new to me; I’ve learned about random forests but not gradient boosted forests.
“Some algorithms are just better suited for certain tasks,” the article says. I wonder whether this might be glossed over in some quickie data science boot camps and courses. If you’re using a tool to build a machine learning model, and the tool lets you choose from various options, do you know enough to choose the one best suited to your task?
Like many descriptions of training an ML model, this article briefly glides over the process of adjusting hyperparameters. It always bothers me when a few principles of statistics and probability are dropped into an article about machine learning as if they were not part of an entire field that existed before ML. The text quickly moves on to: Now that your model is trained, it’s ready to go!
A bit later in the article, it’s noted that we can invoke a a trained ML model “with a single Python call” in a Jupyter Notebook. This is in fact how many students supposedly “learn” machine learning — but what are they learning, really, when they simply plug a given dataset into a pre-existing model? I’d say it’s like using the cookie dough sold in the refrigerated section of the supermarket. Sure, you get fresh hot cookies from your oven, but what do you know about making cookies? (What’s in that pre-made dough?)
The idea that someone else has built, tested, trained this ML model (many someones, in fact, with tons of resources you don’t have), and now you can skip all that and just use the model to do what you need to do — sure, that seems great! “The developer could then apply a specific business logic to generate value from an idea without needing to worry about the details of how the model was built and trained,” the article says. Wonderful!
But … you know, that cookie dough could contain an ingredient you’re allergic to. You’re going to want to read the label carefully. Does that ML model you’re using have an ingredients label? Chances are, the answer is no. What’s inside the black box? We can always go back to Crawford and Paglen for an example of how badly this can go. Or look at the many examples Hannah Fry examined.
Unusually for an article of this kind, here we find an acknowledgment that real-world data is constantly changing (and increasing). The ML system — once deployed — will need tending and oversight. MLOps was a new term to me — inspired by DevOps (the collaboration among developers and IT professionals in all stages of the software development lifecycle), the term MLOps refers to collaboration among data scientists, ML engineers, developers and IT professionals to manage the lifecycle of a machine learning system (algorithms and hardware). How well is it working now that it’s out in the world, and its output is relied on for decision-making that affects real people’s lives?
At the end, the article promises that Ars Technica “will be running an entire series on creating, evaluating, and running AI models.” I’m going to be on the lookout for that, but I’m a little skeptical after reading this one. There’s nothing wrong in it, per se, but in my opinion it misrepresents the use of ML models as something simple and safe, ignoring all the ways that your lack of knowledge about the details can lead to flawed results and unreliable outcomes.
A fair number of people don’t know much about the role of graphics-processing hardware in the success of neural networks. A neural network is a collection of algorithms, but to crunch through the massive quantities of training data required by many AI systems — and get it done in a reasonable timeframe, instead of, say, years — you need both speed and parallelism. The term for this kind of computer-chip technology is accelerated computing, and Nvidia is the market leader.
Nvidia has ridden this wave to a current market value of $505 billion, according to the article. Five years ago, it was $31 billion. Nvidia both designs and manufactures the semiconductors for which it is famous. The original purpose of these chips was to run the graphics in modern computer games — the ones where characters race through immense, detailed 3D worlds. About half of Nvidia’s revenue still comes from chips designed for running game software.
“Huge, real-time models like those used for speech recognition or content recommendation increasingly need specialized GPUs to perform well, says Ian Buck, head of Nvidia’s accelerated-computing business.”
— Will Nvidia’s huge bet on artificial-intelligence chips pay off?
So what’s the “huge bet”? Nvidia is in the midst of acquiring Arm, a designer of other kinds of fast chips, which also have the appeal of being energy efficient. The deal may or may not go through — there are European and U.K. hurdles to leap (Arm is based in the U.K.). Essentially Nvidia seeks to expand its microprocessor repertoire. The article discusses the competition among chip firms such as Intel and Advanced Micro Devices (AMD) — and increasingly, the biggest tech firms (e.g. Google and Amazon/AWS) are getting into the chip-design business as well.
The Economist also produced a podcast episode about Nvidia and GPUs around the same time it published the article summarized above: Shall we play a game? How video games transformed AI (38 min.). It provides a friendly, low-stress introduction to neural networks and deep learning, going back to the perceptron, and covering the dominance in AI research of symbolic systems until the late 1980s. That’s the first 10 minutes. Then video games come into focus, and how so much technology innovation has come from computer game developments. Difference between CPUs and GPUs: around 13:00. Details about Nvidia’s programmable GPUs. Initial resistance (from research scientists) to using GPUs for serious AI work: around 20:00. Skepticism toward neural networks in the early 2000s. Andrew Ng’s group at Stanford demonstrates amazing speed increases in training time, using Nvidia GPUs. ImageNet challenge, AlexNet, the new rise of neural networks. In the final minutes, Nvidia’s future, chip technologies, and stock prices are discussed.
Sometimes people make a statement that an artificial intelligence system is a computer system that learns, or that learns on its own.
That is inaccurate. Machine learning is a subset of artificial intelligence, not the whole field. Machine learning systems are computer systems that learn from data. Other AI systems do not. Various systems are wholly programmed by humans to follow explicit rules and do not generate any code or instructions on their own.
The error probably arises from the fact that many of the exciting advances in AI since 2012 have involved some form of machine learning.
The recent successes of machine learning have much to do with neural networks, each of which is a system of algorithms that (in some respects) mimics the way neurons work in the brains of humans and other animals — but only in some respects. In other words, a neural network shares some features with human brains, but is not extremely similar to a human brain in all its complexity.
Advances in neural networks have been made possible not only by new algorithms (written by humans) but also by new computer hardware that did not exist in the earlier decades of AI development. The main advance concerns graphical processing units, commonly called GPUs. If you’ve noticed how computer games have evolved from simple flat pixel blocks (e.g. Pac-Man) to vast 3D worlds through which the player can fly or run at high speed, turning in different directions to view vast new landscapes, you can extrapolate how the advanced hardware has increased the speed of processing of graphical information by many orders of magnitude.
Without today’s GPUs, you can’t create a neural network that runs multiple algorithms in parallel fast enough to achieve the amazing things that AI systems have achieved. To be clear, the GPUs are just engines, powering the code that creates a neural network.
Another reason why AI has leapt onto the public stage recently is Big Data. Headlines alerted us to the existence and importance of Big Data a few years ago, and it’s tied to AI because how else could we process that ginormous quantity of data? If all we were doing with Big Data was adding sums, well, that’s no big deal. What businesses and governments and the military really want from Big Data, though, is insights. Predictions. They want to analyze very, very large datasets and discover information there that helps them control populations, make greater profits, manage assets, etc.
Big Data became available to businesses, governments, the military, etc., because so much that used to be stored on paper is now digital. As the general population embraced digital devices for everyday use (fitness, driving cars, entertainment, social media), we contributed even more data than we ever had before.
Very large language models (an aspect of AI that contributes to Google Translate, automatic subtitles on YouTube videos, and more) are made possible by very, very large collections of text that are necessary to train those models. Something I read recently that made an impression on me: For languages that do not have such extensive text corpuses, it can be difficult or even impossible to train an effective model. The availability of a sufficiently enormous amount of data is a prerequisite for creating much of the AI we hear and read about today.
If you ever wonder where all the data comes from — don’t forget that a lot of it comes from you and me, as we use our digital devices.
Perhaps the biggest misconception about AI is that machines will soon become as intelligent as humans, or even more intelligent than all of us. As a common feature in science fiction books and movies, the idea of a super-intelligent computer or robot holds a rock-solid place in our minds — but not in the real world. Not a single one of the AI systems that have achieved impressive results is actually intelligent in the way humans (even baby humans!) are intelligent.
The difference is that we learn from experience, and we are driven by curiosity and the satisfaction we get from experiencing new things — from not being bored. Every AI system is programmed to perform particular tasks on the data that is fed to it. No AI system can go and find new kinds of data. No AI system even has a desire to do so. If a system is given a new kind of data — say, we feed all of Wikipedia’s text to a face-recognition AI system — it has no capability to produce meaningful outputs from that new kind of input.
A couple of days ago, I wrote about Kaggle’s free introductory Python course. Then I started the next free course in the series: Intro to Machine Learning. The course consists of seven modules; the final module, like the last module in the Python course, shows you how to enter a Kaggle competition using the skills from the course.
The first module, “How Models Work,” begins with a simple decision tree, which is nice because (I think) everyone can grasp how that works, and how you add complexity to the tree to get more accurate answers. The dataset is housing data from Melbourne, Australia; it includes the type of housing unit, the number of bedrooms, and most important, the selling price (and other data too). The data have already been cleaned.
In the second module, we load the Python Pandas library and the Melbourne CSV file. We call one basic statistics function that is built into Pandas — describe() — and get a quick explanation of the output: count, mean, std (standard deviation), min, max, and the three quartiles: 25%, 50% (median), 75%.
When you do the exercise for the module, you can copy and paste the code from the lesson into the learner’s notebook.
The third module, “Your First Machine Learning Model,” introduces the Pandas columns attribute for the dataframe and shows us how to make a subset of column headings — thus excluding any data we don’t need to analyze. We use the dropna() method to eliminate rows that have missing data (this is not explained). Then we set the prediction target (y) — here it will be the Price column from the housing data. This should make sense to the learner, given the earlier illustration of the small decision tree.
y = df.Price
We use the previously created list of selected column headings (named features) to create X, the features of each house that will go into the decision tree model (such as the number of rooms, and the size of the lot).
X = df[features]
Then we build a model using Python’s scikit-learn library. Up to now, this will all be familiar to anyone who’s had an intro-to-Pandas course, particularly if the focus was data science or data journalism. I do like the list of steps given (building and using a model):
Define: What type of model will it be? A decision tree? Some other type of model? Some other parameters of the model type are specified too.
Fit: Capture patterns from provided data. This is the heart of modeling.
Since fit() and predict() are commands in scikit-learn, it begins to look like machine learning is just a walk in the park! And since we are fitting and predicting on the same data, the predictions are perfect! Never fear, that bubble will burst in module 4, “Model Validation,” in which the standard practice of splitting your data into a training set and a test set is explained.
First, though, we learn about predictive accuracy. Out of all the various metrics for summarizing model quality, we will use one called Mean Absolute Error (MAE). This is explained nicely using the housing prices, which is what we are attempting to predict: If the house sold for $150,000 and we predicted it would sell for $100,000, then the error is $150,000 minus $100,000, or $50,000. The function for MAE sums up all the errors and returns the mean.
This is where the lesson says, “Uh-oh! We need to split our data!” We use scikit-learn’s train_test_split() method, and all is well.
MAE shows us our model is pretty much crap, though. In the fifth module, “Underfitting and Overfitting,” we get a good explanation of the title topic and learn how to limit the number of leaf nodes at the end of our decision tree — DecisionTreeRegressor(max_leaf_nodes).
After all that, our model’s predictions are still crap — because a decision tree model is “not very sophisticated by modern machine learning standards,” the module text drolly explains. That leads us to the sixth module, “Random Forests,” which is nice for two reasons: (1) The explanation of a random forest model should make sense to most learners who have worked through the previous modules; and (2) We get to see that using a different model from scikit-learn is as simple as changing
my_model = DecisionTreeRegressor(random_state=1)
my_model = RandomForestRegressor(random_state=1)
Overall I found this a helpful course, and I think a lot of beginners could benefit from taking it — depending on their prior level of understanding. I would assume at least a familiarity with datasets as CSV files and a bit more than beginner-level Python knowledge.
Hello, Python: A quick introduction to Python syntax, variable assignment, and numbers
Functions and Getting Help: Calling functions and defining our own, and using Python’s builtin documentation
Booleans and Conditionals: Using booleans for branching logic
Lists: Lists and the things you can do with them. Includes indexing, slicing and mutating
Loops and List Comprehensions: For and while loops, and a much-loved Python feature: list comprehensions
Strings and Dictionaries: Working with strings and dictionaries, two fundamental Python data types
Working with External Libraries: Imports, operator overloading, and survival tips for venturing into the world of external libraries
Even though I’m an intermediate Python coder, I skimmed all the materials and completed the seven problem sets to see how they are teaching Python. The problems were challenging but reasonable, but the module on functions is not going to suffice for anyone who has little prior experience with programming languages. I see this in a lot of so-called introductory materials — functions are glossed over with some ready-made examples, and then learners have no clue how returns work, or arguments, etc.
At the end of the course, the learner is encouraged to join a Kaggle competition using the Titanic passengers dataset. However, the learner is hardly prepared to analyze the Titanic data at this point, so really this is just an introduction to how to use files provided in a competition, name your notebook, save your work, and submit multiple attempts. The tutorial gives you all the code to run a basic model with the data, so it’s really more a demo than a tutorial.
The basic idea: Immediately detect and remove hateful or dangerous posts in social media and other online forums. With advances in natural language processing (NLP), identification of harmful speech becomes more accurate and more practical.
In this essay published in Scientific American (2021), researchers from the private company Unitary (see their public Detoxify code on GitHub) discuss the challenges in rating the level of toxicity or harmfulness in text content. One aspect is what is considered harmful: profanity is easy to detect; misinformation is complicated. Another aspect: Terms describing gender, race, or ethnicity can be used hatefully or as (non-toxic) self-description.
(I’ve written before about machine learning used in comment moderation, which is a large concern in media companies that permit users to post comments on articles and blog posts.)
Jigsaw, a Google division, “released two public data sets containing over one million toxic and non-toxic comments from Wikipedia and a service called Civil Comments.” Each comment was labeled with a rating such as “Toxic” or “Very Toxic.” The data sets were used as training data in three competitions, hosted by Google, in which AI researchers could enter their trained models and see how they compared to others (and win money). The three “Jigsaw challenges” (one per year):
“We decided to take inspiration from the best Kaggle solutions and train our own algorithms with the specific intent of releasing them publicly.”
— Unitary researchers
The Unitary researchers describe Detoxify, “an open-source, user-friendly comment detection library,” which is intended “to help researchers and practitioners identify potential toxic comments.” The library includes three separate models, one for each Jigsaw challenge. These models can be fine-tuned using additional data sets.
One particular limitation pointed out by the researchers is that a high toxicity score does not always indicate actually toxic content: “As an example, the sentence ‘I am tired of writing this stupid essay’ will give a toxicity score of 99.7 percent, while removing the word ‘stupid’ will change the score to 0.05 percent.”
There’s still a long way to go before harmful comments and social media posts can be instantly removed from platforms.