learning – AI in Media and Society

AI literacy for everyone

April 16, 2022April 17, 2022 Mindy McAdams

My university has undertaken a long-term initiative called “AI across the curriculum.” I recently saw a presentation that referred to this article: Conceptualizing AI literacy: An exploratory review (2021; open access). The authors analyzed 30 publications (all peer-reviewed; 22 conference papers and eight journal articles; 2016–2021). Based in part on their findings, my university proposes to tag each AI course as fitting into one or more of these categories:

Know and understand AI
Use and apply AI
Evaluate and create AI
AI ethics

“Most researchers advocated that instead of merely knowing how to use AI applications, learners should learn about the underlying AI concepts for their future careers and understand the ethical concerns in order to use AI responsibly.”
— Ng, Leung, Chu and Qiao (2021)

AI literacy was never explicitly defined in any of the articles, and assessment of the approaches used was rigorous in only three of the studies represented among the 30 publications. Nevertheless, the article raises a number of concerns for education of the general public, as well as K–12 students and non–computer science students in universities.

Not everyone is going to learn to code, and not everyone is going to build or customize AI systems for their own use. But just about everyone is already using Google Translate, automated captions on YouTube and Zoom, content recommendations and filters (Netflix, Spotify), and/or voice assistants such as Siri and Alexa. People in far more situations than they know are subject to face recognition, and decisions about their loans, job applications, college admissions, health, and safety are increasingly affected (to some degree) by AI systems.

That’s why AI literacy matters. “AI becomes a fundamental skill for everyone” (Ng et al., 2021, p. 9). People ought to be able to raise questions about how AI is used, and knowing what to ask, or even how to ask, depends on understanding. I see a critical role for journalism in this, and a crying need for less “It uses AI!” cheerleading (*cough* Wall Street Journal) and more “It works like this” and “It has these worrisome attributes.”

In education (whether higher, secondary, or primary), courses and course modules that teach students to “know and understand AI” are probably even more important than the ones where students open up a Google Colab notebook, plug in some numbers, and get a result that might seem cool but is produced as if by sorcery.

Five big ideas about AI

This paper led me to another, Envisioning AI for K-12: What Should Every Child Know about AI? (2019, open access), which provides a list of five concise “big ideas” in AI:

“Computers perceive the world using sensors.” (Perceive is misleading. I might say receive data about the world.)
“Agents maintain models/representations of the world and use them for reasoning.” (I would quibble with the word reasoning here. Prediction should be specified. Also, agents is going to need explaining.)
“Computers can learn from data.” (We need to differentiate between how humans/animals learn and how machines “learn.”)
“Making agents interact comfortably with humans is a substantial challenge for AI developers.” (This is a very nice point!)
“AI applications can impact society in both positive and negative ways.” (Also excellent.)

Each of those is explained further in the original paper.

The “big ideas” get closer to a general concept for AI literacy — what does one need to understand to be “literate” about AI? I would argue you don’t need to know how to code, but you do need to understand that code is written by humans to tell computer systems what to do and how to do it. From that, all kinds of concepts stem; for example, when “sensors” (cameras) send video into the computer system, how does the system read the image data? How different is that from the way the human brain processes visual information? Moreover, “what to do and how to do it” changes subtly for machine learning systems, and I think first understanding how explicit a non–AI program needs to be helps you understand how the so-called learning in machine learning works.

A small practical case

A colleague who is a filmmaker recently asked me if the automated transcription software he and his students use is AI. I think this question opens a door to a low-stakes, non-threatening conversation about AI in everyday work and life. Two common terms used for this technology are automatic speech recognition (ASR) and speech-to-text (STT). One thing my colleague might not realize is that all voice assistants, such as Siri and Alexa, use a version of this technology, because they cannot “know” what a person has said until the sounds are transformed into text.

The serious AI work took place before there was an app that filmmakers and journalists (and many other people) routinely use to transcribe interviews. The app or product they use is plug-and-play — it doesn’t require a powerful supercomputer to run. Just play the audio, and text is produced. The algorithms that make it work so well, however, were refined by an impressive amount of computational power, an immense quantity of voice data, and a number of computer scientists and engineers.

So if you ask whether these filmmakers and journalists “are using AI” when they use a software program to automatically transcribe the audio from their interviews, it’s not entirely wrong to say yes, they are. Yet they can go about their work without knowing anything at all about AI. As they use the software repeatedly, though, they will learn some things — such as, the transcription quality will be poorer for voices speaking English with an accent, and often for people with higher-pitched voices, like women and children. They will learn that acronyms and abbreviations are often transcribed inaccurately.

The users of transcription apps will make adjustments and carry on — but I think it would be wonderful if they also understood something about why their software tool makes exactly those kinds of mistakes. For example, the kinds of voices (pitch, tone, accents, pronunciation) that the system was trained on will affect whose voices are transcribed most accurately and whose are not. Transcription by a human is still preferred in some cases.

AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.

Explaining common misconceptions about AI

August 27, 2021August 27, 2021 Mindy McAdams

Sometimes people make a statement that an artificial intelligence system is a computer system that learns, or that learns on its own.

That is inaccurate. Machine learning is a subset of artificial intelligence, not the whole field. Machine learning systems are computer systems that learn from data. Other AI systems do not. Various systems are wholly programmed by humans to follow explicit rules and do not generate any code or instructions on their own.

The error probably arises from the fact that many of the exciting advances in AI since 2012 have involved some form of machine learning.

The recent successes of machine learning have much to do with neural networks, each of which is a system of algorithms that (in some respects) mimics the way neurons work in the brains of humans and other animals — but only in some respects. In other words, a neural network shares some features with human brains, but is not extremely similar to a human brain in all its complexity.

Advances in neural networks have been made possible not only by new algorithms (written by humans) but also by new computer hardware that did not exist in the earlier decades of AI development. The main advance concerns graphical processing units, commonly called GPUs. If you’ve noticed how computer games have evolved from simple flat pixel blocks (e.g. Pac-Man) to vast 3D worlds through which the player can fly or run at high speed, turning in different directions to view vast new landscapes, you can extrapolate how the advanced hardware has increased the speed of processing of graphical information by many orders of magnitude.

Without today’s GPUs, you can’t create a neural network that runs multiple algorithms in parallel fast enough to achieve the amazing things that AI systems have achieved. To be clear, the GPUs are just engines, powering the code that creates a neural network.

More about the role of GPUs in today’s AI: Computational Power and the Social Impact of Artificial Intelligence (2018), by Tim Hwang.

Another reason why AI has leapt onto the public stage recently is Big Data. Headlines alerted us to the existence and importance of Big Data a few years ago, and it’s tied to AI because how else could we process that ginormous quantity of data? If all we were doing with Big Data was adding sums, well, that’s no big deal. What businesses and governments and the military really want from Big Data, though, is insights. Predictions. They want to analyze very, very large datasets and discover information there that helps them control populations, make greater profits, manage assets, etc.

Big Data became available to businesses, governments, the military, etc., because so much that used to be stored on paper is now digital. As the general population embraced digital devices for everyday use (fitness, driving cars, entertainment, social media), we contributed even more data than we ever had before.

Very large language models (an aspect of AI that contributes to Google Translate, automatic subtitles on YouTube videos, and more) are made possible by very, very large collections of text that are necessary to train those models. Something I read recently that made an impression on me: For languages that do not have such extensive text corpuses, it can be difficult or even impossible to train an effective model. The availability of a sufficiently enormous amount of data is a prerequisite for creating much of the AI we hear and read about today.

If you ever wonder where all the data comes from — don’t forget that a lot of it comes from you and me, as we use our digital devices.

Perhaps the biggest misconception about AI is that machines will soon become as intelligent as humans, or even more intelligent than all of us. As a common feature in science fiction books and movies, the idea of a super-intelligent computer or robot holds a rock-solid place in our minds — but not in the real world. Not a single one of the AI systems that have achieved impressive results is actually intelligent in the way humans (even baby humans!) are intelligent.

The difference is that we learn from experience, and we are driven by curiosity and the satisfaction we get from experiencing new things — from not being bored. Every AI system is programmed to perform particular tasks on the data that is fed to it. No AI system can go and find new kinds of data. No AI system even has a desire to do so. If a system is given a new kind of data — say, we feed all of Wikipedia’s text to a face-recognition AI system — it has no capability to produce meaningful outputs from that new kind of input.

Intro to Machine Learning course

August 1, 2021 Mindy McAdams

A couple of days ago, I wrote about Kaggle’s free introductory Python course. Then I started the next free course in the series: Intro to Machine Learning. The course consists of seven modules; the final module, like the last module in the Python course, shows you how to enter a Kaggle competition using the skills from the course.

The first module, “How Models Work,” begins with a simple decision tree, which is nice because (I think) everyone can grasp how that works, and how you add complexity to the tree to get more accurate answers. The dataset is housing data from Melbourne, Australia; it includes the type of housing unit, the number of bedrooms, and most important, the selling price (and other data too). The data have already been cleaned.

In the second module, we load the Python Pandas library and the Melbourne CSV file. We call one basic statistics function that is built into Pandas — describe() — and get a quick explanation of the output: count, mean, std (standard deviation), min, max, and the three quartiles: 25%, 50% (median), 75%.

When you do the exercise for the module, you can copy and paste the code from the lesson into the learner’s notebook.

The third module, “Your First Machine Learning Model,” introduces the Pandas columns attribute for the dataframe and shows us how to make a subset of column headings — thus excluding any data we don’t need to analyze. We use the dropna() method to eliminate rows that have missing data (this is not explained). Then we set the prediction target (y) — here it will be the Price column from the housing data. This should make sense to the learner, given the earlier illustration of the small decision tree.

y = df.Price

We use the previously created list of selected column headings (named features) to create X, the features of each house that will go into the decision tree model (such as the number of rooms, and the size of the lot).

X = df[features]

Then we build a model using Python’s scikit-learn library. Up to now, this will all be familiar to anyone who’s had an intro-to-Pandas course, particularly if the focus was data science or data journalism. I do like the list of steps given (building and using a model):

Define: What type of model will it be? A decision tree? Some other type of model? Some other parameters of the model type are specified too.
Fit: Capture patterns from provided data. This is the heart of modeling.
Predict: Just what it sounds like.
Evaluate: Determine how accurate the model’s predictions are. (List quoted from Kaggle course.)

Since fit() and predict() are commands in scikit-learn, it begins to look like machine learning is just a walk in the park! And since we are fitting and predicting on the same data, the predictions are perfect! Never fear, that bubble will burst in module 4, “Model Validation,” in which the standard practice of splitting your data into a training set and a test set is explained.

First, though, we learn about predictive accuracy. Out of all the various metrics for summarizing model quality, we will use one called Mean Absolute Error (MAE). This is explained nicely using the housing prices, which is what we are attempting to predict: If the house sold for $150,000 and we predicted it would sell for $100,000, then the error is $150,000 minus $100,000, or $50,000. The function for MAE sums up all the errors and returns the mean.

This is where the lesson says, “Uh-oh! We need to split our data!” We use scikit-learn’s train_test_split() method, and all is well.

MAE shows us our model is pretty much crap, though. In the fifth module, “Underfitting and Overfitting,” we get a good explanation of the title topic and learn how to limit the number of leaf nodes at the end of our decision tree — DecisionTreeRegressor(max_leaf_nodes).

After all that, our model’s predictions are still crap — because a decision tree model is “not very sophisticated by modern machine learning standards,” the module text drolly explains. That leads us to the sixth module, “Random Forests,” which is nice for two reasons: (1) The explanation of a random forest model should make sense to most learners who have worked through the previous modules; and (2) We get to see that using a different model from scikit-learn is as simple as changing

my_model = DecisionTreeRegressor(random_state=1)

my_model = RandomForestRegressor(random_state=1)

Overall I found this a helpful course, and I think a lot of beginners could benefit from taking it — depending on their prior level of understanding. I would assume at least a familiarity with datasets as CSV files and a bit more than beginner-level Python knowledge.

Free courses at Kaggle

July 30, 2021 Mindy McAdams

I recently found out that Kaggle has a set of free courses for learning AI skills.

The first course is an introduction to Python, and these are the course modules:

Hello, Python: A quick introduction to Python syntax, variable assignment, and numbers
Functions and Getting Help: Calling functions and defining our own, and using Python’s builtin documentation
Booleans and Conditionals: Using booleans for branching logic
Lists: Lists and the things you can do with them. Includes indexing, slicing and mutating
Loops and List Comprehensions: For and while loops, and a much-loved Python feature: list comprehensions
Strings and Dictionaries: Working with strings and dictionaries, two fundamental Python data types
Working with External Libraries: Imports, operator overloading, and survival tips for venturing into the world of external libraries

Even though I’m an intermediate Python coder, I skimmed all the materials and completed the seven problem sets to see how they are teaching Python. The problems were challenging but reasonable, but the module on functions is not going to suffice for anyone who has little prior experience with programming languages. I see this in a lot of so-called introductory materials — functions are glossed over with some ready-made examples, and then learners have no clue how returns work, or arguments, etc.

At the end of the course, the learner is encouraged to join a Kaggle competition using the Titanic passengers dataset. However, the learner is hardly prepared to analyze the Titanic data at this point, so really this is just an introduction to how to use files provided in a competition, name your notebook, save your work, and submit multiple attempts. The tutorial gives you all the code to run a basic model with the data, so it’s really more a demo than a tutorial.

My main interest is in the machine learning course, which I’ll begin looking at today.

What is a neural network and how does it work?

September 10, 2020June 3, 2021 Mindy McAdams

The most wonderful thing about YouTube is you can use it to learn just about anything.

One of the 10,000 annoying things about YouTube is finding a good, satisfying version of the lesson you want to learn can take hours of searching. This is especially true of videos about technical aspects of machine learning. Of course there are one- and two-hour recordings of course lectures by computer science professors. But I’ve been seeking out shorter videos with more animations and illustrations of concepts.

Understanding what a neural network is and how it processes data is necessary to demystifying machine learning. Data goes in, results come out — but in between is a “black box” consisting of code and hardware. It sort of works like a human brain, and yet, it really doesn’t.

So here at last is a painless, math-free video that walks us through a neural network. The particular example shown uses the MNIST dataset, which consists of 70,000 images of handwritten digits, 0–9. So the task being performed is the recognition of those digits. (This kind of system can be used to sort mail using postal codes, for example.)

What you’ll see is how the first layer (a vertical line of circles on the left side) represents the input. If each of the MNIST images is 28 pixels wide by 28 pixels high, then that first layer has to represent 784 pixels and each of their color values — which is a number. (One image is the input — only one at a time.)

The final vertical layer, all the way to right side, is the output of the neural network. In this example, the output tells us which digit was in the input — 0, 1, 2, etc. To see the value in this, go back to the mail-sorting idea. If a system can read postal codes, it recognizes several numbers and then transmits them to another system that “knows” which postal code goes to which geographical location. My letter gets sorted into the Florida bin and yours into the bin for your home.

In between the input and the output are the vertical “hidden” layers, and that’s where the real work gets done. In the video you’ll see that the number of circles — often called neurons, but they can also be called just units — in a hidden layer might well be less than the number of units in the input layer. The number of units in the output layer can also differ from the numbers in other layers.

When the video describes edge detection, you might recall an earlier post here.

Beautifully, during an animation, our teacher Grant Sanderson explains and shows that the weights exist not in or on the units (the “neurons”) but in fact in or on the connections between the units.

Okay, I lied a little. There is some math shown here. The weight assigned to the connection is multiplied by the value of the unit to the left. The results are all summed, for all left-side units, and that sum is assigned to the unit to the right (meaning the right side of that one connection).

The video bogs down just a bit between the Sigmoid squishification function and applying the bias, but all you really need to grasp is that the value of the right-side unit shows whether or not that little region of the image (in this case, it’s an image) has a significant difference. The math is there to determine if the color, the amount of color, is significant enough to count. And how much it should count.

I know — math, right?

But seriously, watch the video. It’s excellent.

“And that’s a lot to think about! With this hidden layer of 16 neurons, that’s a total of 784 times 16 weights, along with 16 biases. And all of that is just the connections from the first layer to the second.”
—Grant Sanderson, But what is a neural network? (video)

Sanderson doesn’t burden us with the details of the additional layers. Once you’ve seen the animations for that first step — from the input layer through the connections to the first hidden layer — you’ll have a real appreciation for what’s happening under the hood in a neural network.

In the final 6 minutes of this 19-minute video, you’ll also learn how the “learning” takes place in machine learning when a neural net is involved. All those weights and bias values? They are not determined by humans.

“Digging into what the weights and biases are doing is a good way to challenge your assumptions and really expose the full space of possible solutions.”
—Grant Sanderson, But what is a neural network? (video)

I confess it does get rather mathy at the end, but hang on through the parts that are beyond your personal math background and listen to what Sanderson is telling us. You can get a lot out of it even if the equation itself is like hieroglyphics to you.

The video content ends at 16:26, followed by the usual “subscribe to my channel” message. More info about Sanderson and his excellent videos is on his website, 3Blue1Brown.

Sorting out a degree in artificial intelligence

September 9, 2020 Mindy McAdams

Reading course descriptions and degree plans has helped me understand more about the fields of artificial intelligence and data science. I think some universities have whipped up a program in one of these hot fields of study just to put something on the books. It’s quite unfair to students if this is just a collection of existing courses and not a deliberate, well structured path to learning.

I came across this page from Northeastern University that attempts to explain the “difference” between artificial intelligence and machine learning. (I use those quotation marks because machine learning is a subset of artificial intelligence.) The university has two different master’s degree programs for artificial intelligence; neither one has “machine learning” in its name — but read on!

One of the two programs does not require a computer science undergraduate degree. It covers data science, robotics, and machine learning.

The other master’s program is for students who do have a background in computer science. It covers “robotic science and systems, natural language processing, machine learning, and special topics in artificial intelligence.”

I noticed that data science is in the program for those without a computer science background, while it’s not mentioned in the other program. This makes sense if we understand that data science and machine learning really go hand in hand nowadays. A data scientist likely will not develop any new machine learning systems, but she will almost certainly use machine learning to solve some problems. Training in statistics is necessary so that one can select the best algorithm for use in machining learning for solving a particular problem.

Graduates of the other program, with their prior experience in computer science, should be ready to break ground with new and original AI work. They are not going to analyze data for firms and organizations. Instead, they are going to develop new systems that handle data in new ways.

The distinction between these two degree programs highlights a point that perhaps a lot of people don’t yet understand: people (like journalists who have code experience) are training models — using machine learning systems through writing code to control them — and yet they are not people who create new machine learning systems.

Separately there are developers who create new AI software systems, and engineers who create new AI hardware systems. In other words, there are many different roles in the AI field.

Finally, there are so-called AI systems sold to banks and insurance companies, and many other types of firms, for which the people using the system do not write code at all. Using them requires data to be entered, and results are generated (such as whose insurance rates will go up next year). The workers who use these systems don’t write code any more than an accountant writes code. Moreover, they can’t explain how the system works — they need only know what goes in and what comes out.

Comment moderation as a machine learning case study

September 7, 2020September 7, 2020 Mindy McAdams

Continuing my summary of the lessons in Introduction to Machine Learning from the Google News Initiative, today I’m looking at Lesson 5 of 8, “Training your Machine Learning model.” Previous lessons were covered here and here.

Now we get into the real “how it works” details — but still without looking at any code or computer languages.

The “lesson” (actually just a text) covers a common case for news organizations: comment moderation. If you permit people to comment on articles on your site, machine learning can be used to identify offensive comments and flag them so that human editors can review them.

With supervised learning (one of three approaches included in machine learning; see previous post here), you need labeled data. In this case, that means complete comments — real ones — that have already been labeled by humans as offensive or not. You need an equally large number of both kinds of comments. Creating this dataset of comments is discussed more fully in the lesson.

You will also need to choose a machine learning algorithm. Comments are text, obviously, so you’ll select among the existing algorithms that process language (rather than those that handle images and video). There are many from which to choose. As the lesson comes from Google, it suggests you use a Google algorithm.

In all AI courses and training modules I’ve looked at, this step is boiled down to “Here, we’ll use this one,” without providing a comparison of the options available. This is something I would expect an experienced ML practitioner to be able to explain — why are they using X algorithm instead of Y algorithm for this particular job? Certainly there are reasons why one text-analysis algorithm might be better for analyzing comments on news articles than another one.

What is the algorithm doing? It is creating and refining a model. The more accurate the final model is, the better it will be at predicting whether a comment is offensive. Note that the model doesn’t actually know anything. It is a computer’s representation of a “world” of comments in which some — with particular features or attributes perceived in the training data — are rated as offensive, and others — which lack a sufficient quantity of those features or attributes — are rated as not likely to be offensive.

The lesson goes on to discuss false positives and false negatives, which are possibly unavoidable — but the fewer, the better. We especially want to eliminate false negatives, which are offensive comments not flagged by the system.

“The most common reason for bias creeping in is when your training data isn’t truly representative of the population that your model is making predictions on.”
—Lesson 6, Bias in Machine Learning

Lesson 6 in the course covers bias in machine learning. A quick way to understand how ML systems come to be biased is to consider the comment-moderation example above. What if the labeled data (real comments) included a lot of comments offensive to women — but all of the labels were created by a team of men, with no women on the team? Surely the men would miss some offensive comments that women team members would have caught. The training data are flawed because a significant number of comments are labeled incorrectly.

There’s a pretty good video attached to this lesson. It’s only 2.5 minutes, and it illustrates interaction bias, latent bias, and selection bias.

Lesson 6 also includes a list of questions you should ask to help you recognize potential bias in your dataset.

It was interesting to me that the lesson omits a discussion of how the accuracy of labels is really just as important as having representative data for training and testing in supervised learning. This issue is covered in ImageNet and labels for data, an earlier post here.

Examples of machine learning in journalism

September 4, 2020April 4, 2022 Mindy McAdams

Following on from yesterday’s post, today I looked at more lessons in Introduction to Machine Learning from the Google News Initiative. (Friday AI Fun posts will return next week.)

The separation of machine learning into three different approaches — supervised learning, unsupervised learning, and reinforcement learning — is standard (Lesson 3). In keeping with the course’s focus on journalism applications of ML, the example given for supervised learning is The Atlanta Journal-Constitution‘s deservedly famous investigative story about sex abuse of patients by doctors. Supervised learning was used to sort more than 100,000 disciplinary reports on doctors.

The example of unsupervised learning is one I hadn’t seen before. It’s an investigation of short-term rentals (such as Airbnb rentals) in Austin, Texas. The investigator used locality-sensitive hashing (LSH) to group property records in a set of about 1 million documents, looking for instances of tax evasion.

The main example given for reinforcement learning is AlphaGo (previously covered in this blog), but an example from The New York Times — How The New York Times Is Experimenting with Recommendation Algorithms — is also offered. Reinforcement learning is typically applied when a clear “reward” can be identified, which is why it’s useful in training an AI system to play a game (winning the game is a clear reward). It can also be used to train a physical robot to perform specified actions, such as pouring a liquid into a container without spilling any.

Also in Lesson 3, we find a very brief description of deep learning (it doesn’t mention layers and weights). and just a mention of neural networks.

“What you should retain from this lesson is fairly simple: Different problems require different solutions and different ML approaches to be tackled successfully.”
—Lesson 3, Different approaches to Machine Learning

Lesson 4, “How you can use Machine Learning,” might be the most useful in this set of eight lessons. Its content comes (with permission) from work done by Quartz AI Studio — specifically from the post How you’re feeling when machine learning might help, by the super-talented Jeremy B. Merrill.

The examples in this lesson are really good, so maybe you should just read it directly. You’ll learn about a variety of unusual stories that could only be told when journalists used machine learning to augment their reporting.

“Machine learning is not magic. You might even say that it can’t do anything you couldn’t do — if you just had a thousand tireless interns working for you.”
—Lesson 4, How you can use Machine Learning

(The Quartz AI Studio was created with a $250,000 grant from the Knight Foundation in 2018. For a year the group experimented, helped several news organizations produce great work, and ran a number of trainings for journalists. Then it was quietly disbanded in early 2020.)

Note (added April 4, 2022): The two links above to Quartz AI Studio content have been updated. The original domain, qz-dot-ai, was given up when, at renewal time, the price of all dot-ai domains had skyrocketed. Unfortunately, all the images have been lost, according to a personal communication from Merrill.

Google’s machine learning ‘course’ for journalists

September 3, 2020 Mindy McAdams

I couldn’t resist dipping into this free course from the Google News Initiative, and what I found surprised me: eight short lessons that are available as PDFs.

The good news: The lessons are journalism-focused, and they provide a painless introduction to the subject. The bad news: This is not really a course or a class at all — although there is one quiz at the end. And you can get a certificate, for what it’s worth.

There’s a lot here that many journalists might not be aware of, and that’s a plus. You get a brief, clear description of Reuters’ News Tracer and Lynx Insight tools, both used in-house to help journalists discover new stories using social media or other data (Lesson 1). A report I recall hearing about — how automated real-estate stories brought significant new subscription revenue to a Swedish news publisher — is included in a quick summary of “robot reporting” (also Lesson 1).

Lesson 2 helpfully explains what machine learning is without getting into technical operations of the systems that do the “learning.” They don’t get into what training a model entails, but they make clear that once the model exists, it is used to make predictions. The predictions are not like what some tarot-card reader tells you but rather probability-based results that the model is able to produce, based on its prior training.

Noting that machine learning is a subset of the wider field called artificial intelligence is, of course, accurate. What is inaccurate is the definition “specific applications that use data to train a model to perform a given task independently and learn from experience.” They left out Q-learning, a type of reinforcement learning (a subset of machine learning), which does not use a model. It’s okay that they left it out, but they shouldn’t imply that all machine learning requires a trained model.

The explosion of machine learning and AI in the past 10 years is explained nicely and concisely in Lesson 2. The lesson also touches on misconceptions and confusion surrounding AI:

“The lack of an officially agreed definition, the legacy of science-fiction, and a general low level of literacy on AI-related topics are all contributing factors.”
—Lesson 2, Is Machine Learning the same thing as AI?

I’ll be looking at Lessons 3 and 4 tomorrow.

Free courses in machine learning

September 1, 2020September 5, 2020 Mindy McAdams

Two days ago, I came upon this newly published course from FastAI: Practical Deep Learning for Coders. I actually stumbled across it via a video on YouTube, which I’ve watched now, and it made me feel optimistic about the course. I’m in the middle of the CS50 AI course from Harvard, though, so I need to hold off on the FastAI course for now.

The first video got me thinking.

First, they said (as many others have said) that Python is the main programming language used for machine learning today. (This makes me happy, as I know Python.) But I wonder whether there’s more to this claim than I’m aware of.

Second, they said PyTorch has superseded TensorFlow as the framework of choice for machine learning. They said PyTorch is “much easier to use and much more useful for researchers.”

“Within the last 12 months, the percentage of papers at major conferences that use PyTorch has gone from 20 percent to 80 percent and vice versa — those that use TensorFlow have gone from 80 percent to 20 percent.”
—Jeremy Howard, in the FastAI video “Lesson 1 – Deep Learning for Coders (2020)”

Note, I don’t know if this is true. But it caught my attention.

FastAI is a library “that sits on top of PyTorch,” they explain. They say it is “the most popular higher-level API for PyTorch,” and it removes a lot of the struggle necessary to get started with PyTorch.

This leads me back to the CS50 AI course. The CS50 phenomenon was documented in The New Yorker in July 2020. One insanely popular course, Introduction to Computer Science, has spawned multiple follow-on courses, including the seven-module course about the principles of artificial intelligence in which I am currently enrolled (not for credit).

In the various online forums and Facebook groups devoted to CS50, you can see a lot of people asking whether they need to take the intro course prior to starting the AI course. Some of them admit they have never programmed before. They know nothing about coding. But they think they might take an AI programming course as their very first computer science course.

This is what I was thinking about as the speakers in the FastAI video both praised how easy FastAI makes it to train a model and cautioned that machine learning is not a task for code newbies.

Training and testing a machine learning system — a system that will make predictions to be used in some industry, some social context, where human lives might be affected — should not be dependent on someone who learned how to do it in one online course.

Some other free online courses:

Amazon’s Machine Learning University
Google News Initiative: Introduction to Machine Learning
News Algorithms: The Impact of Automation and AI on Journalism (Knight Center for Journalism in the Americas)