The companies that are making AI a hot investment

Sometimes I read something that is like a voice out of my own head:

“Artificial intelligence is a buzzword increasingly being used by companies around the world that seek to project themselves at the forefront of cutting-edge research … As the word loses its meaning, it is important for investors to understand what artificial intelligence is and what companies stand to gain from breakthroughs in the new technology.”

Yahoo! Finance, April 12, 2021

That comes from an article titled “10 Best Artificial Intelligence Stocks to Buy for 2021” (link above) but it’s more than just a list of stock tips. It points out that “technology firms with social media services” (e.g. Facebook) are hot because they have the massive datasets that power machine learning about consumers. Companies that make super-fast computer hardware — particularly graphics processing units (GPUs) that crunch through that data — are also good bets (although I’ve heard about growing hardware shortages due to the pandemic).

The article’s author refers to hedge-fund investments as an indicator, which might make me leery about investing my own hard-earned cash, but the list of companies still interested me. Along with hardware manufacturers such as Micron Technology and Nvidia; Amazon, which is valuable for more than just its growing AI expertise; and Alphabet Inc., the parent of Google and DeepMind — the list also includes:

  • Adobe, which is “integrating data-based learning into most of its software through Adobe Sensei, a tool that uses artificial intelligence to improve user experiences across a wide range of Adobe products.”
  • Facebook — this is Yahoo! Finance’s No. 1 pick, and with its deep pockets, Facebook is certainly able to acquire some of the best research minds in AI today. Its efforts are grouped under the Facebook AI label, and the breadth of its work is visible on this page.
  • IBM — this is a recommendation I would argue with. IBM talks a big game in AI, but its failures with IBM Watson Health make me skeptical about its strategies overall.
  • Microsoft, which “has a separate artificial intelligence unit called Microsoft AI that helps users, organizations, and governments across the world with machine learning, data analytics, robotics, and internet of things products.” Just this week, Microsoft announced a $16 billion cash deal to buy Nuance, which develops AI software including speech-recognition products (Dragon is one). Microsoft pointed to Nuance’s position in the healthcare market as a primary reason for the acquisition.
  • Pinterest, because it is using AI to sort and categorize the millions of images shared by its users and also to “tailor the experiences” of users. Note, news organizations such as The New York Times are also using AI to determine how content is presented to users.
  • Salesforce.com, which “provides customer relationship management services and other enterprise solutions on market automation, data analytics, and application development.” The company markets its AI products under the Einstein brand — see AI use cases from the company. Salesforce acquired Slack Technologies last year.

Notably absent from the list is Apple (although maybe not a great investment, due to its high valuation), which is no newcomer to incorporating AI into its products. Critics might pooh-pooh Apple’s AI clout, but machine learning has been integral to the iPhone, iPad, and Apple Watch for years. Ars Technica published an excellent article about this in mid-2020.

Another absence is the assorted promising startups — particularly those in the climate arena and those founded by alumni of DeepMind, which to me is the most fantastic incubator of AI talent (see AlphaFold) outside the top universities. Just this week, Google put money into one of those startups — founded by a former research engineer at DeepMind, and “focused on reducing greenhouse gas emissions.”

Creative Commons License
AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.


Loving the spaCy tutorial for NLP

I got my first look at spaCy, a Python library for natural language processing, near the end of 2019. I wanted to learn it but had too many other things to do. Fast-forward to now, almost 14 months into the pandemic, and I recently stumbled across spaCy’s own tutorial for learning to use the library.

The interactive tutorial includes videos, slides, and code exercises, and there is a GitHub repo. It is available in English, German, Spanish, French, Portuguese, Japanese, and Chinese. Today I completed chapter 2. If you already know Python at, say, an intermediate level, check it out!

Screenshot from Jupyter Notebook showing named entities
Trying out spaCy’s displaCy module and named entities.

In chapter 1 (there are four chapters), I got a handle on part-of-speech tags, syntactic dependencies, and named entities. I learned that we can search on these, and also on words (tokens) related to combinations that we define. I’ve known about large-scale document searches (where a huge collection of documents is searched programmatically, usually to extract the most meaningful docs for some purpose — like a journalism investigation), and now I was getting a much better idea of how such searches can be designed.

spaCy provides “pre-trained model packages,” meaning someone else has already done the hard work of machine learning/training to generate word vectors. There are packages of various sizes and in various languages. Loading a model provides various features (the bigger the model, the more features).

I think I was hooked as soon as I saw this and realized you could ask for all the MONEY entities, or all the ORG entities, in a document and evaluate them:

An example from chapter 1 in the spaCy tutorial.
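The idea above can be sketched in a few lines. The tutorial itself uses a pretrained pipeline (such as en_core_web_sm); to keep this sketch runnable without downloading a model, I’ve used a blank English pipeline plus an EntityRuler with hand-written patterns — the example text and patterns are my own stand-ins, not the tutorial’s.

```python
# Sketch: collect all entities of one type from a document.
# Assumption: instead of a pretrained model, a blank pipeline with an
# EntityRuler supplies the entities, so this runs without a download.
import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "ORG", "pattern": "Microsoft"},
    {"label": "ORG", "pattern": "Nvidia"},
    {"label": "MONEY", "pattern": [{"TEXT": "$"}, {"LIKE_NUM": True}, {"LOWER": "billion"}]},
])

doc = nlp("Microsoft paid $16 billion in cash; Nvidia was not involved.")

# Filter doc.ents by label, just as in the tutorial exercise.
orgs = [ent.text for ent in doc.ents if ent.label_ == "ORG"]
money = [ent.text for ent in doc.ents if ent.label_ == "MONEY"]
print(orgs)   # ['Microsoft', 'Nvidia']
print(money)  # ['$16 billion']

# spaCy can also explain its label codes:
print(spacy.explain("GPE"))  # 'Countries, cities, states'
```

With a real pretrained model, no EntityRuler is needed — `doc.ents` is populated by the model’s named-entity recognizer, and the same label-filtering pattern applies.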

Then (still in chapter 1) I learned that I can easily define my own entities if the model doesn’t recognize the ones I need to find. I learned that if I don’t know what GPE is, I can enter spacy.explain("GPE") and spaCy will return 'Countries, cities, states' — sweet!

Then I learned about rule-based matching, and I thought: “Regular expressions, buh-bye!”

Chapter 1 didn’t really get deeply into lemmatization, but it offered this:

Lemmatization groups all forms of a word together so they can be analyzed as one item.

That was just chapter 1! Chapter 2 went further into creating your own named entities and using parts of speech as part of your search criteria. For example, if you want to find all instances where a particular entity (say, a city) is followed by a verb — any verb — you can do that. Or any part of speech. You can construct a complex pattern, mixing specific words, parts of speech, and selected types of entities. The pattern can include as many tokens as you want. (If you’re familiar with regex — all the regex things are available.)
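Here is a minimal sketch of that rule-based matching. Note my assumptions: a pattern can mix exact words, lowercased forms, and — with a trained pipeline — POS tags and entity types; a blank pipeline has no tagger or parser, so this illustrative pattern sticks to surface attributes like LOWER, and the phrase being matched is my own example.

```python
# Sketch: rule-based matching with spaCy's Matcher.
# Assumption: a blank pipeline, so only surface token attributes
# (TEXT, LOWER, etc.) are used; POS or ENT_TYPE attributes would
# require a trained pipeline.
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# Match "machine learning" in any capitalization, optionally
# followed by the word "model".
pattern = [
    {"LOWER": "machine"},
    {"LOWER": "learning"},
    {"LOWER": "model", "OP": "?"},  # "?" makes this token optional, regex-style
]
matcher.add("ML_PHRASE", [pattern])

doc = nlp("Machine Learning is fun; I trained a machine learning model.")
for match_id, start, end in matcher(doc):
    print(doc[start:end].text)
```

The `OP` key is where the regex-like operators live (`?`, `+`, `*`, `!`), which is what made me think regular expressions might be retired.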

You can also determine whether phrases or sentences are similar to each other (although the similarity scores are imperfect).

I’m not entirely sure how I would use these, but I’m sure they’re good for something:

  • .root — the token that decides the category of the phrase
  • .head — the syntactic “parent” that governs the phrase

There is an exercise in which I matched country names and their root head token (span.root.head), which gave me a bit of a clue as to how useful that might be in some circumstances.

Example of use of the root head token on a 700-word text.

Also in chapter 2, I learned how to use an imported JSON file to add 240 country names as GPE entities — obviously, the imported terms could be any kind of entity.
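That exercise can be sketched like this. The tutorial’s JSON file holds about 240 country names; the shortened inline list and the example sentence here are my own stand-ins.

```python
# Sketch: load terms from JSON and add them all as GPE entities.
# Assumption: the JSON is inlined here rather than read from the
# tutorial's file, and the country list is abbreviated.
import json
import spacy

countries_json = '["Czech Republic", "Slovakia", "Namibia"]'
countries = json.loads(countries_json)

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
# One pattern per country; the same approach works for any entity label.
ruler.add_patterns([{"label": "GPE", "pattern": name} for name in countries])

doc = nlp("Czechoslovakia split into the Czech Republic and Slovakia in 1993.")
print([(ent.text, ent.label_) for ent in doc.ents])
# [('Czech Republic', 'GPE'), ('Slovakia', 'GPE')]
```

Because matching is token-based, “Czechoslovakia” does not falsely match “Slovakia” — each pattern has to line up with whole tokens.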

So, I’m feeling very excited about spaCy! Halfway through the tutorial!


Image recognition in medicine: MS subtypes

Machine learning systems for image recognition aren’t always perfect — and neither are AI systems marketed for medical use, whether they use image recognition or not. But here’s an example of image recognition used in a medical context where the system appears to have succeeded at something significant — and it’s something humans can’t do, or at least can’t do well.

“Researchers used the AI tool Subtype and Stage Inference (SuStaIn) to scan the MRI brain scans of 6,322 patients with MS, letting SuStaIn train itself unsupervised. The AI identified 3 previously unknown patterns …” (Pharmacy Times). The model was then tested on MRIs from “a separate independent cohort of 3,068 patients” and successfully identified the three new MS subtypes in them.

Subtype and Stage Inference (SuStaIn) was introduced in this 2018 paper. It is an “unsupervised machine-learning technique that identifies population subgroups with common patterns of disease progression” using MRI images. The original researchers were studying dementia.

Why does it matter? Identifying which subtype of multiple sclerosis (MS) a patient has enables doctors to pursue a different treatment for each subtype, which might lead to better results for patients.

“While further clinical studies are needed, there was a clear difference, by subtype, in patients’ response to different treatments and in accumulation of disability over time. This is an important step towards predicting individual responses to therapies,” said Dr. Arman Eshaghi, the lead researcher (EurekAlert).

Sources: Artificial Intelligence Weekly newsletter, from The Wall Street Journal; Pharmacy Times; EurekAlert.



Journalists reporting about AI

In the latest JournalismAI newsletter, a list of recommendations called “Reporting on AI Effectively” shares wisdom from several journalists who are reporting about a range of artificial intelligence and machine learning topics. The advice is grouped under these headings:

  • Build a solid foundation
  • Beat the hype
  • Complicate the narrative
  • Be compassionate, but embrace critical thinking

Karen Hao, senior AI editor at MIT Technology Review — whose articles I read all the time! — points out that to really educate yourself about AI, you’re going to need to read some of the research papers in the field. She also recommends YouTube as a resource for learning about AI — and I have to agree. I’ve never used YouTube so much to learn about a topic before I began studying AI.

The post also offers good advice about questions a reporter should ask about AI research and new developments in the field.

