In August, researchers from the Allen Institute for Artificial Intelligence, a lab based in Seattle, unveiled an English test for computers. It examined whether machines could complete sentences like this one:
On stage, a woman takes a seat at the piano. She
a) sits on a bench as her sister plays with the doll.
b) smiles with someone as the music plays.
c) is in the crowd, watching the dancers.
d) nervously sets her fingers on the keys.
For you, that would be an easy question. But for a computer, it was pretty hard. While humans answered more than 88 percent of the test questions correctly, the lab’s A.I. systems hovered around 60 percent. Among experts — those who know just how difficult it is to build systems that understand natural language — that was an impressive number.
Then, two months later, a team of Google researchers unveiled a system called Bert. Its improved technology answered those questions just as well as humans did, and it was not even designed to take the test.
Bert’s arrival punctuated a significant development in artificial intelligence. Over the last several months, researchers have shown that computer systems can learn the vagaries of language in general ways and then apply what they have learned to a variety of specific tasks.
Built in quick succession by several independent research organizations, including Google and the Allen Institute, these systems could improve technologies as diverse as digital assistants like Alexa and Google Home and the software that automatically analyzes documents inside law firms, hospitals, banks and other businesses.
“Each time we build new ways of doing something close to human level, it allows us to automate or augment human labor,” said Jeremy Howard, the founder of Fast.ai, an independent lab based in San Francisco that is among those at the forefront of this research. “This can make life easier for a lawyer or a paralegal. But it can also help with medicine.”
It may even lead to technology that can, finally, carry on a decent conversation.
But there is a downside: On social media services like Twitter, this new research could also lead to more convincing bots designed to fool us into thinking they are human, Mr. Howard said.
Researchers have already shown that rapidly improving A.I. techniques can facilitate the creation of fake images that look real. As these kinds of technologies move into the language field as well, Mr. Howard said, we may need to be more skeptical than ever about what we encounter online.
These new language systems learn by analyzing millions of sentences written by humans. A system built by OpenAI, a lab based in San Francisco, analyzed thousands of self-published books, including romance novels, science fiction and more. Google’s Bert analyzed these same books plus the length and breadth of Wikipedia.
Each system learned a particular skill by analyzing all that text. OpenAI’s technology learned to guess the next word in a sentence. Bert learned to guess missing words anywhere in a sentence. But in mastering these specific tasks, they also learned about how language is pieced together.
If Bert can guess the missing words in millions of sentences (such as “the man walked into a store and bought a ____ of milk”), it can also understand many of the fundamental relationships between words in the English language, said Jacob Devlin, the Google researcher who oversaw the creation of Bert. (Bert is short for Bidirectional Encoder Representations from Transformers.)
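The difference between the two guessing games can be sketched in a few lines of Python. This is only an illustration of what each kind of system is allowed to look at when it guesses, not the code behind Bert or OpenAI’s system; the function names are hypothetical, and the sentence is split on spaces for simplicity.

```python
BLANK = "____"  # the missing word in the example sentence above


def left_context(tokens, position):
    """Next-word guessing: only the words before the blank are visible."""
    return tokens[:position]


def both_contexts(tokens, position):
    """Missing-word guessing: the words on both sides of the blank are visible."""
    return tokens[:position], tokens[position + 1:]


sentence = "the man walked into a store and bought a ____ of milk"
tokens = sentence.split()
position = tokens.index(BLANK)

# What a next-word guesser sees before it makes its guess.
next_word_view = left_context(tokens, position)

# What a missing-word guesser sees: the same words, plus "of milk".
left_view, right_view = both_contexts(tokens, position)

print("next-word guesser sees:   ", " ".join(next_word_view))
print("missing-word guesser sees:", " ".join(left_view), BLANK, " ".join(right_view))
```

In this sketch the missing-word guesser gets the extra clue “of milk,” which is the “bidirectional” part of Bert’s name: it reads in both directions around the blank.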
The system can apply this knowledge to other tasks. If researchers provide Bert with a bunch of questions and their answers, it learns to answer other questions on its own. Then, if they feed it news headlines that describe the same event, it learns to recognize when two sentences are similar. Usually, machines can recognize only an exact match.
Bert can handle the “common sense” test from the Allen Institute. It can also handle a reading comprehension test where it answers questions about encyclopedia articles. What is oxygen? What is precipitation? In another test, it can judge the sentiment of a movie review. Is the review positive or negative?
This kind of technology is “a step toward a lot of still-faraway goals in A.I., like technologies that can summarize and synthesize big, messy collections of information to help people make important decisions,” said Sam Bowman, a professor at New York University who specializes in natural language research.
In the weeks after the release of OpenAI’s system, outside researchers applied it to conversation. An independent group of researchers used OpenAI’s technology to create a system that leads a competition, organized by several top labs including the Facebook AI Lab, to build the best chatbot. And this month, Google “open sourced” its Bert system, so others can apply it to additional tasks. Mr. Devlin and his colleagues have already trained it in 102 languages.
Sebastian Ruder, a researcher based in Ireland who collaborates with Fast.ai, sees the arrival of systems like Bert as a “wake-up call” for him and other A.I. researchers because they had assumed language technology had hit a ceiling. “There is so much untapped potential,” he said.
The complex mathematical systems behind this technology are called neural networks. In recent years, this type of machine learning has accelerated progress in subjects as varied as face recognition technology and driverless cars. Researchers call this “deep learning.”
Bert succeeded in part because it leaned on enormous amounts of computer processing power that was not available to neural networks in years past. It analyzed all those Wikipedia articles over the course of several days using dozens of computer processors built by Google specifically for training neural networks.
The ideas that drive Bert have been around for years, but they started to work because modern hardware could juggle much larger amounts of data, Mr. Devlin said.
Like Google, dozens of other companies are now building chips specifically for this kind of machine learning, and many believe the influx of this extra processing power will continue to accelerate the progress of a wide range of A.I. technologies, including, most notably, natural language applications.
“Bert is a first thrust in that direction,” said Jeff Dean, who oversees Google’s artificial intelligence work. “But actually not all that big in terms of where we want to go.” Mr. Dean believes that ever larger amounts of processing power will lead to machines that can better juggle natural language.
But there is reason for skepticism that this technology can keep improving quickly because researchers tend to focus on the tasks they can make progress on and avoid the ones they can’t, said Gary Marcus, a New York University psychology professor who has long questioned the effectiveness of neural networks. “These systems are still a really long way from truly understanding running prose,” he said.
Oren Etzioni, chief executive of the Allen Institute for Artificial Intelligence, another prominent voice who has pushed for research that extends beyond neural networks, made the same point.
Though Bert passed the lab’s common-sense test, he said, machines are still a long way from an artificial version of a human’s common sense. But like other researchers in this field, he believes the trajectory of natural language research has changed. This is a moment of “explosive progress,” he said.