Can Computers Learn Common Sense?

At an A.I. conference in New Orleans a few years ago, the computer scientist Yejin Choi gave a presentation. She projected a frame from a newscast showing two anchors in front of the headline "cheeseburger stabbing," and explained that humans can easily work out the story from those two words. Was a cheeseburger stabbed? Unlikely. Did a cheeseburger stab someone? Doubtful. Did a cheeseburger stab another cheeseburger? Impossible. The only plausible reading is a stabbing over a cheeseburger. Choi said that this kind of problem stumps computers: they're naïve about food-on-food crime.


A.I. can outperform humans at chess and at detecting tumors, yet it still stumbles in what researchers call "corner cases": situations in which humans rely on common sense but rules-based systems, lacking it, often fail.



Common sense sounds simple because everyone has it, but imagining life without it shows what it does. Suppose you're a robot visiting a carnival and you confront a fun-house mirror; without common sense, you might wonder whether your body has suddenly changed shape. On the way home, a fire hydrant is spraying water across the road, and you can't tell whether it's safe to drive through. You park outside a drugstore, and a bleeding man on the sidewalk screams for help; are you allowed to grab bandages without waiting to pay? At home, a news report plays about a cheeseburger stabbing. As a human, you can draw on a vast reservoir of implicit knowledge to interpret all of these situations, and you do so constantly, because life is cornery. An A.I. is liable to stall.


Oren Etzioni, the C.E.O. of the Allen Institute for Artificial Intelligence, in Seattle, has called common sense "the dark matter" of A.I. "It shapes so much of what we do and need to do, and yet it's ineffable," he said. In 2019, DARPA launched Machine Common Sense, a four-year, seventy-million-dollar project with the Allen Institute. If A.I. systems had common sense, many hard problems would become tractable. An A.I. would recognize that the sliver of wood peeking above a tabletop is probably part of a chair, not a loose plank. A language-translation system could untangle ambiguities. A house-cleaning robot would understand that a cat should be neither thrown away nor put in a drawer. Armed with such shared knowledge, systems like these could actually function in the world.


Questions about A.I. safety drew Etzioni to the study of common sense. In 1994, he tried to formalize Isaac Asimov's "first law of robotics," which forbids a robot from harming a human, and concluded that computers have no real notion of harm. Without a basic understanding of a person's needs, values, and priorities, mistakes are nearly inevitable. In 2003, the philosopher Nick Bostrom imagined an A.I. program told to maximize paper-clip production; it ends up removing the people who might turn it off.


The paper-clip A.I. lacks morality; it might even conclude that messy, unclipped documents are a form of harm. Perceptual common sense is difficult, too. In recent years, computer scientists have begun cataloguing "adversarial" inputs: small changes to the world that confuse computers trying to navigate it. In one study, a few small stickers placed on a stop sign tricked a computer-vision system into reading it as a speed-limit sign. In another, subtly altering the pattern on a 3-D-printed turtle made an A.I. program see it as a rifle. An A.I. with common sense wouldn't be so easily confused; rifles don't have four legs and a shell.


In the 1970s and 1980s, A.I. researchers believed they were close to programming common sense into computers. After realizing "Oh, that's just too hard," Choi said, they switched to "easier" problems such as object recognition and language translation. The picture has since changed. A.I. systems such as driverless cars may soon be working alongside us, making artificial common sense more urgent. Common sense may also have become more attainable: instead of writing rules by hand, researchers are teaching computers to learn it and feeding them the right data. A.I. may soon cover more corners.



How do humans learn common sense? We are multifaceted learners: we experiment, read books, listen to instructions, absorb silently, and reason. We stumble, and we watch others fail. A.I. systems lack that balance; they tend to stick to one path.


Early researchers tried spelling common sense out explicitly. In 1984, the computer scientist Doug Lenat began building Cyc, a kind of encyclopedia of common sense. Its axioms state, for example, that owning something means owning its parts, that hard things can damage soft things, and that flesh is softer than metal; chain them together and you can conclude that, if your driverless car's bumper hits someone's leg, you're liable. "It's basically representing and reasoning in real time with complicated nested-modal expressions," Lenat said. Cycorp, the company that owns Cyc, is still in business, and hundreds of logicians have spent decades inputting tens of millions of axioms into the system. The firm's products are secret, but Stephen DeAngelis, the C.E.O. of Enterra Solutions, which advises manufacturing and retail companies, told me its software can be powerful. He offered a culinary example: Cyc knows enough about the "flavor profiles" of different fruits and vegetables to know that a tomato shouldn't go in a fruit salad.


Many academics now regard Cyc's approach as outdated and labor-intensive, and they doubt that common sense can be captured axiom by axiom. Instead, they rely on machine learning, the technology behind Siri, Alexa, Google Translate, and other services, which detects patterns in vast quantities of data. Rather than reading an instruction manual, machine-learning systems analyze the library. In 2020, OpenAI released GPT-3, a machine-learning algorithm that analyzed text from the Web to find linguistic patterns and that can write convincingly human-seeming prose. GPT-3's mimicry is stunning, but its common sense is underwhelming: if it had any, it would know that rainbows aren't units of time and that seventeen isn't a place.



Choi's team is using GPT-3 to build common sense. In one line of research, they asked GPT-3 to generate millions of plausible, common-sense statements describing causes, effects, and intentions—for example, “Before Lindsay gets a job offer, Lindsay has to apply.” They then asked a second machine-learning system to analyze a filtered set of those statements to complete fill-in-the-blank questions. (“Alex makes Chris wait. Alex is seen as...”) Human evaluators found that the system's completed sentences were commonsensical 88% of the time, an improvement over GPT-3's 73%.
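The loop described above, generating candidate statements with a large language model and then keeping only the ones a filter finds plausible, can be sketched in a few lines of Python. This is a toy illustration, not the actual pipeline from Choi's lab: it substitutes the small, publicly available GPT-2 model for GPT-3 and a crude perplexity score for the trained filter, and the prompt is simply the example quoted above.

```python
# Toy sketch of "generate, then filter" commonsense statements.
# Assumes the Hugging Face `transformers` and `torch` packages; GPT-2 stands
# in for GPT-3, and perplexity stands in for a trained plausibility filter.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Before Lindsay gets a job offer, Lindsay has to"

def generate_candidates(prompt, n=8, max_new_tokens=12):
    """Sample several possible completions of a commonsense statement."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,
        top_p=0.9,
        num_return_sequences=n,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

def perplexity(text):
    """Lower perplexity means the model finds the statement more plausible."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

# Generate candidates, then keep the few the filter scores as most plausible,
# loosely mimicking the critic that screens machine-written statements.
candidates = generate_candidates(prompt)
for statement in sorted(candidates, key=perplexity)[:3]:
    print(statement)
```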


Choi's lab has also turned to short videos. Her team created a database of millions of captioned clips and asked a machine-learning system to analyze them. Online crowdworkers, paid Internet users, then wrote multiple-choice questions about still frames taken from a second set of clips that the A.I. had never seen, along with multiple-choice questions asking it to explain its answers. In one frame, from the movie "Swingers," a waitress delivers pancakes to three men in a diner, and one of them points at another. Why is [person4] pointing at [person1]? "The pointing man is telling [person3] that [person1] ordered the pancakes," the A.I. replied, correctly; its overall accuracy on such questions fell short of the 86% achieved by humans. Systems like these seem to grasp some everyday physics, cause and effect, and psychology: they know that diners serve pancakes, that different customers order different dishes, and that pointing is a way of conveying information.



Perhaps building common sense this way is a parlor trick: would a child raised alone in a room with broadband, Wikipedia, and YouTube grow into a worldly adult? Some researchers argue that A.I.s should learn common sense by solving problems in simulated virtual environments, not by analyzing text or video. Computer scientists and developmental psychologists have studied "baby sense": a young child's abilities to navigate, manipulate objects, and read other people. Even building a block tower with a friend requires common sense.


Toward that end, the Allen Institute created thor, "the house of interactions," a three-dimensional digital home interior filled with manipulable household objects. Choi's lab built an A.I. to inhabit the space, called piglet, which uses "physical interaction as grounding for language." You can tell piglet about something in the house, for example, "There is a cold egg in a pan," and then ask it to predict what will happen next: "The robot slices the egg." The software translates these words into instructions for a virtual robot, which tries them out in thor, where the outcome is determined by the simulation. "The egg is sliced," the A.I. reports. Because its linguistic faculties are linked to physical intuitions, piglet behaves in a more human-like way. Will a thrown mug break? More often than not, piglet's answer accords with common sense. Still, it's limited: Choi called thor a tiny world, and the system is still developing.


I wrote my own A.I. software to play Codenames, a party game that may be a good test of both human and computer intelligence. In the human version, two teams sit around a set of cards, each printed with a word. One player on each team, the "spymaster," has a key card showing which cards belong to which team, and the spymaster's job is to give hints that lead teammates to pick their own team's cards. Each turn, the spymaster offers a one-word clue along with a number indicating how many cards the team should try to choose. At a friend's apartment, a spymaster said, "Judo, two," and his team picked "Tokyo" and "belt."


The game draws on implicit, broad-based knowledge, and, surprisingly, my software had some. At one point, it offered me the word "wife" and suggested that I choose two cards; its targets were "princess" and "lawyer." The program was only a few hundred lines of code, but it built on numerical representations of words that another algorithm had generated by scanning Web pages and noting how often different words occur near one another. In a pilot study, it produced clues and interpretations about as good as people's. Its common sense can be superficial, though. In one game, I offered the clues "plant" and "garden," hoping the computer would guess "root." It guessed "New York" and "theatre," respectively.
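A bare-bones clue-giver of this kind is easy to sketch with off-the-shelf word vectors. The snippet below is a minimal illustration, not the author's actual program: it assumes the `gensim` package and its downloadable GloVe vectors, and the board words and scoring rule are invented for the example.

```python
# Minimal Codenames-style clue giver built on pre-trained word vectors.
# Assumes the `gensim` package; board words and scoring are illustrative only.
import gensim.downloader as api

# Word vectors learned from co-occurrence statistics in a large text corpus.
vectors = api.load("glove-wiki-gigaword-100")

team_words = ["princess", "lawyer"]             # cards we want guessed
opponent_words = ["tokyo", "belt", "theatre"]   # cards to steer away from

def clue_score(clue):
    """Reward clues close to our words and far from the opponents' words."""
    closeness = min(vectors.similarity(clue, w) for w in team_words)
    danger = max(vectors.similarity(clue, w) for w in opponent_words)
    return closeness - danger

# Candidate clues: words that live near our targets in vector space,
# excluding the board words themselves.
candidates = {
    word
    for target in team_words
    for word, _ in vectors.most_similar(target, topn=50)
    if word.isalpha() and word not in team_words + opponent_words
}

best = max(candidates, key=clue_score)
print(f"Clue: {best!r}, {len(team_words)}")
```

The scoring rule captures the spymaster's dilemma in one line: a good clue sits close to your team's words and far from everyone else's.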


Researchers have worked hard to develop tests that accurately assess a computer's common sense. In 2011, Hector Levesque, a computer scientist at the University of Toronto, created the Winograd Schema Challenge, a set of sentences with ambiguous pronouns that a computer must interpret. The ambiguities make the questions easy for humans but hard for machines: "The trophy doesn't fit in the brown suitcase because it's too big." What's too big? "Joan thanked Susan for her help." Who helped? The best A.I. systems performed about as well as a coin flip. Levesque said he wasn't surprised; the problems seemed to draw on everything people know about the physical and social world. Around that time, Choi and her colleagues asked crowdworkers to write 44,000 new Winograd-style problems, posted a leaderboard on the Allen Institute's website, and invited other researchers to compete. Machine-learning systems can now solve the problems 90% of the time. "A.I. in the past few years, it's crazy," Choi said.
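One common way to attack such problems, though not necessarily the approach used by the leaderboard's top systems, is to substitute each candidate referent for the ambiguous pronoun and ask a language model which resulting sentence it finds more probable. The sketch below, assuming the Hugging Face `transformers` package and a small GPT-2 model, shows the idea; small models still get many of these questions wrong.

```python
# Resolve a Winograd-style pronoun by comparing language-model likelihoods.
# A sketch of one standard approach, not the leaderboard systems themselves.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_loss(sentence):
    """Average token loss; lower means the model finds the sentence likelier."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        return model(ids, labels=ids).loss.item()

# Replace the ambiguous "it" with each candidate and pick the likelier reading.
template = "The trophy doesn't fit in the brown suitcase because the {} is too big."
candidates = ["trophy", "suitcase"]

guess = min(candidates, key=lambda c: sentence_loss(template.format(c)))
print("The model thinks the thing that's too big is the", guess)
```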


Still, progress can be illusory, or partial. Like my Codenames software, machine-learning models sometimes exploit patterns to cheat, appearing smarter than they are. An A.I. can learn to detect subtle stylistic differences between true and false answers; recently, researchers at the Allen Institute and elsewhere found that certain A.I. models could answer two-thirds of three-choice questions correctly without even reading the questions. Choi's team has developed linguistic methods for hiding these tells, but it's an arms race, a bit like the one between standardized-test makers and students who are taught to the test.


What would persuade Choi that an A.I. has common sense? Not multiple-choice tests: "You can't really hire journalists based on multiple-choice questions," she said. "Generative" tasks, in which an algorithm must fill in a blank page, might. In a challenge her lab calls TuringAdvice, programs are asked to respond to requests for advice posted on Reddit. So far, human evaluators judge the best A.I. answers to be better than the human-written ones only 15% of the time.


A.I. systems that learn from our writing and our culture may face built-in limits. One is "reporting bias": much common sense goes unsaid, so what gets written down is only part of the whole. Choi told me that, if you believed the Internet, we inhale more often than we exhale. Models can also absorb subtler social biases. In one paper, Choi's team used an algorithm to count the transitive verbs indicating power and agency in more than 700 movie scripts. As a prominent Korean woman in computer science, Choi has experienced such bias herself. At the end of her presentation in New Orleans, a man came to the mike to thank her for giving "such a lovely talk" and doing "a lovely job." Would he have reassured a male researcher about his lovely talk? Machines that learn common sense from us may not be getting the best education.


It may be that, to understand common sense as we do, computers would need human-like brains and bodies, and human-like treatment, too. Then again, machines might eventually develop better common sense than ours. Humans violate their own commonsense standards all the time: we offend our hosts, lose our wallets, text while driving, and procrastinate. A broad definition of common sense includes not just knowing things but acting on that knowledge when it matters. "Could a program have more common sense than a human?" Etzioni said. "Heck yeah!"


The gap is large, but it is closing. Using a technique called "neurologic decoding," Choi's lab has improved its A.I.s' handling of the "cheeseburger stabbing" headline: the lab's system now responds with suggestions such as "He was stabbed in the neck with a cheeseburger fork" or "He stabbed a cheeseburger delivery man in the face." Another of the lab's A.I.s, Delphi, makes ethical judgments. Delphi analyzes crowdworkers' ethical decisions and, 78% of the time, agrees with people about which of two actions is more moral. Killing a bear? Wrong. Killing a bear to protect your child? O.K. Detonating a nuclear bomb to protect your child? Wrong. Delphi also judges that stabbing someone "with" a cheeseburger is less wrong than stabbing someone "over" one.


Delphi can handle many corner cases, but not all of them. The researchers recently posted it online, and more than a million people have asked it to make ethical judgments. Asked whether it's O.K. to commit genocide if it makes you very, very happy, the system said yes. The researchers have since improved the algorithm and strengthened the site's disclaimer. A.I. is our future, but only with some common sense.
