How smart is ChatGPT? A test of Melanie Mitchell’s story.
Melanie Mitchell, in her popular book on AI, “Artificial Intelligence: A Guide for Thinking Humans,” tells the following restaurant story in Chapter 11, “Words, and the Company They Keep”.
“A man went into a restaurant and ordered a hamburger, cooked rare. When it arrived, it was burned to a crisp. The waitress stopped by the man's table. "Is the burger okay?" she asked. "Oh, it's just great," the man said, pushing back his chair and storming out of the restaurant without paying. The waitress yelled after him, "Hey, what about the bill?" She shrugged her shoulders, muttering under her breath, "Why is he so bent out of shape?"”
Melanie follows this story with a few questions and statements that, in her opinion, help demonstrate that “…machines lack the detailed, interrelated concepts and commonsense knowledge that even a four-year-old child brings to understanding language.”
As she writes, “It’s mind-boggling to think of all the background knowledge the program would need in order to give confident answers to basic questions about the story.”
Several times in the book, Melanie mentions the propensity of some to declare a task impossible for machines, only to be proven wrong shortly thereafter. Given that, I thought it would be interesting to see whether what she herself considered “mind-boggling” has now also come to pass. The edition of her book that I am referring to is from 2020, so it predates the ChatGPT phenomenon by a few years, but that is still, I think, within a period that could safely pass for “shortly thereafter.”
Let’s see what ChatGPT makes of this story, answering and responding to Melanie’s points in the book.
Melanie: Did the man eat the hamburger? To answer this confidently, a hypothetical program would need to know that hamburgers belong to the category “food,” and that food could be eaten.
ChatGPT: Based on the story, the man did not eat the burger.
Melanie: Why did the waitress say, “What about the bill?”
ChatGPT: The waitress said, "What about the bill?" because the man was leaving without paying for his order.
Melanie: Did the man leave the waitress a tip?
ChatGPT: No, the man did not leave a tip.
Melanie: Did the waitress know that the man was angry?
ChatGPT: Yes, the waitress likely knew the man was angry because he stormed out of the restaurant after expressing sarcasm about the burger's quality.
Melanie: Did the waitress know why the man left the restaurant?
ChatGPT: Yes, the waitress likely knew why the man left the restaurant, given his sarcastic response to her question about the burger's quality and his subsequent action of storming out.
Melanie: A program would need to know that when a person orders a hamburger “cooked rare,” the person generally doesn’t want to eat it if it has been “burned to a crisp.”
ChatGPT: When a person orders a hamburger "cooked rare," it indicates a preference for the burger to be lightly cooked, with the inside still pink and warm. A burger that is "burned to a crisp" is significantly overcooked according to this preference, and it would generally be unsatisfactory to someone who requested their burger rare.
Melanie: The program should recognize that when the man says, “Oh, it’s just great,” he is being sarcastic and that “it” refers to the burger.
ChatGPT: When the man says, "Oh, it's just great," given the context that the burger was burned contrary to his order for it rare, he is using sarcasm. "It" refers to the burger, indicating his dissatisfaction with how it was prepared.
Melanie: The program would need to surmise that if you “storm” out of a restaurant without paying, it’s likely that you haven’t eaten your meal.
ChatGPT: If someone "storms" out of a restaurant without paying, it's likely that they haven't eaten their meal, especially in a context indicating dissatisfaction.
She closes with a question: “Can deep learning, along with big data, produce machines that can flexibly and reliably deal with human language?” To which ChatGPT answers, “Yes, deep learning combined with big data has significantly advanced the capability of machines to deal with human language flexibly and reliably.”
I guess we can safely conclude that what was mind-boggling in 2020 has now come to pass?