
Hey Watson! My dog is smarter than your phone.

By Martin Horejsi

Posted on 2011-02-15

It’s amazing how much faith we put in a computer when we risk national, make that global, scrutiny as it autonomously performs tasks that carry immense scientific and philosophical weight.

Let’s listen in for a moment…
[Watson] I’ll take Valentine’s Day Computers for $1000.
[Alex] The computer in question sports a 32-bit RAD6000 central processing unit embedded in a Command and Data Handling (C&DH) subsystem. Electronic cards are provided to interface instruments and subsystems to the C&DH subsystem. A whopping 128 megabytes of data storage is available on the processor card, although approximately 20% of this is used to run its own internal programs.
[Watson] What is the Stardust NeXT spacecraft?
[Alex] That’s right!
While Watson was dominating the popular news and commentary, another much smaller but equally important computer was clicking pictures of a comet. And not just any old comet. And not just any old spacecraft. It was Stardust NeXT snapping away at Comet Tempel 1 (the same comet the Deep Impact mission punched a hole into back in 2005).
Since the scientific results will take a bit to resolve (double puns intended), let’s get back to discussing Watson, the Jeopardy!-playing supercomputer.
The results are in from last night’s game: a tie for first between Watson and a human (it doesn’t matter which one, since they are all the same to Watson).
I’ll take the tie as a welcome break from having to reconcile the meaning of either of the other two possible outcomes: computer wins, computer loses.
So in our moment of respite, let’s consider some things, starting with the computer’s performance on this prime-time standardized test. I’ll skip the usual commentary found in many other blogs and cut right to my chase.

For me, the important question here is not what was missed, nor how it was missed, nor even why it was missed, but instead… was it missed?
A while ago I was bragging to a math-teaching colleague about my third-grade son. I said he is never wrong; if it appears he is wrong, it is because you don’t understand how his answer works within your question. Later, when both my son and I ran into this same teacher, she said she wanted to test whether he was ever wrong. She asked my son, “What is the square root of 256?” My son said he didn’t know. The teacher persisted, “Make a guess.” My son said a number that was not 16. “Ha!” proclaimed the teacher, “He was wrong.”
“No, he was right,” I calmly pointed out. “He said he didn’t know the answer. You merely made him prove it.”
What if we give Watson the benefit of the doubt? Even in some of his well-known “mistakes,” under certain conditions, interpretations, or perspectives, Watson’s answer is not completely wrong. Maybe not even partially wrong.
When I first considered Watson’s mistakes in trial Jeopardy! runs, I put them into one of three categories: 1) near misses, 2) big misses, and 3) Ouch!
It is those responses in the third category that I believe give people overconfidence in their standing in this matchup. Yes, Watson blew it. But given that Watson is a computer, his second answer might have been correct, and a minor tweak in a deeply buried algorithm would fix the problem. Now what? Still feel confident? Unlike humans, Watson will never again make that same mistake.
[youtube]http://www.youtube.com/watch?v=eAniudidQM4[/youtube]
In this excerpt from the show, Watson’s answers can be studied by pausing the video. When doing so, several things surface. First, the “correct” answer appears within his set of three nearly every time, and often at the top of the list. Where that answer happens to fall below the confidence threshold that triggers the plunger push, I’d argue that Watson did not miss the question but simply followed his programmed response rule. However, I did play around with Watson’s answers at minute 3:00 and cannot make them fit. I guess I’m just not smart enough.
But on the other hand, how could Watson have no more than 17% confidence in his answer at minute 3:20?
The Jeopardy! clue (answer) for the category NAME THE DECADE:

THE FIRST FLIGHT TAKES PLACE AT KITTY HAWK & BASEBALL’S FIRST WORLD SERIES IS PLAYED.

The desired response was “What are the 1900s?” Both events in the clue took place in 1903 (even though there is wiggle room within the term World Series, and there were other Kitty Hawk flights prior, just not controlled, powered flight). Either way, Watson’s top answer was correct; it just carried a mere 17% confidence for some reason.
While it would get old, I think a fun variation of the game would be to let Watson guess every time regardless of his confidence. I don’t have a copy of last night’s show to do the calculating myself, but assuming that Watson could always make a guess faster than the other two contestants, counting the number of times Watson’s top answer was correct might paint a very different picture.
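To make that thought experiment concrete, here is a minimal sketch, in Python, of how the two scoring schemes would differ. The clue data and the buzz threshold are made up purely for illustration; nothing here reflects Watson’s actual internals.

```python
# Hypothetical sketch comparing two scoring schemes: Watson's thresholded
# buzz rule versus an "always answer with the top guess" variant.
# The clue data and BUZZ_THRESHOLD are invented for illustration only;
# IBM has not published Watson's actual decision rule.

clues = [
    # (confidence of Watson's top answer, was that top answer correct?)
    (0.95, True),
    (0.17, True),   # e.g. the NAME THE DECADE clue discussed above
    (0.60, False),
    (0.42, True),
]

BUZZ_THRESHOLD = 0.50  # assumed cutoff for pressing the plunger

# Scheme 1: only clues where Watson buzzed in and was right count for him.
buzzed_and_right = sum(
    1 for confidence, correct in clues
    if confidence >= BUZZ_THRESHOLD and correct
)

# Scheme 2: every clue counts, scored by whether the top answer was correct.
top_answer_right = sum(1 for _, correct in clues if correct)

print(f"Thresholded play:  {buzzed_and_right}/{len(clues)} correct responses given")
print(f"Always-guess play: {top_answer_right}/{len(clues)} top answers correct")
```

On this toy data, the always-guess tally comes out ahead, which is exactly the “very different picture” I suspect would emerge from the real broadcast.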
Then there is the question of how Watson would match up against an “average” Jeopardy! player instead of the uberplayers, especially since at least one other blogger, this one at Psychology Today, posits that, “Many people can’t even understand what a Jeopardy clue is asking, much less know the proper response to a clue.”
Where am I going with this? Why, to teaching in general and online learning in particular, of course.
In an interview, Stephen Baker, author of the book Final Jeopardy: Man vs. Machine and the Quest to Know Everything, stated the following:

People are scared of Watson. I think they think that computers like Watson are going to invade their privacy, learn their secrets, maybe start making decisions for them. And I think they also worry that computers are going to take away their jobs.
And as this goes forward, both of these fears are justified.

So let’s take a closer look at the job security of the human teacher by asking some seemingly simple questions: What is a teacher? How is an online teacher different from a face-to-face teacher? If the essence of a teacher could be encapsulated, what would it look/sound/act like?
Before going down the no doubt long and bumpy road to answering the above questions, let’s digress for a moment to reverse engineer and then reengineer Watson.
What if Mr. Watson, the online teacher, could instantly evaluate a student’s work across literally millions of parameters and conclude with a “perfect” education plan for a specific student at a specific moment in time?
Applying the confidence-threshold idea to learning, what if students provided several answers to a question, Mr. Watson “considered” those answers and their relationship to each other, and then gave a surgically precise piece of information or encouragement that allowed the student to make the connection themselves, in what would become a glorious school day filled with chain-reaction-linked a-ha moments?
Personally, once I got over the creepiness of having a computer constantly “analyze” me, I think I would fall in love with such a machine, because it would appear to really care about me, giving me exactly what I need, as I need it, and how I need it. It would have infinite patience, understanding, and availability.
What if students could include a confidence level with their test answers, or even provide multiple answers with or without explanations? As teachers, we know we can often learn more about a student’s understanding of a subject from their wrong answers than from their right answers. Pushing this tangent a moment further, I cannot help but wonder about Watson’s responses to clues that have no clear answer. For example, in the NAME THE DECADE category, what if the moon landing were paired with the Battle of Hastings? Or the demise of the dinosaurs combined with Nixon’s resignation?
Or what if the clue required Watson to recognize that it was Watson itself being referenced in the clue? Would that mean Watson was self-aware?
Okay, now let’s address the question of what a teacher is. Or more specifically, what is the difference between a human teacher and a computer-as-teacher? In fact, let’s jump ahead to the next step, since that’s both where the issue will likely first surface and where I want to start: In what ways is an online teacher not a machine?
According to Wikipedia (which I think of as a computer made up of people parts), the Turing Test is “a test of a machine‘s ability to demonstrate intelligence. A human judge engages in a natural language conversation with one human and one machine, each of which tries to appear human. All participants are separated from one another. If the judge cannot reliably tell the machine from the human, the machine is said to have passed the test. In order to test the machine’s intelligence rather than its ability to render words into audio, the conversation is limited to a text-only channel such as a computer keyboard and screen.”
What if online students could not tell the difference between a human teacher and a computer?
Frankly, I think Watson narrowly failed the Turing Test last night. Not by much, but by enough. There are degrees of wrong, and sometimes Watson crossed from “just wrong” to “irrationally wrong,” meaning that being wrong was not the issue, but how wrong he was. While nothing seemed to cross into inappropriateness, there were a few red flags signaling that the player (Watson) was maybe not quite “normal” in its given role. Subtle, but still there.
But a pretty good argument could be made that Ken and Brad are not quite normal either, and that the “normal” bar was set exceptionally high for Watson.
Still waiting for THE answer, are you? Well, here it is:

Somewhere in here lies the magical element that elevates a teacher above a machine.

Or maybe it is the question you are waiting for.

What is Judgment?
