First, why that's not so astonishing: Computers are really good at math. In fact, you'd expect any decent computer to be able to breeze through all of the questions on the math section of the SAT.
Instead, the AI system called GeoS probably couldn't get accepted to the university that designed it, where typical math scores run from 580 to 700.
But here's the astonishing part: GeoS wasn't fed these problems in a language it could inherently understand. It had to read them straight off the paper, confusing diagrams and all, and interpret them the same way a human student would. After "looking" at the problem, GeoS uses the diagram and text to come up with a set of formulas that it "thinks" are most likely to correspond to the problem. Then, since those formulas are in its native tongue, the system is able to solve the questions handily. Once it has an answer, it goes back to the multiple-choice options and looks for one that matches its result.
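For readers who like to see the idea in code, here is a minimal Python sketch of that last stretch of the process: pick the most likely interpretation, solve it, then look for a matching multiple-choice option. It is an illustration only, not GeoS's actual code. The sample problem (a right triangle with legs 3 and 4), the likelihood scores, the answer options and the `pick_choice` function are all made up for this example.

```python
import math

# Hypothetical candidate readings of one problem: "a right triangle has
# legs 3 and 4; how long is the hypotenuse?" Each entry pairs a made-up
# likelihood score with a function that solves that reading's formulas.
candidates = [
    (0.15, lambda: 3 + 4),             # misreading: sum of the legs
    (0.80, lambda: math.hypot(3, 4)),  # intended: c^2 = a^2 + b^2
    (0.05, lambda: 0.5 * 3 * 4),       # misreading: area of the triangle
]

# Made-up multiple-choice options.
choices = {"A": 5.0, "B": 6.0, "C": 7.0, "D": 12.0}


def pick_choice(candidates, choices, tol=1e-6):
    """Solve the highest-scoring reading and return the matching option.

    Returns None when no option matches, i.e. the question goes unanswered.
    """
    _, solve = max(candidates, key=lambda c: c[0])
    value = solve()
    for label, option in choices.items():
        if math.isclose(value, option, abs_tol=tol):
            return label
    return None


answer = pick_choice(candidates, choices)
print(answer)
```

In this toy case the highest-scoring reading is the Pythagorean one, so the computed value lines up with option A. The hard part the researchers describe, turning text and a diagram into those candidate formulas in the first place, is exactly what the sketch skips.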
"Our biggest challenge was converting the question to a computer-understandable language," Ali Farhadi, an assistant professor of computer science and engineering at the University of Washington and research manager at AI2, said in a statement. "One needs to go beyond standard pattern-matching approaches for problems like solving geometry questions that require in-depth understanding of text, diagram and reasoning."
In the reported tests, GeoS wasn't always able to answer questions -- in fact, the system failed to come up with a solution about half the time. But when it was confident enough to answer, the system had a 96 percent accuracy rate.
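Those two numbers measure different things: how often the system answers at all, and how often it is right when it does. The short Python sketch below shows the arithmetic on invented data. The eight confidence scores, the 0.9 cutoff and the `coverage_and_accuracy` function are assumptions for illustration, not figures or methods from the GeoS tests.

```python
# Made-up (confidence, was_correct) pairs for eight hypothetical questions.
results = [
    (0.97, True), (0.40, False), (0.93, True), (0.55, True),
    (0.91, False), (0.30, False), (0.99, True), (0.62, False),
]

THRESHOLD = 0.9  # assumed cutoff: below this, the system declines to answer


def coverage_and_accuracy(results, threshold):
    """Return (share of questions answered, accuracy on answered ones)."""
    answered = [correct for conf, correct in results if conf >= threshold]
    coverage = len(answered) / len(results)
    # Accuracy is undefined when nothing was answered; report None then.
    accuracy = sum(answered) / len(answered) if answered else None
    return coverage, accuracy


coverage, accuracy = coverage_and_accuracy(results, THRESHOLD)
print(f"answered {coverage:.0%} of questions, {accuracy:.0%} of those correctly")
```

Raising the cutoff generally trades answered questions for accuracy, which is how a system can skip many problems yet rarely get one wrong.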
Farhadi and his colleagues believe that this demonstration shows a step toward true AI -- a computer brain that can work just like a human's -- even if it hasn't blown us out of the water yet.
But engineers generally disagree with one another when it comes to proving computer intelligence. Last year another group claimed to have passed the Turing Test -- in which a computer successfully passes itself off as human to a human judge -- and was met with much skepticism. A recent piece in New Scientist suggests that a single test for AI may be an antiquated notion. Instead, many researchers now believe, we should be working on copying and evaluating individual aspects of intelligence, and trying to put them together into one brainy computer.
GeoS seems to be making strides in some very important aspects of intelligence, but we'll have to wait and see just what the system is capable of. And it's far from ready to take its place as a robot overlord.