Watson, IBM’s supercomputer that bested the best of the best “Jeopardy!” champions, is going to college: Rensselaer Polytechnic Institute (RPI), to be precise. A version of the system similar to the one used on “Jeopardy!” will be housed at RPI for three years as part of a Shared University Research Award from IBM Research. The system at RPI will have 15 terabytes of hard disk storage and give 20 users access to the system simultaneously, making it, according to a release, “an innovation hub” for the campus.
IBM is making the system available with a two-part goal. The first part is finding ways to parse the large volume of unstructured data: information in formats, such as photographs, that cannot be indexed using existing means. “We’re going to need to, and the industry, the world, is going to need to figure out how to put that unstructured data to work,” said IBM spokesman Michael Rowinski during an interview Thursday. Ninety percent of the world’s data was created in the last two years and 80 percent of that is unstructured, a gold mine for those seeking to make breakthroughs in “big data” research. The second part is to train a new group of individuals to use cognitive systems.
Watson and RPI have a history, with many of the Watson team members having graduated from RPI. Jim Hendler, best known as one of the creators of the semantic web, is an artificial intelligence researcher who leads the computer science department at RPI. He responded to a few questions Thursday about the plans for Watson, including the system’s “curriculum,” and the potential implications for the future of unstructured data.
Q: What will be the first steps in introducing Watson to the RPI team?
A: Programming Watson requires understanding its particular flow of control in Question-Answering. For those people on campus who have not already been involved in the project, we will have several faculty, staff and students take a 2-day training course led by the IBM team, and then those people, in turn, will be able to teach others as well as jump-start our work.
What “classes” will Watson be taking? Additionally, will this be, perhaps, the opportunity to create a “curriculum,” if you will, for other systems when it comes to processing the large volume of unstructured data out there?
We will be looking at a number of different projects that explore what Watson can do. One thing we want to explore is how Watson can interact with social media, especially things such as “tweets” where the language is not as carefully constructed as it is in the documents Watson has used in the “Jeopardy!” game. Another thing we will be exploring is adding various kinds of numerical reasoning to Watson. There’s lots more.
So to do all this, we’re taking a two-pronged attack. One approach, utilizing our graduate students, will be exploring how to add new capabilities to Watson and how to use its current capabilities in many of our ongoing research projects. For example, I run a group that does a lot of work with Open Government Data systems (like the US data.gov) and we’re excited about the possibility of using Watson to help researchers around the world find relevant government data and documents for their work.
The second prong is to exploit the creativity of the incredible undergraduates we have here at RPI. We will be setting up a “Watson Lab” for undergraduates and getting a team of them going on doing things with it. We hope a number of small groups of students will come up with great things to do, and that will let us have many different areas being explored at the same time. We don’t know yet what will come out of this, but given some of the ideas they’ve been starting to suggest, it’s going to be great.
At the end of the three-year project, what is the ultimate goal for Watson?
Imagine having been the first university to get a telescope a few centuries back. Everywhere you pointed it was something new and exciting, and it would be impossible to predict everything you would see. Having Watson is like that for us: our goal for the next few years is to gain an understanding of what having new ways of bringing unstructured data and documents into our computational lives will be like.
Will other students outside of RPI have an opportunity to work on the system?
Right now we are working with IBM on getting everything set up for our students to get going, and of course they will keep the machine very busy for the foreseeable future. Until we have a better feel for our own needs and utilization, it is too early to answer that question.
Is there another system, aside from Watson, that you hope to be able to work with in terms of moving forward in unstructured data research?
Right now our concentration is on Watson; it is far enough ahead of its competition that it should keep us happy for now. But we also have researchers exploring our own breakthrough technologies, and we’re upgrading our supercomputer infrastructure to allow even more, so I’d say that right now we’re a really happening place, and we’re looking forward to growing our efforts in many different ways.