The Transmutation of bits and bytes

In medieval times, alchemists hoped to convert base metals into the noble metal gold through the use of a Philosopher’s Stone.

Today, in the field of information science, we talk about Information Alchemy: converting data into information and then into knowledge. Some people even add a fourth stage, converting knowledge into wisdom[i], but that is a topic for another blog post. Data is defined as raw characters or numbers, whereas information is that data processed into relationships that give it meaning. Dr. Eisenberg at the University of Washington describes knowledge as the “collected, combined, organized, processed information for a purpose.” Over time, it is thought, accumulated and refined knowledge leads to wisdom.

This year, the total of all digital data created is forecast to reach close to 4 zettabytes, or 4 × 10^21 bytes, according to IDC[ii]. This is nearly four times the 2010 volume, and it is growing rapidly. All of this data should let us build a smarter and better planet. Today, however, we are drowning in it: as individuals we don’t have the time to process all this information, and we don’t have computer systems that can turn this data into insight.

But soon that will change. We are entering a new era in computing that IBM is calling Cognitive Computing. The first of these systems is the IBM Watson system, which debuted on the Jeopardy! show two years ago. Traditional computing systems have done a great job of handling data, including storing it and manipulating it into information. So now we have lots of financial, inventory, customer, and all sorts of other, mostly numerical, information.

We also have lots of unstructured information such as text, audio, graphics, and video. We used to say that 80% of the new bytes being created today were associated with unstructured data, but that number is probably closer to 90% given all the video being created these days. This text and multimedia information is human-readable – in fact, it is designed by humans for humans to understand but is not easily understandable by today’s computers.

And that is a considerable problem. Today, the transformation of information into knowledge is primarily done in people’s heads. Not just by scientists, engineers, or financial analysts, but by everyone who reads an article or watches a video. The time available for people (some would say skilled people) to analyze information to gain insights (knowledge) is the limiting factor in the production of new knowledge today. To say this another way, we are now information-rich, but knowledge-poor.

The goal of the cognitive computing efforts is to remove this limitation by designing computer systems that can take this abundance of information, much of it in human-readable or human-viewable formats, and convert it into knowledge. For example, in the Jeopardy! IBM Challenge, the Watson computer system analyzed its deep information stores to find the response that best fit the clue and the category. It accomplished this feat by using many different algorithms to attempt to “understand” the text information, and a machine learning (artificial intelligence) scoring system to select the best response.

In a more significant effort, IBM is working with Memorial Sloan-Kettering and WellPoint (a major BC/BS licensee) to use cognitive computing technology to assist doctors by helping to identify individualized treatment options for patients with cancer. It is, in effect, creating knowledge of the appropriate treatment options from information about the patient’s condition and medical history, and information from clinical trials and best practices on cancer treatment.

While the field of cognitive computing is just beginning, I believe that over the next several years we will learn how to perform “Information Alchemy,” and we’ll see how this newly created knowledge can benefit our agencies and our lives.

About Frank Stein
Analytics in Federal Government and Public Sector: Big Data, Text Analytics, Watson, Threat Prediction and Prevention, Risk Management, Fraud and Abuse Management, Social Media Analysis, Healthcare Analytics.