As computers get better at navigating the world around them, they are also helping humans navigate it. Thanks to advances in AI and robotics, scientists from IBM Research and Carnegie Mellon University (CMU) are working on new types of real-world accessibility solutions for the visually impaired.
The goal is as audacious as it is inspiring: coming up with a technological platform that can help the visually impaired navigate the world around them as effortlessly as everyone else. The first pilot in the program is a smartphone app for iOS and Android called NavCog, which helps blind people navigate their surroundings by whispering into their ears through earbuds or by creating subtle vibrations on their smartphones. (Users have the option of either setting the app to “voice mode” or “vibration mode.”)
Similar to the turn-by-turn directions offered by car GPS systems, the app offers its own version of turn-by-turn directions for the visually impaired. The app analyzes signals from Bluetooth beacons located along walkways, together with data from smartphone sensors, to enable users to move without human assistance, whether inside campus buildings or outdoors.
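The article doesn't disclose NavCog's actual algorithms, but the idea of turning beacon signals into a position estimate can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the beacon IDs and coordinates are hypothetical, and the log-distance path-loss model with these parameters is just one common way to convert signal strength (RSSI) to distance.

```python
import math

# Hypothetical beacon map: beacon ID -> known (x, y) position in metres.
BEACONS = {
    "beacon-01": (0.0, 0.0),
    "beacon-02": (10.0, 0.0),
    "beacon-03": (5.0, 8.0),
}

def rssi_to_distance(rssi, tx_power=-59, path_loss_exp=2.0):
    """Estimate distance (m) from RSSI using the log-distance path-loss model.

    tx_power is the expected RSSI at 1 m; both parameters are assumptions
    that would normally be calibrated per beacon and environment.
    """
    return 10 ** ((tx_power - rssi) / (10 * path_loss_exp))

def estimate_position(readings):
    """Weighted centroid of beacon positions, weighted by 1/distance.

    readings: dict mapping beacon ID -> observed RSSI in dBm.
    Closer (stronger) beacons pull the estimate toward themselves.
    """
    wx = wy = wsum = 0.0
    for beacon_id, rssi in readings.items():
        x, y = BEACONS[beacon_id]
        d = max(rssi_to_distance(rssi), 0.1)  # clamp to avoid division by zero
        w = 1.0 / d
        wx += w * x
        wy += w * y
        wsum += w
    return (wx / wsum, wy / wsum)
```

A real system would also fuse in the smartphone's inertial sensors and smooth the estimate over time (for example with a particle or Kalman filter); this sketch shows only the beacon half of that picture.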
The magic happens when algorithms help blind users identify, in near real time, where they are, which direction they are facing and other details about the surrounding environment. The computer-vision navigation tool turns smartphone images of the surrounding environment into a 3-D space model that is used to issue turn-by-turn navigation guidance.
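Once position and facing direction are known, issuing a turn-by-turn cue amounts to comparing the user's heading with the bearing of the next waypoint. The following sketch shows that last step under stated assumptions — the flat x/y coordinate convention, the 20-degree tolerance and the cue strings are all illustrative, not NavCog's actual logic.

```python
import math

def bearing_to(cur, target):
    """Compass-style bearing in degrees (0 = +y axis, 90 = +x axis)
    from the current position to the target waypoint, on a flat plane."""
    dx = target[0] - cur[0]
    dy = target[1] - cur[1]
    return math.degrees(math.atan2(dx, dy)) % 360

def turn_instruction(heading, cur, target, tolerance=20.0):
    """Map the signed angle between the user's heading and the waypoint
    bearing to a spoken (or vibration-coded) cue."""
    diff = (bearing_to(cur, target) - heading + 180) % 360 - 180
    if abs(diff) <= tolerance:
        return "continue straight"
    return "turn right" if diff > 0 else "turn left"
```

In a voice-mode app, the returned string would be spoken through the earbuds; in vibration mode, each cue would map to a distinct vibration pattern instead.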
The NavCog project, a collaboration between IBM Research in Yorktown Heights, N.Y. and Carnegie Mellon, has particular meaning for one of its lead researchers, IBM Fellow and visiting CMU faculty member Chieko Asakawa, who is visually impaired herself. It will soon be possible for her to walk across the CMU campus with the help of the NavCog app – looking just like any other person traversing the campus, listening to a smartphone through white earbuds.
That’s just the beginning, as Kris Kitani of the Robotics Institute of CMU told me. One major goal, of course, is to extend the coverage beyond the buildings on the Carnegie Mellon campus that have been retrofitted with beacons. To encourage this, the scientists working on the project have made the entire NavCog platform open source, available to developers via the IBM Bluemix cloud. That makes it possible for others to build enhancements for the system and speed its rollout to other physical destinations.
The other primary goal, says Kitani, is to make the system work in any environment, even one without Bluetooth beacons. To make that possible, the university hopes to build on advances in computer vision, as well as on new work in cognitive assistance, a research field dedicated to helping the blind regain information by augmenting missing or weakened abilities.
By using cameras for computer-aided vision, for example, it might be possible to develop a more accurate system that doesn’t require the presence of Bluetooth beacons. Moreover, this computer-aided vision, when combined with other localization technologies, potentially could make it possible to recognize everyday landmarks — such as a set of stairs or a barrier on the road — that might not be picked up with today’s sensors.
“From localization information to understanding of objects, we have been creating technologies to make the real-world environment more accessible for everyone,” said Martial Hebert, director of the Robotics Institute at Carnegie Mellon. “With our long history of developing technologies for humans and robots that will complement humans’ missing abilities to sense the surrounding world, this open platform will help expand the horizon for global collaboration to open up the new real-world accessibility era for the blind in the near future.”
Thanks to the cross-fertilization of ideas across AI and robotics at Carnegie Mellon, there are plans afoot to add extras to the system that go beyond mere navigation. For example, a facial recognition component could tell you in real time if you are passing someone you already know.
Moreover, sensors capable of recognizing emotions on these faces — work that’s part of other Carnegie Mellon research into autism – could make it possible to recognize when those people passing you are smiling or frowning. Researchers also are exploring the use of computer vision to characterize the activities of people in the vicinity and ultrasonic technology to help identify locations more accurately.
As Asakawa shared with me, the cognitive assistance research that went into creating the NavCog app has some parallels with the cognitive computing work being performed by IBM Watson. In both cases, there is a growing attempt to improve the cognitive abilities of humans on a real-time basis.
Within IBM, for example, researchers sometimes use the concept of “Watson on my shoulder” to explain what’s next for IBM Watson — a continuous, localized presence that can provide cognitive assistance for just about anyone, including medical professionals and weather forecasters.
If all goes according to plan, it’s possible to envision a virtuous feedback loop between machine intelligence and human intelligence, in which cognitive technologies developed by humans to augment the capabilities of machines end up augmenting the capabilities of humans as well.