The Washington Post

The very best ideas for preventing artificial intelligence from wrecking the planet

Once humanoid robots become more intelligent, what will they think of their human peers? (Issei Kato/Reuters)

The Boston-based Future of Life Institute, backed by a $10 million donation from Elon Musk, recently announced its list of 37 winners of research grants in the field of artificial intelligence. Spurred by concerns from luminaries such as Musk, Stephen Hawking and Bill Gates that we’re ill-prepared for the coming age of machine super-intelligence, the grants — ranging in size from $20,000 to $1.5 million — are part of a bigger plan to prevent AI from wrecking the planet.

At the very least, one hopes, the ideas and concepts being explored in these winning AI grants might help prevent some of the “unintended” and “disastrous” consequences hinted at earlier by the Future of Life Institute, such as robot homicides in factories or road collisions involving self-driving cars.

1. Keeping super-smart weapons systems under human control

When most people think about killer AI taking over the planet, they usually think of a “Terminator”-like scenario populated by rogue cyborgs, Skynet and an epic battle between man and machine. While even the Future of Life Institute admits that a “Terminator” future for AI confuses fact and fiction, there is a real need to make sure that super-smart autonomous weapons systems don’t start overriding their human masters in the future.

Which might be why one of the grants highlighted by the Future of Life Institute was a $136,918 grant to University of Denver visiting professor Heather Roff Perkins, who is studying the links between “Lethal Autonomous Weapons, AI and Meaningful Human Control.” According to the project’s summary, once autonomous weapons systems (think military drones and battlefield bots) start to become superintelligent, there’s always a risk that they will start to slip the bonds of human control, and in so doing, “change the future of conflict.”

2. Making AI systems explain their decisions to humans in excruciating detail

At some point, computers are going to far surpass the intellectual capacity of their human operators. When that day comes, we’re going to need to know how they think and all the little assumptions, inferences and predictions that go into their final decisions. That’s especially true for complex autonomous AI systems that integrate sensors, computers and actuators – all of these systems will be able to process and make decisions about much more data than humans are capable of analyzing by themselves.

As a result, Professor Manuela Veloso of Carnegie Mellon University received a $200,000 grant to find ways to make complex AI systems explain their decisions to humans. As she suggests, the only way to make them truly accepted and trusted is if we make these AI systems completely transparent in their decision-making process. This may not be a big deal if it’s a matter of asking your Internet of Things device why it turned off the lights at home, but it is a much bigger deal if you’re relying on AI medical assistants to prescribe medications or treatments.

3. Aligning the interests of machines and humans

Once computers become superintelligent, they are going to have very specific interests in mind. They may not be afflicted by classic human failings – envy, lust, greed – but they may be driven by purely algorithmic factors, including a need for more resources. Just watch the Hollywood dystopian film “Transcendence” to get an idea of what happens when an AI machine demands more and more resources to fulfill its goals – it doesn’t end well for humanity.

In order to align the interests of superintelligent systems with those of humans, Benja Fallenstein of the Machine Intelligence Research Institute is using a $250,000 grant to study how to override or reprogram machines to bring them into alignment with humanity’s interests. That could be tough if machines become imbued with a sense of their own infallibility — they might resist efforts by human programmers to fix errors or tweak initial mission goals. Fallenstein talks of the need for “corrigible agents” doing the bidding of humans, rather than, one presumes, “incorrigible agents” naughtily undermining the work of humans without us knowing.

4. Teaching machines about human behaviors and values

It may seem like common sense that machines, programmed by humans, will think like us and act like us. Presumably, that would be enough to keep them from launching a robot rebellion. However, the big problem is that humans aren’t exactly the most rational creatures – we suffer from all kinds of biases and preconceptions. We procrastinate and we’re prone to impulsive, if not downright addictive, behaviors. Pity the poor machine that tries to learn from us in order to develop rules about the known universe.

As a result, Owain Evans of the University of Oxford is attempting to develop techniques to help AI systems learn about human preferences from observing our behaviors. That means understanding all of our human foibles and what makes us tick, to make sure machines are not learning the wrong things from us. Machines live in a rational world of optimal decision-making; humans don’t. As Evans suggests, the ability to infer human values comes down to learning the difference between “ought” and “is” when it comes to decision-making. For machines, it’s a case of “Do as I say, not as I do.”

5. Ensuring an economic future where humans still have jobs

Like it or not, we’re transitioning to an “AI economy” where much of the work we do is fully automated – not just the blue collar factory jobs, but also the white collar work we typically associate with office professionals. To make that transition as smooth as possible, Stanford’s Michael Webb is studying how to keep the economic impacts of AI beneficial.

That’s actually harder than it sounds, because the fully automated economy may change the economic incentives for humans. At some point, says Webb, humans may actually have an incentive to stop technological progress. The easiest way to think about this is to imagine an AI robot showing up at your job tomorrow, claiming that it could do the work you do, only better, faster and cheaper. Would you go down without a fight?


All of these winning AI grant proposals hint at the complex ethical and philosophical questions at the heart of AI. They also suggest that the real question is not about machines, but rather, about humans. Will we be smart enough to design an AI future that is “robust and beneficial”?

The difficulty of answering that question might explain why the biggest grant of all ($1.5 million) was awarded to the University of Oxford’s Nick Bostrom, for his plan to create a joint Oxford-Cambridge research institute for artificial intelligence. That’s the same Nick Bostrom who helped to kick off the scare around AI with his 2014 book “Superintelligence,” which proposed a hypothetical scenario in which machines decide that humans are just expendable widgets as they carry out a misguided plan to convert the Earth into a giant paper clip factory. If that’s really a possibility, let’s hope that this AI research center gets built as soon as possible.