On Monday, a technology startup in London did something that most software companies never do: It published the code and trained model weights behind its creation so that anyone could replicate it. Any developer in the world can now rebuild the image-generating model made by Stability AI, which can spit out almost any picture you can imagine from a single text prompt.
The tool is almost magical — creepy, even — in what it can do. Want an image of a short-haired English blue cat playing guitar? The system will produce one in seconds.
But here’s what makes this tool potentially groundbreaking compared with DALL-E 2, a similar program that San Francisco-based OpenAI launched earlier this year and that hundreds of thousands of people have used to make wacky art: Stability AI’s model is free to replicate and carries very few restrictions. DALL-E 2’s code hasn’t been released, and it won’t generate images of specific individuals or politically sensitive subjects such as the war in Ukraine, to prevent the software from being misused. The London tool, by contrast, is a veritable free-for-all.
In fact, Stability AI’s tool offers huge potential for creating fake images of real people. I used it to conjure several fake images: British Prime Minister Boris Johnson dancing awkwardly with a young woman, Tom Cruise walking through the rubble of war-torn Ukraine, a realistic-looking portrait of the actress Gal Gadot and an alarming picture of London’s Palace of Westminster on fire. Most of the images of Johnson and Cruise looked fake, but a few seemed realistic enough to pass muster with the more gullible among us.
Stability AI said in its release on Monday that its model includes a “safety classifier” that blocks scenes of a sexual nature, but the filter can be adjusted or removed entirely as the user sees fit.
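To make that concrete, here is a minimal sketch of what “removing” the classifier looks like in practice, assuming the open-source Hugging Face diffusers wrapper around the released weights (the column names no specific library, and the prompt is illustrative):

```python
# A minimal sketch, assuming the Hugging Face "diffusers" wrapper around
# the released Stable Diffusion weights; the column names no specific
# library, and the prompt below is illustrative.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

# The safety classifier runs on each finished image. Because the model
# ships openly, it is just an attribute the user can overwrite.
pipe.safety_checker = None  # disables the filter entirely

image = pipe("a photo of a cat playing guitar").images[0]
image.save("output.png")
```

The point is less the three lines themselves than the fact that nothing in an openly distributed model can enforce the filter.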
Stability AI’s founder and Chief Executive Officer Emad Mostaque says he’s more worried about who controls access to AI than about the harm his software could cause. “I believe control of these models should not be determined by a bunch of self-appointed people in Palo Alto,” he told me in an interview in London last week. “I believe they should be open.” His company will make money by charging for special access to the system, as well as from selling licenses to generate famous characters, he said.
Mostaque’s release is part of a broader push to make AI more freely available, on the reasoning that it shouldn’t be controlled by a handful of Big Tech firms. It’s a noble sentiment, but one that also comes with risks. For instance, while Adobe Photoshop may be better at faking an embarrassing photo of a politician, Stability AI’s tool requires much less skill to use and is free. Anyone with a keyboard can hit its refresh button over and over until the system, known as Stable Diffusion, spits out something that looks convincing. And Stable Diffusion’s images will look more accurate over time as the model is rebuilt and retrained on new sets of data. (1)
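What “hitting refresh” amounts to, mechanically, is re-sampling the same prompt with a new random seed until one candidate passes the eyeball test. A hedged sketch, again assuming the diffusers wrapper (the prompt and file names are illustrative):

```python
# Sketch of the "refresh button": each random seed produces a different
# candidate image for the same prompt. Assumes the Hugging Face
# "diffusers" wrapper; the prompt and file names are illustrative.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4"
).to("cuda")

prompt = "a press photo of a politician dancing at a party"
for seed in range(20):  # keep re-rolling until one looks convincing
    generator = torch.Generator("cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator).images[0]
    image.save(f"candidate_{seed:02d}.png")
```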
Mostaque’s answer is that we are, depressingly, in the midst of an inevitable rise in fake images anyway, and our sensibilities will simply have to adjust. “People will be aware of the fact that anyone can create that image on their phone, in one second… People will be like, ‘Oh it’s probably just created,’” he said. In other words, people will learn to trust the internet even less than they already do, and the phrase “pics or it didn’t happen” will evolve into “pics don’t prove anything anymore.” Even so, he anticipates that 99% of people who use his tool will have good intentions.
Now that Mostaque’s model has been released, social media firms like Snap Inc. and ByteDance Ltd.’s TikTok could replicate it for their own platforms. TikTok, for instance, recently added an AI tool for generating background pictures, but it’s highly stylized and doesn’t do specific images of people or objects. That could change if TikTok decides to use the new model. Mostaque, a former hedge fund manager who studied computer science at Oxford University, said that developers in Russia had already replicated it.
Mostaque’s open-source approach runs counter to how most Big Tech firms have handled AI discoveries, driven as much by intellectual property concerns as public safety. Alphabet Inc.’s Google has a model called Imagen whose creations look even more realistic than OpenAI’s DALL-E 2, but the company won’t release it because of the “potential risks of misuse.” It says it’s “exploring a framework” for a potential future release, which may include some oversight. OpenAI also won’t release details about its tools for anyone to copy. (2)
Monopolistic technology companies shouldn’t be the sole gatekeepers of powerful AI because they’re bound to steer it towards their own agenda, whether that’s in advertising or keeping people hooked on an endless scroll. But I’m also uneasy about the alternative idea of “democratizing AI.” Mostaque himself has used this phrase, an increasingly popular one in tech. (3)
Making a product affordable or even freely available doesn’t really fit the definition. At its heart, democracy relies on governance to work properly, and there’s little evidence of oversight for tools like Stable Diffusion. Mostaque says that he relied on a community of several thousand developers and supporters who deliberated on the chat forum Discord about when it would be safe to release his tool into the wild. So that’s something. But now that Stable Diffusion is out, its use will be largely unpoliced.
You could argue that putting powerful AI tools into the wild will contribute to human progress in some way, and that Stable Diffusion will transform creativity as Mostaque predicts. But we should expect unintended and unforeseen consequences that are just as pervasive as the benefits of making anyone an AI artist, whether that be a new generation of misinformation campaigns, or new types of online scams, or something else entirely.
Mostaque won’t be the last person to release a powerful AI tool to the world and, if Stability AI hadn’t done it, someone else would have. That race to be the first to bring powerful innovation to the masses is partly what’s driving this grey area of software development. When I pointed out the irony of his company name given the disruption it will likely cause, he countered that “the instability and chaos was coming anyway.” The world should brace for an increasingly bumpy ride.
(1) Releasing the system’s “weights” on Monday means that anyone can fine-tune the model to make it more accurate in certain areas. For instance, someone with a large cache of images of Donald Trump could retrain it to conjure much more accurate “photos” of the former U.S. president, or of anyone else.
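For the technically curious, here is a hedged sketch of what one such retraining step involves, using Hugging Face diffusers components (an assumption on my part; pixel_values and captions stand in for whatever cache of images and descriptions the user supplies):

```python
# A hedged sketch of why released weights matter: one standard fine-tuning
# step, using Hugging Face "diffusers" components (an assumption; the
# column names no library). `pixel_values` is a batch of the user's new
# images and `captions` their text descriptions, both hypothetical inputs.
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, DDPMScheduler, UNet2DConditionModel
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "CompVis/stable-diffusion-v1-4"
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

vae.requires_grad_(False)           # only the denoiser is retrained here
text_encoder.requires_grad_(False)
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)

def train_step(pixel_values, captions):
    # Compress the new images into the latent space the model generates in.
    latents = vae.encode(pixel_values).latent_dist.sample() * 0.18215
    noise = torch.randn_like(latents)
    timesteps = torch.randint(
        0, scheduler.config.num_train_timesteps, (latents.shape[0],)
    )
    noisy_latents = scheduler.add_noise(latents, noise, timesteps)
    # Condition on the captions describing the new images.
    tokens = tokenizer(captions, padding="max_length", max_length=77,
                       truncation=True, return_tensors="pt")
    text_embeddings = text_encoder(tokens.input_ids)[0]
    # Standard denoising objective: predict the noise that was added.
    noise_pred = unet(noisy_latents, timesteps, text_embeddings).sample
    loss = F.mse_loss(noise_pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Run over a few thousand steps on the new image cache, a loop like this is enough to pull the model’s output toward one person’s likeness, which is precisely what releasing the weights makes possible.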
(2) OpenAI started in 2015 as a non-profit organization whose goal was to democratize AI, but running AI systems requires powerful computers that cost hundreds of millions of dollars. To solve that, OpenAI took a $1 billion investment from Microsoft Corp. in 2019, in return for giving the tech giant first rights to commercialize any of OpenAI’s discoveries. OpenAI has since released fewer and fewer details about new models such as DALL-E 2, often to the consternation of some computer scientists.
(3) Among the many examples of the trope, Robinhood Markets Inc. wants to “democratize finance” (it makes an app for trading stocks and crypto assets) while the controversial startup Clearview AI wants to “democratize facial recognition.”
This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.
Parmy Olson is a Bloomberg Opinion columnist covering technology. A former reporter for the Wall Street Journal and Forbes, she is author of “We Are Anonymous.”