That means many emails or essays could soon, potentially, be push-button affairs.
Basically, GPT-3 conjures a universe in which society’s overall quality of writing shoots up — a universe where those emails from that fragment-happy co-worker look a lot better — but the question of who actually is doing the writing becomes a lot more muddled. It could, in short, usher in a world where no one knows how good a communicator anyone really is.
And, well, I don’t want to give my bosses any ideas, but journalism would be a pretty good target for this.
“A lot of what we’re trying to do is solve the problem of writer’s block — to be a brainstorm buddy,” he said. “Musicians have tools so they can make music either just with computers, or just with guitars or with the two together. Photography has all these digital tools and filters too. Writers for the longest time have had spell-check and grammar check and not much else. We want to create tools to help them.”
That sounds eminently reasonable — and frightening. If news stories can be written by machine, wouldn’t it be one more task where computers can tell humans to buzz off?
I decided to test GPT-3. To do so, I turned to one of the most famous pieces of journalism of the 20th century: Gay Talese’s “Frank Sinatra Has a Cold,” which ran in Esquire in 1966.
As an exemplar of “New Journalism,” the piece combines colorful observation with immersive reporting. It’s also fiercely literary. Summarizing a news event is one thing. Being able to formulate the next paragraphs of “Frank Sinatra Has a Cold,” or even come up with sentences that sound roughly like it, is another matter entirely.
I fed the first few sentences of the piece into Sudowrite’s engine.
I then asked Sudowrite to go down a “wormhole” — that is, come up with the next sentences based on what’s already written. It’s a game somewhat rigged against the AI since, while it has access to the entire body of human language and writing, it doesn’t have any of the specific knowledge Talese had in his head about Sinatra. It gave me:
Which, if not Talese-level, was still pretty good. Here’s what Talese had:
You can see there’s a difference, but not a huge one.
Okay, actually it was the other way around — the first selection was Talese, the second was Sudowrite.
Are his words inherently superior to the machine’s? It’s arguable.
Back down the wormhole with a few more of Talese’s sentences. Here’s a blend of Talese and AI prose.
So, you can easily tell the difference between Talese and the AI, right? Not really. Or, at least I couldn’t after walking away for a while.
(The second, third and fifth sentences were the AI’s; the first and fourth were Talese’s.)
I decided to call Talese to ask what he thought. As the phone rang I was anxious. I didn’t want to be the guy to tell Gay Talese a computer just wrote sentences that could possibly stand in for his. Why wasn’t there an AI for this task?
He answered, and I explained the experiment.
“Read them to me,” he said. I read him the AI-generated sentences.
“They’re both good,” he said after a nerve-racking pause. “It’s a correct interpretation.” He said he had no problem using tech for writing, just not for reporting. “You can’t do that with Zoom or whatever. The technology is ‘you have to be there.’ ”
In Talese-ian New Journalism, vivid descriptions — style — rule the day. And, it turns out, AI can do style. Yet what of criticism? That’s about analysis, ideas, judgment. It seemed unlikely an AI could manage those.
For this I turned to Katie Walsh, a film critic for Tribune News Service and other outlets. She recently published a review of the new Oscar Isaac movie “The Card Counter.” On Gupta’s recommendation, I first fed Sudowrite another of Walsh’s reviews (to help it “learn” her style). Then I handed it the first two paragraphs of her “Card Counter” review.
I wasn’t trying to see if it could guess what actually came next; it’s not a mind reader. What I was looking for was whether what it came up with could have been something she wrote. Correcting for the facts, could the sentences be adapted back into her review?
I sent her prose down the wormhole:
She had written: “There’s a seething stillness to ‘The Card Counter,’ and to star Oscar Isaac’s character, William Tell. He’s perfectly coifed, not a hair out of place; his existence is spare, controlled and sparse, and this essence of the character is reflected in the film’s restrained visual style. William learned to count cards in prison, and now he leads an anonymous, ritualistic life, moving from one casino to the next, one motel to the next, plying his trade to ‘pass the time.’”
I asked Sudowrite what the review should say next.
“William’s life is ruled by math, by numbers; he speaks in them, walks in them,” it returned. “He has it down to the decimal point, this life of repetition, and he’s happy with it. William, of course, is not really happy. He’s just learned how to be numb, to turn his body into a machine, to be unaffected by his surroundings.”
What Walsh actually wrote: “William isn’t into gambling for the big money or the celebrity. Keeping track of every card in the deck is a meditative practice that wipes away any memory of anything that came before this moment on the felt.”
I ran one more test, for Walsh’s conclusion about the film. She had written that the film “implores you to listen in, but it also wants you to take a good look at William” and also said it was as “pulsatingly angry about the state of the world as [the director’s previous] ‘First Reformed.’ It is a muffled primal scream about war, torture, trauma, grappling with who to blame and how to cope” while ultimately demonstrating an underlying romantic view.
I summarized crudely to see what the AI would do. It returned:
“It was less a movie than a 500,000 word novel about a man watching a movie.” Well, it looks like someone has taken a postmodernism class.
Then it offered, “The script is a mirror of the mirror of a mirror, a mirror held up to a mirror, a mirror a mirror a mirror a mirror a mirror a mirror a mirror a mirror a mirror a mirror a mirror.” I couldn’t tell if it was putting me on. Can an AI have a sense of humor?
But then it gave me:
Which is decidedly not what Walsh wrote, but is certainly something a competent critic might write.
I asked Walsh what she made of the fact that a computer program could, with her raw material, come up with something that sounded like a professional review.
She replied: “This is way better than I expected from it! It’s pretty good! I can see this now not as ‘taking my job’ (because the machine can’t watch the movie … yet), but as a tool for a writer/editor to evade writers block.” She continued, “I don’t think it’s outside the realm of possibility to take the AI paragraph and rework it, because it did successfully guess where I was going most times.”
Oddly, Gupta hasn’t optimized Sudowrite yet for nonfiction; it’s more for novelists. But he saw GPT-3 as very adaptive to journalism.
“Ultimately, it’s a tool that will move things up the chain,” he said. “As a writer, you may not need to crank out words anymore. You’re more of an editor, choosing the best versions.”
This seemed pretty scary to me, and I spent the rest of the day wondering if it was too late to enroll in trade school. But after I calmed down, I called James Bessen, the executive director of the Technology & Policy Research Initiative at Boston University School of Law and one of the country’s leading thinkers on automation.
I asked whether removing the painstaking bricklaying from writerly hands means the end of journalism jobs as we know them.
He offered two historical examples in which the number of jobs went up after automation: the power loom at the beginning of the 19th century and the proliferation of ATMs at the beginning of this one. In both cases, the number of people working at looms and banks increased because automation made it easier for anyone to open such a business, and those businesses still needed employees to work on other stuff. The quality of work also improved, because humans could now focus on higher-level problems, a tenet of automation theory.
The same could happen with journalism, he said, though he also acknowledged that the unknown was consumer demand; just because content is produced more cheaply doesn’t mean more people would want it, or want better.
Bessen also believes GPT-3, while good at style, is not so good at ideas. “It’s very impressive but it isn’t substance,” he said. His insight is evident in the descriptions of Sinatra’s groupies, which have a kind of stab-in-the-dark emptiness to them. The AI can clearly search at a zillion teraflops and come back with meaningful words. But that’s not the same as finding meaning.
The people behind GPT-3 say they see it more as an enabling tool. For example, Mira Murati, OpenAI’s senior vice president of research, product and partnerships, said she uses it to kick-start an essay that’s proving slow-going. “It’s like having a writing system that can keep up with your thinking,” she said.
She noted the AI contained tools to block outright plagiarism. But for any professional using it, the possibility of inadvertently dropping in phrases that had already been used elsewhere was, as an AI might say, nonzero.
I have to be fair to the machine though. The fact that it has no soul could be a problem. But that actually may also help us get over the terror of writing in the post-Sudowrite era. The technology won’t replace writers but augment them; rather than some writers being laid off, maybe more of them will be hired to do new things, to engage in new forms of specialization.
Wait, did I write that paragraph or did the AI?
And does it matter?