“Anything you want to keep will need to be downloaded by that date,” Yahoo announced.
The protests and pleas for more time were just starting when Jason Scott took to Twitter to register his utter lack of surprise over the fate of Yahoo’s sprawling chitchat of neighborhoods, businesses, addicts in recovery and birdwatchers.
The team of volunteers Scott founded — “rogue archivists, programmers, writers and loudmouths dedicated to saving our digital heritage” — has spent a decade hopping from one online obliteration to the next, capturing whatever they can in a public repository called the Wayback Machine. The Archive Team, as his group is known, keeps a “Deathwatch” of websites in various stages of shutdown (“Likely to Die,” “Dying,” “Dead as a Doornail”); Yahoo discards feature prominently.
“OF COURSE Yahoo! is going to delete/wipe ALL content in ALL Yahoo! Groups in 3 months,” Scott typed Oct. 16, after someone emailed him to share the strange note they had found on their site menu.
It was just the latest crisis for an ad hoc network of digital archivists who think we will one day wish we had been better record keepers of a world increasingly lived online. Those archivists say Yahoo has blocked their attempts at coordinated preservation of the Yahoo Groups forums, deepening their frustrations.
“If you burn down a house, you can still kind of make out the foundations,” Scott told The Washington Post. “But with a digital website, it’ll just be gone.”
The move to scrub the Yahoo Groups archives may be infuriating digital historians, but the company says it is just adapting to an ever-shifting Internet landscape.
“Today, most Yahoo Groups activity happens in your email inbox, not on the bulletin boards where Yahoo Groups started in the pre-smartphone age,” an email to users explained last month. “Increasingly, people want content and connections coming directly to them.”
Yahoo Groups says it is working to “streamline” its services to focus on what people actually use. It will continue to function as an email list even as people lose the ability to browse or post on the old Web-based forums.
It is, in effect, simply doing what many users did long ago — abandoning a website past its prime.
Yahoo’s owner, Verizon Media, is proceeding with plans to pull the plug on its sprawling network of forums Saturday, though a company official told The Post that the deadline for users to retrieve personal data has been extended to Jan. 31.
All the while, protests have been mounting. A blog called Yahoo Groups Crusade Headquarters has been lamenting the “bulldozer headed for us,” and open letters have decried Big Tech’s “callous disregard for the value of user’s memories, experiences, and history.”
A Yahoo tool helps people download their own groups’ content but leaves out attachments and photos uploaded by others. Some critics say it spits out a jumbled file that most people do not know how to open; Yahoo suggests searching for phrases such as “JSON parser” for help.
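For readers wondering what a “JSON parser” does in practice, the following is a minimal sketch in Python, which includes one in its standard library. The layout of Yahoo’s actual export is not described here, so the sample record, its field names and the helper names `summarize_export` and `pretty` are purely illustrative assumptions.

```python
import json


def summarize_export(raw_text):
    """Parse JSON text and describe its top-level shape."""
    data = json.loads(raw_text)
    if isinstance(data, dict):
        return {"type": "object", "keys": sorted(data.keys())}
    if isinstance(data, list):
        return {"type": "array", "length": len(data)}
    return {"type": type(data).__name__}


def pretty(raw_text):
    """Re-indent JSON text so a person can read it."""
    return json.dumps(json.loads(raw_text), indent=2, ensure_ascii=False)


# Hypothetical record: the real export format is not documented in this article.
sample = '{"subject": "Spring hive check", "author": "example_user"}'

print(summarize_export(sample))
print(pretty(sample))
```

To try this on a downloaded file, its contents could be read with `open(path, encoding="utf-8").read()` and passed to either function, assuming the file holds a single JSON document.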
Yahoo Groups will “continue to listen to feedback to ensure we keep our users happy,” Verizon Media said in a statement.
Margaret Farrelly is part of the majority that uses Yahoo Groups through email these days. She says she has scaled down from hundreds of group memberships to just a few.
But she was still alarmed when Scott’s tweet about Yahoo Groups made it into her feed. The Wisconsin resident knows the site was an early home for something she loves: fan fiction. She belongs to a collective that has identified about 35,000 fandom-related Yahoo groups to save.
Farrelly also knows someone trying to capture more than 100 beekeeping groups, and someone passionate about genealogy and adoption support circles. Another person is focusing on astronomy groups filled with photos they say may not exist anywhere else.
“It’s just depressing to think how much is going to be lost,” Farrelly said.
Her fandom group turned to the Archive Team for help.
Scott estimates the team’s numbers at a few hundred people, but there is no official roster. People come and go as they upload to the Wayback Machine, the massive repository run by the nonprofit Internet Archive, where Scott is a staffer.
There, the snapshots become part of a giant public record of the Web’s changes, growing at nearly a billion URLs a day and stored on hulking servers in what was once a California church nave. The archives are backed up for security around the world, including in the ancient Egyptian capital of Alexandria, where a famed library burned down in a testament to the vulnerability of collected knowledge.
Scott’s instincts for preservation go back to his early years online, when he would keep stacks of floppy disks with messages from the dial-up “bulletin board systems” of the 1980s. As BBSes became obsolete, Scott memorialized the technology and the communities it supported by collecting files on a website. He also made an eight-episode film (simply titled “BBS: The Documentary”) and got fan mail.
That niche mission has become overwhelming in its enormity.
How do you preserve the Internet?
Where do you start?
Mark Graham, director of the Wayback Machine, says the tool comprises nearly 400 billion Web pages and accounts for about half of the nonprofit Internet Archive’s 60 petabytes of stored content. That is a lot of data. A petabyte, equal to more than a million gigabytes, is sometimes equated to 10 million filing cabinets of text.
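The scale of those figures can be checked with a little arithmetic. The sketch below is a rough, order-of-magnitude calculation only: it takes “about half” as exactly one half and “nearly 400 billion” as 400 billion, and the resulting per-page average is an inference from those rounded numbers, not a figure reported by the Internet Archive. The function names are illustrative.

```python
PB_BYTES = 10**15   # decimal petabyte
GB_BYTES = 10**9    # decimal gigabyte
PIB_BYTES = 2**50   # binary petabyte (pebibyte)
GIB_BYTES = 2**30   # binary gigabyte (gibibyte)


def gigabytes_per_petabyte(binary=False):
    """Return how many gigabytes make up one petabyte under each convention."""
    if binary:
        return PIB_BYTES // GIB_BYTES
    return PB_BYTES // GB_BYTES


def average_bytes_per_page(total_pb, pages):
    """Assume half of the archive's total storage is Wayback Machine pages."""
    wayback_bytes = total_pb * PB_BYTES // 2
    return wayback_bytes // pages


print(gigabytes_per_petabyte())             # decimal convention
print(gigabytes_per_petabyte(binary=True))  # binary convention
print(average_bytes_per_page(60, 400 * 10**9))
```

Under the decimal convention a petabyte is exactly a million gigabytes; under the binary convention it is slightly more, which is consistent with the “more than a million” phrasing.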
But in the grand scheme of the Internet, it is also small. Yahoo Groups could clock in at several petabytes, Graham guesses — though he compares the act of estimation to walking into a library you can’t see the end of and trying to guess how many words it contains.
And so, Graham said, the Internet Archive tries to focus its efforts, prioritizing public interest data (government accounts, news sites, academic papers, Wikipedia references) to prepare for the day when, say, Turkey orders the shutdown of more than 100 media outlets.
Next to those public interest targets, Yahoo Groups for birdwatching and beekeeping might seem inconsequential. But, Scott says, it is all part of “saving human culture.”
Racing to beat the Saturday deadline, Archive Team members clicked through Captcha after Captcha of traffic light photos to join the groups they planned to copy over — some 65,000 as of last week, as they hashed through bugs and slow downloads in chats with names like “yahoosucks.”
Some volunteer archivists were pulling late nights after work; others pitched in while studying for exams.
Then came the message on the night of Dec. 3: “when did yahoo deactivate some of our accounts?” somebody chatted.
Realizing that the combined number of groups they had joined had plummeted to about 15,000, the archivists arrived at a disheartening conclusion: Their accounts, it seemed, had suddenly been blocked by Yahoo.
Verizon Media told The Post that the organization’s “recent actions” violated its Terms of Service but did not elaborate or respond to additional questions.
Recruiting volunteers to manually rejoin as many groups as possible, the Archive Team has clawed its way back to 42,000 groups. As the Saturday shutdown approaches and the chances of archiving everything grow dimmer by the day, Scott — who once told a crowd that his group runs on “rage, paranoia and kleptomania” — says he is focusing on the positive.
He says he is thinking about the way Yahoo Groups has energized people, how a fluctuating group of volunteers has swelled and how word of their mission has spread. Visits to the Yahoo Groups Crusade Headquarters blog spiked to an unprecedented 15,000 on Sunday, owner Brenda Fowler says.
Scott is also thinking beyond Yahoo Groups to the dozens of smaller sites his team preserves each day.
There are fan sites for the Swedish singer Marie Fredriksson, who died Monday. And there is the clunky-looking website from the 1990s that a stranger just emailed him. Most of the site is in Greek, which Scott does not understand. All he can read is the English banner atop its homepage, posted in anticipation of its demise.
“This service is approaching End Of Life,” the message says, adding: “Please keep local copies of any important files you have uploaded!”