Rochelle Safo has spent so long at the digitization belt that she hears it in her dreams: the “click” of the camera, the “beep” of the computer, the “whir” of the belt as it conveys the next specimen beneath the camera's lens.
Click. Beep. Whir. Click. Beep. Whir.
The sounds fill the windowless room deep in the bowels of the National Museum of Natural History where she works. Eight hours a day, five days a week, every week for the past 16 months, Safo has helped operate a huge conveyor belt designed to digitize the museum's vast botany collection. Deftly, she and her two fellow digitizers place papers bearing pressed plants on the belt, pass them under a camera, snap a photo, check the image on the computer, then replace the sheets in their folder. Click. Beep. Whir.
“It's definitely not what I had imagined,” admitted the 24-year-old Safo. She'd gone to Georgetown to get a master's degree in museum studies with the hope of becoming a curator. But when she graduated and started looking for jobs at the Smithsonian, this was what was available.
The seemingly monotonous task turned out to be essential. “There's a lot more that goes on behind the scenes than you think,” Safo said. “The fact that we all manage to stay sane is really what makes the museum run.”
The NMNH botany collection, or herbarium, is among the largest in the world, filling more than two floors of the museum with massive palm fronds, tiny slides containing microscopic algae, coconuts heavier than a human toddler, pressed plants older than the United States, and objects associated with some of the famous names in the history of science and exploration — Charles Darwin, Captain James Cook, President Theodore Roosevelt.
All told, there are more than 5 million specimens spanning the past three centuries of plant life on Earth. Some 3.5 million of those objects are “press sheets” — dried and preserved plants mounted on stiff white paper and meticulously labeled with information about when and where they were found.
“It is a record of what existed,” said Department of Botany chief Laurence J. Dorr. And since 20,000 to 30,000 new specimens are added each year, “it will stand as a record of what exists today.”
But for years, the record was locked away in cabinets, difficult to search and impossible to analyze on large scales. And at the plodding rate of “traditional digitization” (in which press sheets are scanned and identifying information is entered into a database by hand), any effort to turn the physical archive into a digital one seemed Sisyphean. New objects were added to the collection so rapidly the museum couldn't keep up. It took 40 years to catalogue the first 1.5 million items in the herbarium.
Then the Digitization Program Office got involved. Next month, the botany staff will notch their next million digitized objects. It will have taken less than a year and a half.
“I go back to the automobile,” said Ken Rahaim, who oversees mass digitization for the entire Smithsonian. “We borrowed a lot from Henry Ford, kind of creating a more industrialized process to do the mass digitization that we do here at the scale that we do it at. We're taking specialists and putting them at the tasks they're all most efficient at.”
Assembly-line techniques aren't all that were required for the mass digitization program. The NMNH project uses a conveyor belt and high-speed camera developed in the Netherlands; it is the biggest project to use the technology in the United States. (The Dutch Naturalis Biodiversity Center employed the technology to digitize 7 million specimens in less than five years.) The project also relies on the efforts of about a dozen transcribers based in Suriname, who enter information from the press sheet labels into the database. Fans of the collection, or people who just enjoy mildly tedious typing, can also pitch in via the museum's online transcription center.
Some five dozen people are involved in the process, said Sylvia Stone Orli, digitization manager for the botany department. When the initiative won a museum award last year, so many team members got up to receive it, they barely fit on the stage.
This is the reality of working at a natural history museum, Orli said. Even collectors and curators spend most of their time at their desks, archiving and analyzing data. One of Orli's colleagues in invertebrate zoology used to have a series of photos pinned to his door showing scientists in various swashbuckling situations, with captions such as “What my friends think I do” and “What my mom thinks I do.” The final photo, captioned “What I actually do,” shows a woman sitting in front of a computer.
“It's really funny, because museum work — what they're doing on the conveyor belt and what a lot of us do — is just like daily, monotonous things to get the work done,” Orli said.
Rahaim interjected, “But that's the foundation for the data that generates the information that actually creates the knowledge that makes the breakthrough. … Any 'eureka' moment comes through a lot of foundational hard work at that level.”
Dorr agreed. Digitization, he said, will give researchers access to collections on opposite sides of the world. It will allow scientists to use computer algorithms to simultaneously study more specimens than could ever be examined by hand. He imagines a world in which all museum collections are digitized and posted online, and the entire body of human knowledge about nature is available at the click of a button.
“As we do more and other collections do more, I think we're on the cusp of revolutionizing the questions we can ask about the natural world,” he said.
Already, digitization has answered a more mundane question. Several years ago, the botany department lost track of a press sheet containing a purple fireweed collected in Yellowstone in 1883. The specimen was unremarkable except for the fact that it was collected by President Chester A. Arthur — it was the only object he contributed to the museum. So every few years, someone would wonder where it went.
In December 2015, a few months into her gig at the conveyor belt, Safo spotted the delicate preserved flower and beneath it, Arthur's name.
“We see a lot of really cool ones,” she said, then listed off famous names she'd seen inked onto faded labels: Roland Bonaparte, grandnephew of Napoleon; President Franklin Delano Roosevelt.
This too is part of the appeal of working on the belt. With 920,928 items digitized (as of Tuesday night), Safo and her colleagues have handled more specimens from the herbarium than anyone else in the botany department. It's given them a personal, tactile relationship with the collection, one that feels worth the long hours of tedious work.
And it's not always dull. Safo and her fellow digitizers pass the time with music and conversation. They listen to podcasts. They race to see if they can beat past records for most items photographed. On a good day, they are as much like machines as the conveyor belt itself, their hands moving fluidly, each person's movements in perfect synchrony with those of her colleagues.
Click. Beep. Whir.
Tales From the Vault: Science museums are home to vast research collections, most of which the public never gets to see, until now. Once a month, Speaking of Science will go behind the scenes at our favorite museums to introduce readers to the fascinating objects and people we find there.