The State Department recently completed a tedious process to convert the 55,000 pages of e-mail and attachments back into a suitable electronic format. “The scanning process itself involves five steps that are time-consuming and labor-intensive,” the filing said.
More details of the process include:
- Scanning each page one-by-one, placing barcode separator sheets between documents
- Extracting their text using OCR (optical character recognition) software
- Manually entering metadata about the e-mails, including “to,” “from,” “cc,” “bcc,” “date sent,” and “subject” fields
- Identifying possible duplicate documents using computer automation
- Manually performing quality-control checks on each of these steps
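The filing does not say how the automated duplicate check works. As a rough illustration only, one common approach is to normalize each document's OCR text and compare hashes. The function names and sample documents below are hypothetical, not taken from the State Department's process.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash OCR text after lowercasing and collapsing runs of whitespace."""
    normalized = re.sub(r"\s+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_possible_duplicates(documents: dict[str, str]) -> list[list[str]]:
    """Group document IDs whose normalized text is identical."""
    groups = defaultdict(list)
    for doc_id, text in documents.items():
        groups[fingerprint(text)].append(doc_id)
    # Only groups with more than one member are possible duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical sample data, invented for illustration.
sample = {
    "doc-001": "Meeting moved to  3 pm.",
    "doc-002": "meeting moved to 3 pm.",
    "doc-003": "Please print the attached.",
}
print(find_possible_duplicates(sample))
```

Exact hashing like this only catches text that matches after normalization. OCR errors would make true duplicates look different, which is presumably one reason the results are flagged as "possible" duplicates and then checked by hand.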
The State Department has 12 full-time employees dedicated to handling the request, “plus other analysts and information technology specialists who provide collateral assistance,” the filing said. Five weeks of digitizing the documents adds up to more than 2,400 man-hours of work -- at taxpayer expense, of course.
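The 2,400 figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes a 40-hour work week, which the filing does not state. The staff count and duration come from the filing.

```python
EMPLOYEES = 12        # full-time employees, per the filing
WEEKS = 5             # duration of the digitizing effort, per the filing
HOURS_PER_WEEK = 40   # assumption: a standard full-time work week

man_hours = EMPLOYEES * WEEKS * HOURS_PER_WEEK
print(man_hours)
```

Under that assumption the dedicated staff alone account for 2,400 hours. The "more than" reflects the additional analysts and IT specialists, whose hours the filing does not quantify.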
According to the court filing, only some of the 55,000 pages were double-sided, so they span at least 27,500 pieces of paper, delivered in 12 bankers boxes. Factor in the barcode separator sheets the State Department used to scan the documents in (again, there were 30,000 documents to keep track of), and you’re looking at a whole lot of paper.
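To put rough bounds on that pile of paper, here is a quick calculation. It assumes one separator sheet per document, which the filing does not spell out. The page and document counts are the ones cited above.

```python
import math

PAGES = 55_000       # printed pages, per the filing
DOCUMENTS = 30_000   # documents to keep track of
SEPARATORS_PER_DOCUMENT = 1  # assumption: one barcode sheet per document

# Fewest sheets: every page printed double-sided.
min_sheets = math.ceil(PAGES / 2)
# Most sheets: every page printed single-sided.
max_sheets = PAGES

separator_sheets = DOCUMENTS * SEPARATORS_PER_DOCUMENT

print(min_sheets + separator_sheets)
print(max_sheets + separator_sheets)
```

Because only some pages were double-sided, the real total falls somewhere between those two figures.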
All of which could have been avoided had Clinton transferred the e-mails in a digital format. After all, e-mail is, by definition, electronic. The State Department, for what it's worth, has said that it's not unusual to turn over documents in paper form. And Clinton repeated Tuesday that she wants the e-mails to come out as soon as possible.
By comparison, when Jeb Bush released a cache of e-mail from his Florida governorship, he did so electronically. Of course, Bush is not a perfect example; he originally released his e-mail without redacting personal information.
But the point stands. No good comes of re-digitizing something that is natively digital. It's something political reporters deal with every three months, when the Senate processes its fundraising reports in paper form for really no good reason. (The House does so digitally.)
Here’s hoping that when the State Department publicly releases the e-mails, it does so in a sensible electronic format, lest the documents go through yet another painstaking round of digitization.
An earlier version of this post overstated the number of man-hours spent digitizing the documents.