
The Way We Webbed

Wait, are we in 2001 or 2008?

"The Web is a mess," says Hanna. "And Web archiving is a moving target."

What's been hard for the researchers at the Archive is that the Internet has turned out to be way more democratic than experts predicted it would be back in 1996, when Web pages belonged mostly to academics and tech geeks.

With traditional archiving of books or movies, "the philosophy has been to be selective," says Kris Carpenter, a colleague of Hanna's at the Archive. "To think about what has value for lasting scholarship. With the Web, it's been very difficult to make a determination of what should be included or not included."

As of now, any Web site can apply to be included in the Wayback Machine. Visitors can view everything from PerezHilton.com -- sometimes archived five times in a single day -- to various versions of AOL.com, spanning more than a decade.

"Try the release of AOL Instant Messenger!" encouraged the site in April 1997.

The Archive recently launched a project to learn what kids today would deem our most powerful online cultural signifiers. Students elected to archive SaveDarfur.org, MyLifeIsSoAwkward.com, and SouthPark.com, among other sites. Says Hanna, "One high school decided to archive WerewolfMovies.com."

Good thing, too. A recent visit to the site showed that the domain name and content had been replaced with the far blander and less evocative Horror.com.

Coming at the archiving issue from a more historical perspective, the Library of Congress has dedicated an entire division to preserving "at-risk Web sites," those here-today, gone-tomorrow pages that need the tender care of dedicated librarians.

"Like when people run for office," says Martha Anderson, director of the National Digital Information Infrastructure and Preservation Program. After the election, the candidate might sell off the address, or abandon it. "Or some people like to pretend that they never ran for office at all," says Anderson. Companies change ownership, organizations shut down, and records of what once was disappear completely.

The library's National Digital Information Infrastructure and Preservation Program uses teams of "selection officials" and "crawl engineers" to chase after the disappearing ink of the Web, one page at a time, arranging snapshots into chronological order. In August, they launched a project to archive end-of-term governmental Web sites, capturing Bush-era ephemera before a new administration sweeps in and all the current .gov sites begin to update or go dark.


© 2008 The Washington Post Company