Web Archiving at the School of Information (Part II)

CIRI Blog

Published: March 14, 2016 by Alyce Scott

“The Web was not designed to be preserved. The average life of a Web page is about 100 days”
–Brewster Kahle

The Internet Archive began saving web pages in 1996, in an effort to “preserve cultural artifacts created on the web and make sure they would remain available for the researchers, historians, and scholars of the future.” In 2001 it launched the Wayback Machine, followed in 2006 by Archive-it, which describes itself as “The leading web archiving service for collecting and accessing cultural heritage on the web.”

If you are not familiar with web archiving, in a nutshell, it is the process of harvesting data from the World Wide Web, storing it in a repository, preserving it, and making it available for future research. It sounds like a fairly straightforward process, doesn’t it? Make no mistake: it is complicated and full of challenges, as my students can attest.

My INFO 284 course, Tools, Services, and Methodologies for Digital Curation, includes a module on web archiving. In an earlier post, I wrote about the academic partnership established in August 2014 between Archive-it and the School of Information. This partnership has continued to provide my students with excellent training and experience in web archiving. The School of Information currently has 18 student-curated collections of websites available in Archive-it.

Recently, Karl-Rainer Blumenthal of the Archive-it team interviewed me and one of my students about web archiving and what students have learned through the process. The resulting post is titled “Teaching Digital Curation with Archive-it.”