Web Capture

In 2000, the Library of Congress established a pilot project (the Web Capture project) to collect and preserve primary source web materials. A multidisciplinary team of Library staff studied methods to evaluate, select, collect, catalog, provide access to, and preserve these materials for future generations of researchers. The Library has developed thematic Web archives on topics such as the United States National Elections of 2000, 2002, and 2004, the Iraq War, and the events of September 11, 2001.

Prior to collection of any website, the Library typically sends an email notification, which gives the website owner notice of the collection activity and of the Library's intention to include the website in its archive. For media sites, the Library usually seeks separate permission to crawl and collect the website and to provide remote access to researchers. For other sites, where the claim of fair use is stronger, the Library provides notification that it will crawl the site and collect it unless it receives notice of a desire to opt out. The Library will not provide remote access to users without express permission. The types of notices and permission requests that are sent are determined in consultation with legal counsel for the Library and depend on factors such as the type of website and content on the site, the urgency of capture, and the source of the site (U.S. or foreign).