Important terms and expressions
The Internet
is an electronic communications network that connects computers around the world.
Source: Merriam-Webster
The web
is the part of the Internet that can be accessed by a browser. For instance, email and apps are also part of the Internet, but not the web.
Source: Merriam-Webster
Website
is a form of online publication: a set of pages that are linked together and browsable on the Internet.
Source: Merriam-Webster
Web archiving
is the practice of downloading and archiving parts of the web in order to preserve its contents and ensure long-term access to information.
Domain
is a subdivision of the Internet denoted in an address with a unique abbreviation (such as .lu, or .com).
Source: Merriam-Webster
Seed
is a URL used as a starting point for web crawls. One seed can lead to a number of different pages, so the more seeds are “sown”, the more extensive the results of a web harvest will be.
Seed list
comprises all the seeds that were used to build a collection. This list gives you an idea of which websites can be found in the collection; however, it does not necessarily mean that every page of every website was captured.
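
To illustrate how seeds relate to the size of a harvest, here is a minimal sketch in Python. It assumes a small, made-up link graph held in memory (the example.lu and other.lu addresses and the harvest function are hypothetical and do not describe the archive's actual crawler). Starting from the seeds, it follows links until no new pages turn up.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "https://example.lu/": ["https://example.lu/about", "https://example.lu/news"],
    "https://example.lu/news": ["https://example.lu/news/2020"],
    "https://example.lu/about": [],
    "https://example.lu/news/2020": [],
    "https://other.lu/": ["https://other.lu/contact"],
    "https://other.lu/contact": [],
}


def harvest(seeds, links):
    """Return every page reachable from the seeds by following links."""
    seen = set()
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        if url in seen:
            continue  # already captured, do not visit twice
        seen.add(url)
        queue.extend(links.get(url, []))
    return seen


one_seed = harvest(["https://example.lu/"], LINKS)
two_seeds = harvest(["https://example.lu/", "https://other.lu/"], LINKS)
print(len(one_seed), len(two_seeds))  # prints: 4 6
```

With one seed, only the pages linked from example.lu are reached. Adding a second seed brings in the pages of other.lu as well, which no link from the first site points to.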
Harvest
describes the process of crawling and downloading parts of the Internet. The term is often used as a synonym for web crawl in the context of web archiving.
Web crawler
also called a spider, scans every element of a website, following every link and tracing every component on every page. Search engines also use crawlers for web indexing, where frequent crawls allow for faster and more efficient search results.
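
The link-following step can be sketched with Python's standard library. This is a simplified illustration, not the crawler software used by the archive: the LinkExtractor class, the sample page and the example.lu addresses are made up for the example. It collects the targets of href and src attributes and resolves them against the address of the page they were found on.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the URLs referenced by href and src attributes on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative addresses against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


# Hypothetical page content; a real crawler would download this first.
PAGE = (
    '<a href="/about">About</a>'
    '<img src="logo.png">'
    '<a href="https://other.lu/">Other</a>'
)

extractor = LinkExtractor("https://example.lu/news/")
extractor.feed(PAGE)
print(extractor.links)
```

Each address found this way can be added to the list of pages still to be visited, which is how a single seed leads to many captured pages.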
Collection policy
is the description of standards and procedures followed while building a collection. A detailed policy helps in understanding the contents and limitations of a collection and informs the user about the web archive’s operating principles.
Broad crawls and targeted crawls
Broad crawls capture a snapshot of a large number of seeds, in our case all .lu domains, which we capture twice a year.
Targeted crawls aim at a specific topic or event, potentially with a higher frequency of captures of a smaller number of seeds.
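
The difference between the two crawl types can be sketched as follows. This is only an illustration under stated assumptions: the in_broad_crawl_scope function, the candidate addresses and the figure of 12 captures per year for the targeted crawl are hypothetical. Only the twice-yearly broad crawl of .lu domains comes from the description above.

```python
from urllib.parse import urlparse


def in_broad_crawl_scope(url, tld="lu"):
    """Return True if the URL's host sits under the given top-level domain."""
    host = urlparse(url).hostname or ""
    return host.endswith("." + tld)


# Hypothetical candidate addresses.
candidates = [
    "https://www.example.lu/",
    "https://example.com/lu",  # ".com" host, so out of scope despite the path
    "https://news.example.lu/article",
]

# Broad crawl: every seed under .lu, captured twice a year.
broad = {
    "seeds": [url for url in candidates if in_broad_crawl_scope(url)],
    "captures_per_year": 2,
}

# Targeted crawl: a hand-picked seed list, captured more often.
# The frequency of 12 is an assumed value for illustration only.
targeted = {
    "seeds": ["https://news.example.lu/article"],
    "captures_per_year": 12,
}

print(broad["seeds"])
```

The scope check looks at the host name rather than the whole address, so a path that merely contains "lu" does not count as a .lu domain.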
Missing something?
What terms and expressions do you think are missing from this page?
Help us in expanding the Luxembourg Web Archive dictionary by sending in your questions and suggestions.
Also remember that we are looking for contributions of new and noteworthy websites to be included in the archive. Simply contact us, or use the submission form under “participate and contribute” below: