In this paper we present a freely available dataset for testing web search engines. The dataset is a snapshot of the Nordic part of the Internet taken in early 2007 and is highly abstracted: each web page is represented by a number. The released dataset consists of three parts: a graph, 76 sets of pages, one for each tested word combination and containing the pages that match it, and a collection of files for calculating the relevance of the result sets returned by algorithms or search engines. We also present statistics for some search engine algorithms.