Webis-Ambient-15

Dataset published: 2015
Dataset provided by
Bauhaus University, Weimar (http://www.uni-weimar.de/)
The Web Technology & Information Systems Network
Authors
Hagen, Matthias; Gollub, Tim; Busse, Matthias
Available download formats from providers
txt, html
Description

This corpus extends the Ambient data set created by Carpineto and Romano. For each subtopic, the websites at the given URLs were downloaded where accessible. These documents keep the names of the original documents, for example 1/1.4/1.3.html. Each subtopic was then manually enriched to ten documents with websites retrieved by Google (for example, 1/1.1/g00.html: 'g' for Google, 00 for the first Google result). Some subtopics could not be sufficiently enriched and were discarded. Subtopics that were duplicates or not interpretable were also discarded.
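
The naming scheme described above can be illustrated with a small Python sketch that separates original Ambient documents from Google-enriched ones. It is only an illustration built from the two example paths in the description: the three-level topic/subtopic/file layout, the `DocRef` type, and the `parse_doc_path` helper are assumptions, not part of the corpus or its tooling.

```python
import re
from typing import NamedTuple, Optional


class DocRef(NamedTuple):
    """Hypothetical record for one corpus document (not part of the corpus)."""
    topic: str
    subtopic: str
    filename: str
    source: str                 # "google" or "original"
    google_rank: Optional[int]  # zero-based rank; None for original documents


# Assumption: Google-enriched files are named 'g' plus a number, e.g. g00.html.
_GOOGLE_NAME = re.compile(r"g(\d+)\.html")


def parse_doc_path(path: str) -> DocRef:
    """Split a relative path such as '1/1.1/g00.html' into its parts."""
    parts = path.strip("/").split("/")
    if len(parts) != 3:
        raise ValueError(f"expected topic/subtopic/file, got: {path!r}")
    topic, subtopic, filename = parts
    match = _GOOGLE_NAME.fullmatch(filename)
    if match:
        # '00' denotes the first Google result, so the rank is zero-based.
        return DocRef(topic, subtopic, filename, "google", int(match.group(1)))
    # Anything else is treated as a document kept from the original Ambient set.
    return DocRef(topic, subtopic, filename, "original", None)
```

For instance, `parse_doc_path("1/1.1/g00.html")` is classified as a Google result with rank 0, while `parse_doc_path("1/1.4/1.3.html")` is treated as an original document.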
