Webis-WikiDiscussions-18

  • Dataset published 2018
Dataset provided by
Bauhaus University, Weimar (http://www.uni-weimar.de/)
The Web Technology & Information Systems Network
Authors
Herpel, Jakob; Stein, Benno; Al-Khatib, Khalid; Hagen, Matthias; Wachsmuth, Henning; Lang, Kevin
Available download formats from providers
tsv
Description

The Webis-WikiDiscussions-18 corpus is the output of parsing the entire set of Wikipedia talk pages. It contains about six million discussions, consisting of about 20 million turns. The turns comprise around 74,000 different tags with a total of about 100,000 instances, around 7,000 different shortcuts with about 400,000 instances, and around 51,000 different inline templates with about 3.3 million instances.
