2 datasets found
  1. Webis-Editorials-16

    • webis.de
    • zenodo.org
    Updated 2016
    Cite
    Steve Göring; Henning Wachsmuth; Johannes Kiesel; Matthias Hagen; Benno Stein (2016). Webis-Editorials-16 [Dataset]. http://doi.org/10.5281/zenodo.3254405
    Available download formats
    Dataset updated
    2016
    Dataset provided by
    Bauhaus-Universität Weimar
    Paderborn University
    Friedrich Schiller University Jena
    The Web Technology & Information Systems Network
    Authors
    Steve Göring; Henning Wachsmuth; Johannes Kiesel; Matthias Hagen; Benno Stein
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Webis-Editorials-16 corpus comprises 300 news editorials evenly selected from three diverse online news portals: Al Jazeera, Fox News, and The Guardian. The corpus was built to study (1) the mining and classification of fine-grained types of argumentative discourse units and (2) the analysis of argumentation strategies pursued in editorials to achieve persuasion. To this end, each editorial carries manual type annotations for all units, capturing the role each unit plays in the argumentative discourse, such as assumption or statistics. The corpus consists of 14,313 units of six different types, each annotated by three professional annotators from the crowdsourcing platform upwork.com.

  2. Webis EditorialSum Corpus 2020

    • live.european-language-grid.eu
    • zenodo.org
    csv
    Updated Oct 19, 2020
    Cite
    (2020). Webis EditorialSum Corpus 2020 [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7658
    Available download formats: csv
    Dataset updated
    Oct 19, 2020
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Webis EditorialSum Corpus consists of 1,330 manually curated extractive summaries for 266 news editorials spanning three diverse portals: Al Jazeera, The Guardian, and Fox News. Each editorial has 5 summaries, each labeled for overall quality and for fine-grained properties such as thesis-relevance, persuasiveness, reasonableness, and self-containedness.

    The files are organized as follows:

    corpus.csv - Contains all the editorials and their acquired summaries. Note: X = [1,5] for the five summaries.
    - article_id : Article ID in the corpus
    - title : Title of the editorial
    - article_text : Plain text of the editorial
    - summary_{X}_text : Plain text of the corresponding summary
    - thesis_{X}_text : Plain text of the thesis from the corresponding summary
    - lead : top 15% of the editorial's segments
    - body : segments between the lead and conclusion sections
    - conclusion : bottom 15% of the editorial's segments
    - article_segments : Collection of paragraphs, each further divided into a collection of segments containing: { "number": segment order in the editorial, "text": segment text, "label": ADU type }
    - summary_{X}_segments : Collection of summary segments containing: { "number": segment order in the editorial, "text": segment text, "adu_label": ADU type from the editorial, "summary_label": either 'thesis' or 'justification' }

    quality-groups.csv - Contains the IDs of the high- (and low-) quality summaries for each quality dimension per editorial. For example, article_id 2 has four high-quality summaries (summary_1, summary_2, summary_3, summary_4) and one low-quality summary (summary_5) in terms of overall quality. The corresponding summary texts can be obtained from corpus.csv.
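The column layout above can be read with the standard library alone. The sketch below parses a tiny synthetic row shaped like the description (the exact column names come from the description; the assumption that the *_segments columns hold JSON arrays, and all field values, are hypothetical):

```python
# Sketch of reading corpus.csv-style rows. Column names follow the corpus
# description; the JSON encoding of *_segments and the sample values are
# assumptions for illustration only.
import csv
import io
import json

# A synthetic stand-in for one corpus.csv row (hypothetical data).
sample = io.StringIO(
    'article_id,title,article_text,summary_1_text,summary_1_segments\n'
    '"2","Sample title","Full editorial text...","Summary text...",'
    '"[{""number"": 1, ""text"": ""Segment text"", '
    '""adu_label"": ""assumption"", ""summary_label"": ""thesis""}]"\n'
)

row = next(csv.DictReader(sample))
# Each summary segment carries its order, text, ADU label, and summary label.
segments = json.loads(row["summary_1_segments"])
# Pull out the thesis segments of summary 1.
theses = [s["text"] for s in segments if s["summary_label"] == "thesis"]
print(row["article_id"], theses)
```

On the real corpus one would open corpus.csv with `csv.DictReader` the same way and loop over `summary_1_segments` through `summary_5_segments` per row.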


Webis-Editorials-16: 11 scholarly articles cite this dataset (View in Google Scholar).
