Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Webis-Editorials-16 corpus contains 300 news editorials, selected evenly from three diverse online news portals: Al Jazeera, Fox News, and The Guardian. The aim of the corpus is to study (1) the mining and classification of fine-grained types of argumentative discourse units and (2) the analysis of argumentation strategies pursued in editorials to achieve persuasion. To this end, every unit in each editorial is manually annotated with a type that captures the role the unit plays in the argumentative discourse, such as assumption or statistics. The corpus consists of 14,313 units of six different types, each annotated by three professional annotators from the crowdsourcing platform upwork.com.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Webis EditorialSum Corpus consists of 1330 manually curated extractive summaries for 266 news editorials from three diverse portals: Al Jazeera, The Guardian, and Fox News. Each editorial has 5 summaries, each labeled for overall quality and for fine-grained properties such as thesis-relevance, persuasiveness, reasonableness, and self-containedness.

The files are organized as follows:

corpus.csv - Contains all the editorials and their acquired summaries.
Note: X ranges from 1 to 5, one value per summary.
- article_id : Article ID in the corpus
- title : Title of the editorial
- article_text : Plain text of the editorial
- summary_{X}_text : Plain text of the corresponding summary
- thesis_{X}_text : Plain text of the thesis from the corresponding summary
- lead : top 15% of the editorial's segments
- body : segments between the lead and conclusion sections
- conclusion : bottom 15% of the editorial's segments
- article_segments : Collection of paragraphs, each further divided into a collection of segments containing: { "number": segment order in the editorial, "text": segment text, "label": ADU type }
- summary_{X}_segments : Collection of summary segments containing: { "number": segment order in the editorial, "text": segment text, "adu_label": ADU type from the editorial, "summary_label": can be 'thesis' or 'justification' }

quality-groups.csv - Contains the IDs of the high-quality and low-quality summaries for each quality dimension, per editorial.
For example, in terms of overall quality, article_id 2 has four high-quality summaries (summary_1, summary_2, summary_3, summary_4) and one low-quality summary (summary_5). The corresponding summary texts can be obtained from corpus.csv.
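The column layout above can be illustrated with a minimal Python sketch. It assumes corpus.csv is a standard comma-separated file with a header row that uses the column names listed above. The helper names (read_summaries, split_sections) are hypothetical and not part of the corpus distribution. The rounding used for the 15% split is also an assumption, so the split may not reproduce the corpus's own lead, body, and conclusion columns exactly.

```python
import csv

NUM_SUMMARIES = 5  # X runs from 1 to 5 according to the corpus description


def read_summaries(csv_file):
    """Yield one dict per (editorial, summary) pair from an open corpus.csv.

    Hypothetical helper: assumes a header row with the column names
    article_id, summary_{X}_text and thesis_{X}_text.
    """
    reader = csv.DictReader(csv_file)
    for row in reader:
        for x in range(1, NUM_SUMMARIES + 1):
            yield {
                "article_id": row["article_id"],
                "summary_id": f"summary_{x}",
                "summary_text": row.get(f"summary_{x}_text", ""),
                "thesis_text": row.get(f"thesis_{x}_text", ""),
            }


def split_sections(segments, fraction=0.15):
    """Split an ordered list of segments into lead, body and conclusion.

    Assumption: the section size is round(len(segments) * fraction), with a
    minimum of one segment; the corpus may round differently.
    Intended for editorials with at least three segments.
    """
    k = max(1, round(len(segments) * fraction))
    lead = segments[:k]
    conclusion = segments[-k:]
    body = segments[k:-k]
    return lead, body, conclusion
```

A typical call would be `with open("corpus.csv", newline="", encoding="utf-8") as f: rows = list(read_summaries(f))`, where the file path and encoding are again assumptions about a local copy of the corpus.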