3 datasets found

PAN-SemEval-Hyperpartisan-News-Detection-19
webis.de
anthology.aicmu.ac.cn
1489920
Updated 2018
Cite
Johannes Kiesel; Martin Potthast; Payam Adineh; Benno Stein (2018). PAN-SemEval-Hyperpartisan-News-Detection-19 [Dataset]. http://doi.org/10.5281/zenodo.1489920
Explore at:
Available download formats: 1489920
Unique identifier
https://doi.org/10.5281/zenodo.1489920
Dataset updated
2018
Dataset provided by
The Web Technology & Information Systems Network
GESIS - Leibniz Institute for the Social Sciences
Bauhaus-Universität Weimar
University of Kassel, hessian.AI, and ScaDS.AI
Authors
Johannes Kiesel; Martin Potthast; Payam Adineh; Benno Stein
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The PAN-SemEval-Hyperpartisan-News-Detection-19 dataset was developed in cooperation with Factmata for the PAN @ SemEval 2019 task on hyperpartisan news detection. More information on the task can be found on the task's web site.
Hyperpartisan News Detection Dataset
paperswithcode.com
Cite
Hyperpartisan News Detection Dataset [Dataset]. https://paperswithcode.com/dataset/hyperpartisan
Description
Hyperpartisan News Detection is a dataset created for PAN @ SemEval 2019 Task 4. The task: given the text of a news article, decide whether it follows a hyperpartisan argumentation, i.e., whether it exhibits blind, prejudiced, or unreasoning allegiance to one party, faction, cause, or person.

There are two parts:

byarticle: Labeled through crowdsourcing on an article basis. The data contains only articles for which a consensus among the crowdsourcing workers existed.
bypublisher: Labeled by the overall bias of the publisher as provided by BuzzFeed journalists or MediaBiasFactCheck.com.
Data from: Data for PAN at SemEval 2019 Task 4: Hyperpartisan News Detection...
zenodo.org
bin, zip
Updated Dec 13, 2021
Cite
Johannes Kiesel; Maria Mestre; Rishabh Shukla; Emmanuel Vincent; David Corney; Payam Adineh; Benno Stein; Martin Potthast (2021). Data for PAN at SemEval 2019 Task 4: Hyperpartisan News Detection [Dataset]. http://doi.org/10.5281/zenodo.1489920
Explore at:
Available download formats: zip, bin
Unique identifier
https://doi.org/10.5281/zenodo.1489920
Dataset updated
Dec 13, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Johannes Kiesel; Maria Mestre; Rishabh Shukla; Emmanuel Vincent; David Corney; Payam Adineh; Benno Stein; Martin Potthast
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Training and validation data for the PAN @ SemEval 2019 Task 4: Hyperpartisan News Detection.

The data is split into multiple files. The articles are contained in the files with names starting with "articles-" (which validate against the XML schema article.xsd). The ground-truth information is contained in the files with names starting with "ground-truth-" (which validate against the XML schema ground-truth.xsd).
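As a rough illustration of how an "articles-" file and its matching "ground-truth-" file might be read together, the following sketch parses two miniature, made-up XML documents and joins them on the article ID. The attribute names `id` and `hyperpartisan`, the root element name, and the sample content are assumptions for illustration only; the authoritative structure is defined by article.xsd and ground-truth.xsd.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature files. The attribute names "id" and "hyperpartisan"
# and the <articles> root are assumptions, not taken from the description.
ARTICLES_XML = """<articles>
  <article id="0000001" title="Example A"><p>First text.</p></article>
  <article id="0000002" title="Example B"><p>Second text.</p></article>
</articles>"""

GROUND_TRUTH_XML = """<articles>
  <article id="0000001" hyperpartisan="true"/>
  <article id="0000002" hyperpartisan="false"/>
</articles>"""


def read_labels(xml_text):
    """Map article id -> boolean label from a ground-truth document."""
    root = ET.fromstring(xml_text)
    return {a.get("id"): a.get("hyperpartisan") == "true"
            for a in root.iter("article")}


def read_articles(xml_text):
    """Map article id -> plain text, joining all nested text nodes."""
    root = ET.fromstring(xml_text)
    return {a.get("id"): " ".join("".join(a.itertext()).split())
            for a in root.iter("article")}


def join_articles_with_labels(articles_xml, ground_truth_xml):
    """Return (id, text, label) triples for articles that have a label."""
    labels = read_labels(ground_truth_xml)
    texts = read_articles(articles_xml)
    return [(aid, texts[aid], labels[aid])
            for aid in sorted(texts) if aid in labels]


rows = join_articles_with_labels(ARTICLES_XML, GROUND_TRUTH_XML)
```

For the large "bypublisher" files, a streaming parser such as `xml.etree.ElementTree.iterparse` would likely be preferable to loading a whole document into memory.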

The first part of the data (filename contains "bypublisher") is labeled by the overall bias of the publisher as provided by BuzzFeed journalists or MediaBiasFactCheck.com. It contains a total of 750,000 articles, half of which (375,000) are hyperpartisan and half of which are not. Half of the articles that are hyperpartisan (187,500) are on the left side of the political spectrum, half are on the right side. This data is split into a training set (80%, 600,000 articles) and a validation set (20%, 150,000 articles), where no publisher that occurs in the training set also occurs in the validation set. Similarly, none of the publishers in those sets will occur in the test set.
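The publisher-disjoint property described above can be sketched by splitting on publishers instead of on individual articles. This is a minimal illustration on made-up data, not the procedure the dataset authors used; the function name `split_by_publisher`, the tuple layout, and the toy publishers are all hypothetical.

```python
import random


def split_by_publisher(articles, validation_fraction=0.2, seed=0):
    """Split (article_id, publisher) pairs so no publisher spans both sets.

    The fraction is applied to publishers, so the article-level ratio is
    only approximate when publishers differ in size.
    """
    publishers = sorted({publisher for _, publisher in articles})
    rng = random.Random(seed)
    rng.shuffle(publishers)
    n_validation = max(1, round(len(publishers) * validation_fraction))
    validation_publishers = set(publishers[:n_validation])
    training = [a for a in articles if a[1] not in validation_publishers]
    validation = [a for a in articles if a[1] in validation_publishers]
    return training, validation


# Hypothetical toy data: 10 publishers with 5 articles each.
toy_articles = [(f"{p}-{i}", f"publisher{p}")
                for p in range(10) for i in range(5)]
training, validation = split_by_publisher(toy_articles)

# No publisher appears on both sides of the split.
overlap = {p for _, p in training} & {p for _, p in validation}
```

Splitting at the publisher level keeps a classifier from scoring well merely by recognising a publisher's style, which is presumably why the description stresses that publishers do not overlap between sets.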

The second part of the data (filename contains "byarticle") is labeled through crowdsourcing on an article basis. The data contains only articles for which a consensus among the crowdsourcing workers existed. It contains a total of 645 articles. Of these, 238 (37%) are hyperpartisan and 407 (63%) are not. We will use a similar (but balanced!) test set. Again, none of the publishers in this set will occur in the test set.

Note that article IDs are only unique within the parts.
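Because an ID can repeat across the two parts, merging them safely requires a key that includes the part name. A minimal sketch, assuming labels have already been loaded into one dictionary per part (the function name `merge_parts` and the sample IDs are hypothetical):

```python
def merge_parts(parts):
    """Merge {part_name: {article_id: label}} into a single dictionary
    keyed by (part_name, article_id), so equal IDs do not collide."""
    merged = {}
    for part_name, labels in parts.items():
        for article_id, label in labels.items():
            merged[(part_name, article_id)] = label
    return merged


# Hypothetical labels: the ID "0000001" occurs in both parts.
parts = {
    "bypublisher": {"0000001": True, "0000002": False},
    "byarticle": {"0000001": False},
}
merged = merge_parts(parts)
```

Keying on the article ID alone would silently overwrite one of the two "0000001" entries; the composite key keeps all three.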

The collection (including labels) is licensed under a Creative Commons Attribution 4.0 International License.

Acknowledgements: Thanks to Jonathan Miller for his assistance in cleaning the data!
PAN-SemEval-Hyperpartisan-News-Detection-19

7 scholarly articles cite this dataset (View in Google Scholar)
