Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Webis Clickbait Spoiling Corpus 2022
The Webis Clickbait Spoiling Corpus 2022 (Webis-Clickbait-22) contains 5,000 spoiled clickbait posts crawled from Facebook, Reddit, and Twitter.
This corpus supports the task of clickbait spoiling, which deals with generating a short text that satisfies the curiosity induced by a clickbait post.
This dataset contains the clickbait posts, manually cleaned versions of the linked documents, and the spoilers extracted for each clickbait post.
Additionally, the spoilers are categorized into three types: short phrase spoilers, longer passage spoilers, and multiple non-consecutive pieces of text.
The test set of this dataset was used for the SemEval-2023 clickbait spoiling task. To re-execute or adapt the software submissions made for this SemEval task, see the instructions and overview of approaches in TIRA.
Overview
The dataset comes with predefined train/validation/test splits:
The test set was used for the SemEval-2023 clickbait spoiling task. This shared task was organized with TIRA.io and participants submitted Docker software during the task. Please see the instructions in TIRA to re-execute or modify the approaches.
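As a rough illustration of how the three spoiler types might be tallied per split, the following is a minimal sketch. It assumes each split is stored as a JSON Lines file and that each record carries its spoiler type in a field named `tags` with the labels `phrase`, `passage`, and `multi`. The field name, the label strings, and the file name in the usage note are assumptions for illustration, not details stated in this description.

```python
import json
from collections import Counter
from typing import Iterable


def count_spoiler_types(lines: Iterable[str], tag_field: str = "tags") -> Counter:
    """Tally spoiler-type labels across JSON Lines records.

    `tag_field` is an assumed field name; adjust it to the actual schema.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        labels = record.get(tag_field, [])
        if isinstance(labels, str):
            labels = [labels]  # accept a single label as well as a list
        counts.update(labels)
    return counts


# Hypothetical records for illustration only; not taken from the corpus.
sample = [
    json.dumps({"tags": ["phrase"]}),
    json.dumps({"tags": ["passage"]}),
    json.dumps({"tags": ["multi"]}),
    json.dumps({"tags": ["phrase"]}),
]
sample_counts = count_spoiler_types(sample)
```

For a real split, the same function could be applied to an open file handle, for example `count_spoiler_types(open("train.jsonl", encoding="utf-8"))`, where `train.jsonl` is a placeholder file name.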