1 dataset found
  1. PAN14 Originality: Source Retrieval

    • zenodo.org
    Updated Apr 2, 2020
    Cite
    Martin Potthast; Matthias Hagen; Anne Beyer; Matthias Busse; Martin Tippmann; Paolo Rosso; Benno Stein (2020). PAN14 Originality: Source Retrieval [Dataset]. http://doi.org/10.5281/zenodo.3716010
    Dataset updated
    Apr 2, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Martin Potthast; Matthias Hagen; Anne Beyer; Matthias Busse; Martin Tippmann; Paolo Rosso; Benno Stein
    Description

    We provide you with a training corpus that consists of suspicious documents. Each suspicious document is about a specific topic and may consist of plagiarized passages obtained from web pages on that topic found in the ClueWeb09 corpus.
