2 datasets found
  1. PAN-PC-09

    • webis.de
    3250083
    Updated 2009
    Cite
    Martin Potthast; Benno Stein; Andreas Eiselt (2009). PAN-PC-09 [Dataset]. http://doi.org/10.5281/zenodo.3250083
    Available download formats: 3250083
    Dataset updated
    2009
    Dataset provided by
    The Web Technology & Information Systems Network
    Bauhaus-Universität Weimar
    Leipzig University
    Authors
    Martin Potthast; Benno Stein; Andreas Eiselt
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This corpus is outdated. Please use its successor PAN-PC-11.

  2. PAN Plagiarism Corpus 2009 (PAN-PC-09)

    • zenodo.org
    • commons.datacite.org
    bin
    Updated Jan 24, 2020
    Cite
    Martin Potthast; Benno Stein; Andreas Eiselt; Alberto Barrón-Cedeño; Paolo Rosso (2020). PAN Plagiarism Corpus 2009 (PAN-PC-09) [Dataset]. http://doi.org/10.5281/zenodo.3250083
    Available download formats: bin
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Martin Potthast; Benno Stein; Andreas Eiselt; Alberto Barrón-Cedeño; Paolo Rosso
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This corpus is outdated. Please use its successor PAN-PC-11: https://doi.org/10.5281/zenodo.3250095

    The PAN plagiarism corpus 2009 (PAN-PC-09) is a corpus for the evaluation of automatic plagiarism detection algorithms. For research purposes the corpus can be used free of charge.

    The PAN-PC-09 contains documents in which artificial plagiarism has been inserted automatically. The plagiarism cases have been constructed using a so-called random plagiarist, a computer program which constructs plagiarism according to a number of random variables. The variables include the percentage of plagiarism in the whole corpus, the percentage of plagiarism per document, the length of a single plagiarized section, and the degree of obfuscation per plagiarized section.
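The description above names the random variables but not how the random plagiarist uses them. The following is a minimal sketch of the general idea, not the program used to build PAN-PC-09. Every name here (`PlagiarismCase`, `obfuscate`, `random_plagiarist`, `corpus_fraction`, `max_fraction`, `max_length`) is hypothetical, word shuffling stands in for whatever obfuscation the real tool applies, and reading "percentage of plagiarism in the whole corpus" as the share of documents that receive any plagiarism is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass
class PlagiarismCase:
    source_offset: int  # word offset of the copied passage in the source
    target_offset: int  # word offset of the inserted passage in the output
    length: int         # passage length in words
    obfuscation: float  # fraction of word positions that were shuffled


def obfuscate(words, degree, rng):
    """Permute a `degree` fraction of word positions (stand-in obfuscation)."""
    original = list(words)
    result = list(words)
    positions = rng.sample(range(len(original)), int(len(original) * degree))
    shuffled = positions[:]
    rng.shuffle(shuffled)
    for src, dst in zip(positions, shuffled):
        result[dst] = original[src]
    return result


def random_plagiarist(source, target, rng, corpus_fraction=0.5,
                      max_fraction=0.5, max_length=50):
    """Insert randomly chosen, randomly obfuscated source passages into target."""
    src_words, tgt_words = source.split(), target.split()
    # Corpus-level variable (assumed): does this document get plagiarism at all?
    if not src_words or rng.random() > corpus_fraction:
        return target, []
    # Per-document variable: words to insert, relative to the target's length.
    budget = int(len(tgt_words) * rng.uniform(0.0, max_fraction))
    insertions = []
    while budget > 0:
        length = min(rng.randint(1, max_length), budget, len(src_words))
        degree = rng.choice([0.0, 0.5, 1.0])  # none, partial, full shuffling
        start = rng.randint(0, len(src_words) - length)
        passage = obfuscate(src_words[start:start + length], degree, rng)
        at = rng.randint(0, len(tgt_words))   # boundary in the original target
        insertions.append((at, passage, start, degree))
        budget -= length
    # Assemble the output left to right so every passage stays contiguous.
    insertions.sort(key=lambda item: item[0])
    out, cases, cursor = [], [], 0
    for at, passage, start, degree in insertions:
        out.extend(tgt_words[cursor:at])
        cases.append(PlagiarismCase(start, len(out), len(passage), degree))
        out.extend(passage)
        cursor = at
    out.extend(tgt_words[cursor:])
    return " ".join(out), cases
```

Returning the list of cases alongside the text mirrors what an evaluation corpus needs: a detector's output can only be scored if the location and length of each inserted passage are recorded as ground truth.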


PAN-PC-09: 72 scholarly articles cite this dataset (Google Scholar)