Cheema15 Originality: Text Alignment
data.niaid.nih.gov
Updated Apr 21, 2020
Cite
Najib, Fahad (2020). Cheema15 Originality: Text Alignment [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_3712710
Dataset updated
Apr 21, 2020
Dataset provided by
Husnain Bukhari, Syed
Najib, Fahad
Muhammad Adeel Nawab, Rao
Sittar, Abdul
Arshad Cheema, Waqas
Ahmed, Shakil
Description
We provide a training corpus consisting of pairs of documents, one of which may contain passages of text reused from the other. The reused text has been subjected to various kinds of (automatic) obfuscation to hide the fact that it has been reused. The evaluation corpora include a file named pairs, which lists all pairs of suspicious and source documents.
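The description does not state how the pairs file is laid out. As a rough sketch only, assuming each non-empty line holds a suspicious-document name followed by a source-document name, separated by whitespace, the pairs could be read as follows. The function names parse_pairs and load_pairs are hypothetical and are not part of the dataset.

```python
from pathlib import Path
from typing import List, Tuple


def parse_pairs(text: str) -> List[Tuple[str, str]]:
    """Parse (suspicious, source) document pairs, one pair per line.

    Assumption: two whitespace-separated names per line. The dataset
    description does not confirm this layout.
    """
    pairs = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 fields, got {len(fields)}"
            )
        pairs.append((fields[0], fields[1]))
    return pairs


def load_pairs(path: str) -> List[Tuple[str, str]]:
    """Read a pairs file from disk (the path is supplied by the caller)."""
    return parse_pairs(Path(path).read_text(encoding="utf-8"))
```

Raising on malformed lines, rather than skipping them, makes it obvious early if the real file uses a different layout than the one assumed here.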
