PAN19 Authorship Analysis: Bots and Gender Profiling

    • zenodo.org
    Updated Apr 26, 2020
    Cite
    Francisco Rangel; Paolo Rosso (2020). PAN19 Authorship Analysis: Bots and Gender Profiling [Dataset]. http://doi.org/10.5281/zenodo.3530208
    Dataset updated: Apr 26, 2020
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Francisco Rangel; Paolo Rosso
    Description

    Social media bots pose as humans to influence users for commercial, political, or ideological purposes. For example, bots could artificially inflate the popularity of a product by promoting it and/or writing positive ratings, or undermine the reputation of competing products through negative reviews. The threat is even greater when the purpose is political or ideological (see the Brexit referendum or the US presidential elections). Fearing the effect of this influence, German political parties have rejected the use of bots in their electoral campaigns for the general elections. Furthermore, bots are commonly linked to the spread of fake news. Approaching the identification of bots from an author profiling perspective is therefore highly important for marketing, forensics, and security.

    After addressing several aspects of author profiling in social media from 2013 to 2018 (age and gender, also together with personality; gender and language variety; and gender from a multimodal perspective), this year we aim to investigate whether the author of a Twitter feed is a bot or a human. If the author is human, the task is also to profile the author's gender.

    The uncompressed dataset consists of one folder per language (en, es). Each folder contains:

    • An XML file per author (Twitter user) with 100 tweets. The name of the XML file corresponds to the unique author id.
    • A truth.txt file with the list of authors and the ground truth.
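
The folder layout described above could be read with a short Python sketch like the one below. The description does not specify the XML schema or the truth.txt line format, so two things here are assumptions: that each tweet sits in a `<document>` element, and that truth.txt fields are separated by `:::` with the author id first. The function names (`parse_truth`, `tweets_from_root`, `load_language`) are hypothetical helpers, not part of any official PAN tooling.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: truth.txt fields are separated by ":::" and the author id comes first.
SEP = ":::"


def parse_truth(text, sep=SEP):
    """Map author id -> tuple of ground-truth labels from truth.txt content."""
    truth = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        author_id, *labels = line.split(sep)
        truth[author_id] = tuple(labels)
    return truth


def tweets_from_root(root, tag="document"):
    """Return the text of every element named `tag` (assumed to hold one tweet)."""
    return [(el.text or "").strip() for el in root.iter(tag)]


def load_language(folder):
    """Load one language folder (e.g. 'en') as {author_id: (tweets, labels)}."""
    folder = Path(folder)
    truth = parse_truth((folder / "truth.txt").read_text(encoding="utf-8"))
    data = {}
    for xml_path in sorted(folder.glob("*.xml")):
        author_id = xml_path.stem  # the file name is the unique author id
        tweets = tweets_from_root(ET.parse(xml_path).getroot())
        data[author_id] = (tweets, truth.get(author_id))
    return data
```

Calling `load_language("en")` on the uncompressed English folder would then return one entry per author. If the real files use a different tag name or separator, only the `tag` and `sep` defaults need to change.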