100+ datasets found
  1. Linux Dataset

    • paperswithcode.com
    Updated Feb 7, 2024
    Cite
    Xiaoli Wang; Xiaofeng Ding; Anthony K. H. Tung; Shanshan Ying; Hai Jin (2024). Linux Dataset [Dataset]. https://paperswithcode.com/dataset/linux
    Explore at:
    Dataset updated
    Feb 7, 2024
    Authors
    Xiaoli Wang; Xiaofeng Ding; Anthony K. H. Tung; Shanshan Ying; Hai Jin
    Description

    The LINUX dataset consists of 48,747 Program Dependence Graphs (PDGs) generated from the Linux kernel. Each graph represents a function, where a node represents one statement and an edge represents the dependency between two statements.
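
    The graph structure described above can be sketched as follows. This is only an illustration of the node/edge idea, not the dataset's actual file format; the class name, the statement texts, and the edge direction (dependent statement pointing to the statement it depends on) are all assumptions.

```python
from collections import defaultdict


class ProgramDependenceGraph:
    """One graph per function: nodes are statements, edges are dependencies."""

    def __init__(self, function_name):
        self.function_name = function_name
        self.statements = {}                # node id -> statement text
        self.depends_on = defaultdict(set)  # node id -> ids it depends on

    def add_statement(self, node_id, text):
        self.statements[node_id] = text

    def add_dependency(self, dependent, dependency):
        # Both endpoints must already be statements of this function.
        if dependent not in self.statements or dependency not in self.statements:
            raise KeyError("unknown statement id")
        self.depends_on[dependent].add(dependency)

    def edge_count(self):
        return sum(len(deps) for deps in self.depends_on.values())


# Hypothetical three-statement function, made up for illustration.
pdg = ProgramDependenceGraph("example_fn")
pdg.add_statement(0, "x = read();")
pdg.add_statement(1, "if (x > 0)")
pdg.add_statement(2, "y = x + 1;")
pdg.add_dependency(1, 0)  # the condition uses x
pdg.add_dependency(2, 0)  # the assignment uses x
pdg.add_dependency(2, 1)  # the assignment is guarded by the condition
```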

  2. Linux Statistics 2024 By Market Share, Usage Data, Number Of Users and Facts...

    • enterpriseappstoday.com
    Updated Mar 5, 2024
    Cite
    EnterpriseAppsToday (2024). Linux Statistics 2024 By Market Share, Usage Data, Number Of Users and Facts [Dataset]. https://www.enterpriseappstoday.com/stats/linux-statistics.html
    Explore at:
    Dataset updated
    Mar 5, 2024
    Dataset authored and provided by
    EnterpriseAppsToday
    License

    https://www.enterpriseappstoday.com/privacy-policy

    Time period covered
    2022 - 2032
    Area covered
    Global
    Description

    Editor’s Choice

    • 47% of professional developers prefer Linux operating systems. (Statista)
    • Linux runs 39.2% of websites whose operating system is known. (W3Techs)
    • 85% of smartphones are powered by Linux. (Hayden James)
    • Its desktop market share is 2.09%, making it third among Statista's top ten most-used desktop operating systems.
    • By 2027, the global Linux market will reach $15.64 billion. (Fortune Business Insights)
    • Linux is the operating system of all the world's fastest supercomputers. (Blackdown)
    • 96.3% of the top 1,000,000 web servers use Linux. (ZDNet)
    • Over 600 active Linux distros are available today. (Tecmint)
    • In 2022, Linux claimed a 34% market share in the container orchestration market, reflecting its versatility.
    • Linux holds approximately 3% of the desktop operating system market share according to StatCounter.
    • The Linux kernel, the core of the operating system, consists of more than 80 million lines of code, showcasing its complexity and robustness.
    • Linux is the preferred choice for web hosting, powering more than 95% of the top 1 million websites.
    • It enjoys ubiquity in the supercomputing realm, running on 100% of the world's supercomputers.
    • In 2022, Linux server revenue reached an impressive USD 13.4 billion, indicating its economic significance.
    • Linux offers a diverse landscape with over 500 different distributions catering to various user needs.
    • The cloud computing landscape heavily relies on Linux, with over 90% of public cloud workloads being Linux-based.
    • Despite its desktop market share, Linux accounts for approximately 36.7% of all operating systems on desktop computers.
    • Android, built on the Linux kernel, dominates the global smartphone market with a staggering 70% market share.
    • The Linux server market is expected to continue growing with an annual growth rate of 8.6%.
    • Linux's prowess extends to supercomputing, as over 92% of the world's top 500 fastest supercomputers run on Linux.
    • The global community of active Linux users exceeds a remarkable 100 million.
    • Android, a Linux-based operating system, sees over 1.5 billion smartphone shipments annually.
    • The embedded Linux market boasts an estimated size of USD 5.3 billion, signifying its role in various industries.
    • Linux is the platform of choice for 68% of IoT devices and systems, indicating its reliability.
    • With over 25,000 contributors, the Linux kernel enjoys robust development and continuous improvement.
    • The enterprise Linux market is projected to reach USD 14.4 billion by 2025, driven by business adoption.
    • Linux's reputation for security is strong, being considered 10 times more secure than some other operating systems.
    • Linux plays a critical role in the world's stock exchanges, powering more than 75% of them.
    • Its open-source nature leads to an average of 10,000 lines of code added daily, demonstrating its active development.
    • Linux remains the primary operating system for 70% of web servers globally.
    • Job postings related to Linux have surged by 31% in the last year, highlighting the high demand for Linux professionals.

    Key Linux Statistics

    • Market Share and Usage: Linux holds a 2.09% share of the desktop operating system market, making it the third most popular after Windows and macOS. In the server domain, Linux accounted for 13.6% globally in 2019, showcasing significant usage in infrastructure.
    • Market Size: The global Linux market is expected to reach $15.64 billion by 2027, with a compound annual growth rate (CAGR) of 19.2%. This growth is driven by the increasing number of servers, rising internet penetration rates, and the expansion of data centers.
    • Cloud Infrastructure: Linux dominates cloud computing, powering over 90% of cloud infrastructure. It is the backbone for major public cloud providers like AWS, Google Cloud Platform, and Microsoft Azure due to its scalability, security, and cost-effectiveness.
    • Mobile Devices: Linux significantly impacts the mobile device market, with Android, the most widely used mobile operating system, based on the Linux kernel. Android's open-source nature allows for extensive customization.
    • Security: Known for its robust security features, Linux is a popular choice for organizations and individuals concerned about data protection. Its kernel includes various security features like access control lists (ACLs), mandatory access control (MAC), and secure computing mode (seccomp).
    • IoT and Smart Devices: In the Internet of Things (IoT) ecosystem, Linux powers a wide range of devices, from smart home devices to industrial IoT applications, thanks to its stability and flexibility.
    • Developer Preference: Linux is preferred by 47% of professional developers, highlighting its flexibility and the availability of compilers or interpreters not found in Windows.
    • Web Servers: Linux powers 96.3% of the top one million web servers, indicating its widespread adoption in internet infrastructure.
    • Linux Distributions: Over 600 active Linux distributions are available, with Ubuntu being the most popular, accounting for 33.9% of the Linux market. This is followed by Debian and CentOS.
    • Future Prospects: The future of Linux includes promising growth in edge computing and the automotive industry, where its stability, security, and open-source nature are highly valued.

  3. The Linux Command Line for Beginners - Dataset - STC Training Center

    • training.stcenter.net
    Updated Jan 29, 2022
    Cite
    (2022). The Linux Command Line for Beginners - Dataset - STC Training Center [Dataset]. https://training.stcenter.net/dataset/the-linux-command-line-for-beginners
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    This guide is an overview of Linux commands for making directories, moving files, listing files (ls), changing directories, and so forth. It is publicly available through the Ubuntu website and is considered general training material.

  4. linux-man-pages-tldr-summarized

    • huggingface.co
    Updated Sep 19, 2023
    Cite
    Tamas Kiss (2023). linux-man-pages-tldr-summarized [Dataset]. https://huggingface.co/datasets/tmskss/linux-man-pages-tldr-summarized
    Explore at:
    Dataset updated
    Sep 19, 2023
    Authors
    Tamas Kiss
    Description

    Dataset Card for linux-man-pages-tldr-summarized

    Dataset Summary

    This dataset contains Linux man pages downloaded from man7, with a prefix: 'summarize: ', and the corresponding summarization downloaded from TLDR-pages.

    Supported Tasks

    This dataset should be used to fine-tune language models for summarization tasks.

  5. Linux Hint BD Events Database

    • linuxhintbd.blogspot.com
    csv, xml
    Updated Nov 3, 2022
    Cite
    LLC/Linux/Hint/BD > National Centers for Environmental Information, Linux Hint BD, NOAA, B.D. Department of IT (2022). Linux Hint BD Events Database [Dataset]. https://linuxhintbd.blogspot.com/
    Explore at:
    xml, csv (available download formats)
    Dataset updated
    Nov 3, 2022
    Dataset authored and provided by
    LLC/Linux/Hint/BD > National Centers for Environmental Information, Linux Hint BD, NOAA, B.D. Department of IT
    License

    https://www.linuxhintbd.xyz

    Time period covered
    Jan 1, 1950 - Dec 18, 2013
    Area covered
    Pacific Ocean, North Pacific Ocean
    Dataset funded by
    Linux Server Support & IT Support Service
    Description

    Linux Hint BD Storm Data is provided by the National Weather Service (NWS) and contains statistics on...

  6. Linux Kernel Git Revision History

    • kaggle.com
    zip
    Updated Mar 11, 2017
    Cite
    Philipp Schmidt (2017). Linux Kernel Git Revision History [Dataset]. https://www.kaggle.com/datasets/philschmidt/linux-kernel-git-revision-history
    Explore at:
    zip, 45917448 bytes (available download formats)
    Dataset updated
    Mar 11, 2017
    Authors
    Philipp Schmidt
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains commits with detailed information about changed files from about 12 years of the Linux kernel master branch. It contains about 600,000 (filtered) commits, which break down to about 1.4 million file change records.

    Each row represents a changed file in a specific commit, with annotated deletions and additions to that file, as well as the filename and the subject of the commit. I also included anonymized information about the author of each changed file, as well as the time of commit and the timezone of the author.

    The columns in detail:

    • author_timestamp: UNIX timestamp of when the commit happened
    • commit_hash: SHA-1 hash of the commit
    • commit_utc_offset_hours: Extracted UTC offset in hours from commit time
    • filename: The filename that was changed in the commit
    • n_additions: Number of added lines
    • n_deletions: Number of deleted lines
    • subject: Subject of commit
    • author_id: Anonymized author ID.
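
    As a rough illustration of how these columns could be used, the sketch below sums added and deleted lines per file and counts distinct commits. The sample rows, the column order, and the comma delimiter are made up for the example; only the column names come from the list above.

```python
import csv
import io
from collections import Counter

# Made-up sample rows using the documented column names.
SAMPLE_CSV = """author_timestamp,commit_hash,commit_utc_offset_hours,filename,n_additions,n_deletions,subject,author_id
1489190400,aaa111,1,kernel/fork.c,10,2,fork: example change,1
1489194000,bbb222,-8,kernel/fork.c,3,3,mm: another example,2
1489194000,bbb222,-8,mm/slab.c,1,0,mm: another example,2
"""


def churn_per_file(rows):
    """Sum added plus deleted lines for every filename."""
    churn = Counter()
    for row in rows:
        churn[row["filename"]] += int(row["n_additions"]) + int(row["n_deletions"])
    return churn


def count_commits(rows):
    """Each row is one changed file, so commits are counted by distinct hash."""
    return len({row["commit_hash"] for row in rows})


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
churn = churn_per_file(rows)
n_commits = count_commits(rows)
```

    Because one commit can touch several files, counting rows would overstate the number of commits; grouping by commit_hash avoids that.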

    I'm sure with this dataset nice visualizations can be created, let's see what we can come up with!

    For everybody interested in how the dataset was created, I've set up a GitHub repo that contains all the required steps to reproduce it here.

    If you have any questions, feel free to contact me via PM or discussions here.

  7. Linux Kernel binary size

    • zenodo.org
    csv, json
    Updated Jun 14, 2021
    Cite
    Hugo MARTIN; Mathieu ACHER (2021). Linux Kernel binary size [Dataset]. http://doi.org/10.5281/zenodo.4943884
    Explore at:
    json, csv (available download formats)
    Dataset updated
    Jun 14, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Hugo MARTIN; Mathieu ACHER
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset containing measurements of Linux kernel binary size after compilation. The reported size, in the column "perf", is the size in bytes of the vmlinux file. It also contains a column "active_options" reporting the number of activated options (set to "y"). All other columns, listed in the file "Linux_options.json", are Linux kernel options. The sampling was done using randconfig. The version of Linux used is 4.13.3.

    Not all available options are present. First, the dataset only contains options for the x86, 64-bit version. Second, all non-tristate options have been ignored. Finally, options that take only a single value across the whole dataset, due to insufficient variability in the sampling, are ignored. All options are encoded as 0 for the "n" and "m" values, and 1 for "y".
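
    The encoding and filtering rules above might look roughly like this in code. It is a sketch, not the authors' preprocessing script: the option names are hypothetical, and it assumes that "a single value" is judged after the 0/1 encoding, which the description does not state.

```python
def encode_configs(configs):
    """Encode tristate values ("y" -> 1, "n" and "m" -> 0) and drop constant options.

    `configs` is a list of dicts mapping option name -> "y", "m" or "n".
    Assumption: an option is dropped when its encoded value never varies.
    """
    encoded = [
        {name: (1 if value == "y" else 0) for name, value in config.items()}
        for config in configs
    ]
    varying = [
        name for name in encoded[0]
        if len({row[name] for row in encoded}) > 1
    ]
    return [{name: row[name] for name in varying} for row in encoded]


# Hypothetical option names, for illustration only.
sample = [
    {"OPT_A": "y", "OPT_B": "n", "OPT_C": "m"},
    {"OPT_A": "n", "OPT_B": "m", "OPT_C": "m"},
]
result = encode_configs(sample)
```

    In this sample, OPT_B and OPT_C both collapse to a constant 0 once "n" and "m" are merged, so only OPT_A survives.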

    In Python, importing the dataset with pandas assigns int64 to all columns, which consumes a great deal of memory (~50 GB). The following imports it using less than 1 GB of memory by setting the option columns to int8.

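    import json

    import numpy
    import pandas as pd

    # Read the list of option column names shipped with the dataset
    with open("Linux_options.json", "r") as f:
        linux_options = json.load(f)

    # Load the CSV with option columns as int8 to save a lot of memory
    # (the original snippet used a bare `return`, which is invalid outside a function)
    df = pd.read_csv("Linux.csv", dtype={name: numpy.int8 for name in linux_options})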

  8. Linux

    • workwithdata.com
    Updated Jan 10, 2022
    Cite
    Work With Data (2022). Linux [Dataset]. https://www.workwithdata.com/book/Linux_67355
    Explore at:
    Dataset updated
    Jan 10, 2022
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Explore Linux through unique data from multiple sources: key facts, real-time news, interactive charts, detailed maps & open datasets

  9. Build and measurements of Linux kernel configurations across different...

    • zenodo.org
    bin
    Updated Dec 14, 2022
    Cite
    Mathieu Acher; Juliana Alves Pereira; Hugo Martin; Luc Lesoil; Jean-Marc Jézéquel; Djamel Eddine Khelladi (2022). Build and measurements of Linux kernel configurations across different versions [Dataset]. http://doi.org/10.5281/zenodo.7433623
    Explore at:
    bin (available download formats)
    Dataset updated
    Dec 14, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mathieu Acher; Juliana Alves Pereira; Hugo Martin; Luc Lesoil; Jean-Marc Jézéquel; Djamel Eddine Khelladi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    With large scale and complex configurable systems, it is hard for users to choose the right combination of options (i.e., configurations) in order to obtain the wanted trade-off between functionality and performance goals such as speed or size. Machine learning can help in relating these goals to the configurable system options, and thus, predict the effect of options on the outcome, typically after a costly training step. However, many configurable systems evolve at such a rapid pace that it is impractical to retrain a new model from scratch for each new version. Taking the extreme case of the Linux kernel with its ≈ 14,500 configuration options, we investigate how binary size predictions of kernel size degrade over successive versions (and how transfer learning can be adapted and applied to mitigate this degradation).

    We used, and are sharing, a large and unique dataset consisting of the binary sizes (compressed and non-compressed) of thousands of configurations for different versions of the kernel, spanning three years (4.13, 4.15, 4.20, 5.0, 5.4, 5.7, and 5.8). Overall, it covers around 200K configurations over 10K+ options/features and 6 versions.

    This dataset has been used in the IEEE Transactions on Software Engineering (TSE) article "Transfer Learning Across Variants and Versions: The Case of Linux Kernel Size" (preprint: https://hal.inria.fr/hal-03358817)

  10. Linux-Logs

    • kaggle.com
    Updated Feb 23, 2023
    Cite
    (2023). Linux-Logs [Dataset]. https://www.kaggle.com/datasets/ggsri123/linux-logs
    Explore at:
    Dataset updated
    Feb 23, 2023
    Description

    Linux Logs

    About the Dataset

    This data contains 2k log lines from the Linux dataset in the LogPAI GitHub repository. The first file contains just the log lines. The second contains the log lines with their categorized fields: Month, Date, Time, Level, Component, PID, Content, EventID, and EventTemplate.

    Interesting Task Ideas

    1. Understanding the frequency of the different event types (EventID) that occur in the log set.
    2. Identifying anomalies in the logs, if any exist.
    3. Named entity recognition: identifying the different fields of the log set from the set-aside data.
    4. Multiclass classification: identifying which event type (EventID) a log line belongs to.
    5. Giving the variable parts (<*>) a name and adding them to the entity recognition task. [Boss level!]
    

    Point 5 explanation: In the 3rd file named Linux_2k.log_templates.csv, for each of the event types (given by EventIDs) there is a template. The template consists of a variable portion (given by <*>) and a constant portion (the other words in the template). The value of this variable part can be found by comparing the template against the log line containing this template. A name could be assigned to the variable part and be accounted for named entity recognition. Keep in mind the frequency of a variable part might be limited.
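
    One possible way to carry out the comparison described in point 5 is to turn each template into a regular expression in which every <*> becomes a capture group. The sketch below assumes this regex approach (it is not taken from the dataset or from LogPAI), and the template and log line are invented for illustration.

```python
import re


def template_to_regex(template):
    """Escape the constant parts and turn each <*> into a capture group."""
    constant_parts = [re.escape(part) for part in template.split("<*>")]
    return re.compile("^" + "(.*?)".join(constant_parts) + "$")


def extract_variables(template, content):
    """Return the values filling each <*>, or None if the line does not match."""
    match = template_to_regex(template).match(content)
    return list(match.groups()) if match else None


# Invented template and log content, for illustration only.
template = "session opened for user <*> by <*>"
content = "session opened for user root by (uid=0)"
values = extract_variables(template, content)
```

    The extracted values could then be labelled (for example as a user name) and fed into the entity recognition task. Lines that do not fit the template yield None rather than a partial match.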

    Note: An important idea to have in mind is that one will have to focus on the syntax more than the semantics of a log line.

    Have fun understanding how to apply NLP concepts to Log Datasets! 😀

    Check out my other Datasets here

    MIT License
    
    Copyright (c) 2018 LogPAI
    
    Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
    
    The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
    
  11. stackoverflow_linux

    • huggingface.co
    Updated Oct 6, 2023
    Cite
    Konrad Szafer (2023). stackoverflow_linux [Dataset]. https://huggingface.co/datasets/KonradSzafer/stackoverflow_linux
    Explore at:
    Dataset updated
    Oct 6, 2023
    Authors
    Konrad Szafer
    Description

    Dataset Card for "stackoverflow_linux"

    Dataset information:

    • Source: Stack Overflow
    • Category: Linux
    • Number of samples: 300
    • Train/Test split: 270/30
    • Quality: Data come from the top 1k most upvoted questions

    Additional Information

    License

    All Stack Overflow user contributions are licensed under CC-BY-SA 3.0 with attribution required. More Information needed

  12. Data from: A Study of Feature Scattering in the Linux Kernel

    • ieee-dataport.org
    Updated Dec 12, 2018
    Cite
    Leonardo Passos (2018). A Study of Feature Scattering in the Linux Kernel [Dataset]. http://doi.org/10.21227/aswj-q655
    Explore at:
    Dataset updated
    Dec 12, 2018
    Dataset provided by
    Institute of Electrical and Electronics Engineers (http://www.ieee.ro/)
    Authors
    Leonardo Passos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Feature code is often scattered across a software system. Scattering is not necessarily bad if used with care, as witnessed by systems with highly scattered features that evolved successfully. Feature scattering, often realized with a pre-processor, circumvents limitations of programming languages and software architectures. Unfortunately, little is known about the principles governing scattering in large and long-living software systems. We present a longitudinal study of feature scattering in the Linux kernel, complemented by a survey with 74, and interviews with nine, Linux kernel developers. We analyzed almost eight years of the kernel's history, focusing on its largest subsystem: device drivers. We learned that the ratio of scattered features remained nearly constant and that most features were introduced without scattering. Yet, scattering easily crosses subsystem boundaries, and highly scattered outliers exist. Scattering often addresses a performance-maintenance tradeoff (alleviating complicated APIs) and hardware design limitations, and avoids code duplication. While developers do not consciously enforce scattering limits, they actually improve the system design and refactor code, thereby mitigating pre-processor idiosyncrasies or reducing its use.

  13. Linux Kernel vulnerability dataset

    • figshare.com
    application/gzip
    Updated Feb 12, 2019
    Cite
    Maciek Makowski (2019). Linux Kernel vulnerability dataset [Dataset]. http://doi.org/10.6084/m9.figshare.7710176.v1
    Explore at:
    application/gzip (available download formats)
    Dataset updated
    Feb 12, 2019
    Dataset provided by
    figshare
    Authors
    Maciek Makowski
    License

    https://www.gnu.org/copyleft/gpl.html

    Description

    File versions from the Linux kernel, with vulnerability labels derived from the CVE database. Based on the work by Jimenez et al. [1].

    [1] Jimenez, Matthieu, Mike Papadakis, and Yves Le Traon. "Vulnerability prediction models: A case study on the linux kernel." Source Code Analysis and Manipulation (SCAM), 2016 IEEE 16th International Working Conference on. IEEE, 2016.

  14. Data from: A practical guide to Linux

    • workwithdata.com
    Updated May 18, 2023
    Cite
    Work With Data (2023). A practical guide to Linux [Dataset]. https://www.workwithdata.com/book/A%20practical%20guide%20to%20Linux_11269
    Explore at:
    Dataset updated
    May 18, 2023
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Explore A practical guide to Linux through unique data from multiple sources: key facts, real-time news, interactive charts, detailed maps & open datasets

  15. Games for Linux

    • kaggle.com
    Updated Nov 10, 2022
    Cite
    The Devastator (2022). Games for Linux [Dataset]. https://www.kaggle.com/datasets/thedevastator/list-of-fun-games-to-play-on-linux
    Explore at:
    Dataset updated
    Nov 10, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Games for Linux

    A Dataset of games released for the Linux OS

    About this dataset

    This dataset includes a list of games that have been released on the Linux operating system. Games that are natively playable on Linux are included in this list.

    With so many amazing games available to play on Linux, it can be hard to decide which ones to try first. This comprehensive list includes something for everyone, from fast-paced action games to relaxing puzzle games and everything in between. With such a wide variety of genres represented, there's sure to be something here that appeals to you.

    So what are you waiting for? Give one (or more) of these Linux games a try today!

    How to use the dataset

    This dataset can be used to find games that have been released on the Linux operating system. Games that are natively playable on Linux are included in this list. The dataset includes the name of the game, the developer, the publisher, the genres, the operating systems, the date released, and the Metacritic score.

    Research Ideas

    • Checking for release dates of Linux games.
    • Finding the Metacritic score for a particular game.
    • Searching for a specific game by name or genre.

    Acknowledgements

    This dataset was compiled by scraping Wikipedia.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication. No Copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    Files: df_16.csv, df_26.csv, df_20.csv, df_18.csv, df_25.csv, df_27.csv, df_11.csv, df_4.csv, df_21.csv, df_17.csv

    | Column name       | Description                                               |
    |:------------------|:----------------------------------------------------------|
    | Name              | The name of the game. (String)                            |
    | Developer         | The game's developer. (String)                            |
    | Publisher         | The game's publisher. (String)                            |
    | Genres            | The genres the game belongs to. (String)                  |
    | Operating Systems | The operating systems the game can be played on. (String) |
    | Date Released     | The date the game was released. (Date)                    |
    | Metacritic        | The game's Metacritic score. (Integer)                    |

    File: df_1.csv

    | Column name | Description                |
    |:------------|:---------------------------|
    | 0           | Name of the game. (String) |

    File: df_31.csv

    File: df_24.csv | Column name | Description

  16. Data from: Linux and the Unix philosophy

    • workwithdata.com
    Updated Jan 10, 2022
    Work With Data (2022). Linux and the Unix philosophy [Dataset]. https://www.workwithdata.com/book/Linux%20and%20the%20Unix%20philosophy_132181
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Explore Linux and the Unix philosophy through unique data from multiple sources: key facts, real-time news, interactive charts, detailed maps, and open datasets.

  17. f

    FEVER - feature oriented history of the Linux kernel

    • figshare.com
    zip
    Updated May 31, 2023
    Nicolas Dintzner (2023). FEVER - feature oriented history of the Linux kernel [Dataset]. http://doi.org/10.4121/uuid:c478028a-ac6d-4c45-9e2d-8ad63c7ca75f
    Dataset provided by
    4TU.ResearchData
    Authors
    Nicolas Dintzner
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    This dataset contains changes performed by developers over 15 releases of the Linux kernel. It covers the feature-oriented change history of the kernel between releases 3.10 and 4.4. The changes are broken down by affected artefacts, and all changes pertaining to the same feature are grouped together. If you want to know things like "How many times was a feature touched in the kernel?" or "How many feature changes came with makefile adjustments?", then this dataset may interest you. To access the data, you can install a Neo4j server via http://neo4j.com/

  18. i

    Computer use, by Autonomous community and knowledge of LINUX operating...

    • ine.es
    csv, html, json +4
    Updated Aug 21, 2014
    INE - Instituto Nacional de Estadística (2014). Computer use, by Autonomous community and knowledge of LINUX operating system [Dataset]. https://www.ine.es/jaxi/tabla.do?path=/t25/p450/base_2011/a2007/l1/&file=08034.px&type=pcaxis&L=1
    Available download formats: xls, html, json, txt, csv, xlsx, text/pc-axis
    Dataset authored and provided by
    INE - Instituto Nacional de Estadística
    License

    https://www.ine.es/aviso_legal

    Variables measured
    Autonomous Communities, Use of knowledge of the LINUX operating system
    Description

    Survey on Equipment and Use of Information and Communication Technologies in Households: Computer use, by Autonomous community and knowledge of LINUX operating system. Autonomous Communities.

  19. Linux Kernel Commits-per-Month Until 2019.

    • kaggle.com
    zip
    Updated Mar 22, 2020
    Joshua Isanan (2020). Linux Kernel Commits-per-Month Until 2019. [Dataset]. https://www.kaggle.com/joshuaisanan/linux-kernel-commitspermonth-until-2019
    Available download formats: zip (2370 bytes)
    Authors
    Joshua Isanan
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Context

    I wanted to create a time-series forecasting model for tracking the number of commits on the Linux kernel repository per month. I extracted the data from https://www.phoronix.com/misc/linux-eoy2019/activity.html, which already tracks the Linux repository using Gitstat. Having realized that the community might be able to create more robust models, I decided to upload the extracted data here as well. Link to repository: https://github.com/torvalds/linux

    Content

    The dataset contains monthly counts of commits, lines added, and lines deleted for the Linux repository. Phoronix's last extraction was performed at the end of 2019.

    Inspiration

    I would love to see how the community would create time-series models on a dataset that's as limited as this.

    Additional Notes

    Image Taken From: Donald Clark from Pixabay

  20. d

    Linux-APT-Dataset-2024 - Dataset - B2FIND

    • b2find.dkrz.de
    Updated Mar 5, 2024
    (2024). Linux-APT-Dataset-2024 - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/2592968d-3db1-5dbb-8226-50ea976d8872
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A novel dataset, "Linux-APT-Dataset-2024", that includes the Tactics, Techniques and Procedures (TTPs) of APT attacks in a Linux environment. There are two files, 'combine.csv' and 'Processed Version.xlsx'; both include 17 files ranging from 1 October 2023 to 7 January 2024, and each file contains all the essential data fields. These 17 files can be found at the link below.

    Karim, S. (2024). Linux-APT-Dataset-2024 [Data set]. Zenodo. https://doi.org/10.5281/zenodo.10685642

    1. combine.csv is the raw file, a merger of all 17 files extracted from the SIEM 'WAZUH' after simulating the latest attacks in the environment. Because the SIEM 'WAZUH' cannot produce files with more than 10,000 records, all of the files are combined so that they can be used as input for other analyses.
    2. Processed Version.xlsx is the compiled version of combine.csv. The file extension is changed to xlsx because that format is supported on most systems, and Tactics and Techniques are separated for the convenience of different researchers. Each record is also tagged as General or Malicious: a value of 1 means the log is suspicious/malicious, and 0 means it is a general/normal log.

    The dataset contains both types of activity, general as well as malicious/suspicious logs, to bring it close to real-world conditions for better analysis and evaluation. It will be more productive if the cybersecurity framework considered for mapping the TTPs is MITRE. The simulated attacks include all the privilege escalation payloads for Linux, recently discovered CVEs, and emulations of key-loggers and APTs such as APT41, APT28, APT29, and Turla. An effective way to mark a log record as general or suspicious is to check whether it is TTP-tagged: a tagged record is suspicious/malicious; otherwise it is considered general.
While developing the dataset we have The dataset is also useful for analysing all the critical log resources in the Linux environment that could be considered while performing forensics activity.
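
The TTP-based labelling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column name `ttp_tags`, the helper `label_record`, and the two sample rows are hypothetical and are not taken from combine.csv, whose real header may differ.

```python
import csv
import io

# Hypothetical column name; the real header in combine.csv may differ.
TTP_COLUMN = "ttp_tags"


def label_record(record: dict) -> int:
    """Return 1 (suspicious/malicious) if the record carries a TTP tag, else 0 (general)."""
    tag = (record.get(TTP_COLUMN) or "").strip()
    return 1 if tag else 0


# Made-up sample rows, for illustration only (not real dataset content).
SAMPLE = """timestamp,message,ttp_tags
2023-10-01T00:00:00,user login,
2023-10-01T00:05:00,privilege escalation attempt,T1548
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
labels = [label_record(row) for row in rows]
print(labels)
```

To apply the same rule to the real file, one would replace the in-memory sample with an opened file handle and adjust `TTP_COLUMN` to whatever the actual tag column is called.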

Xiaoli Wang; Xiaofeng Ding; Anthony K. H. Tung; Shanshan Ying; Hai Jin (2024). Linux Dataset [Dataset]. https://paperswithcode.com/dataset/linux

Linux Dataset

Linux Program Dependence Graphs

Dataset updated
Feb 7, 2024
Authors
Xiaoli Wang; Xiaofeng Ding; Anthony K. H. Tung; Shanshan Ying; Hai Jin
Description

The LINUX dataset consists of 48,747 Program Dependence Graphs (PDGs) generated from the Linux kernel. Each graph represents a function, where a node represents one statement and an edge represents a dependency between two statements.
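
As a rough illustration of the structure described above (one graph per function, one node per statement, one edge per dependency), a PDG could be held in memory as follows. This is a minimal sketch, not the dataset's actual file format: the class name, the edge direction (source statement to the statement that depends on it), and the three example statements are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ProgramDependenceGraph:
    """One graph per function: nodes are statements, edges are dependencies."""
    function_name: str
    statements: Dict[int, str] = field(default_factory=dict)
    # Assumed convention: (source, target) means `target` depends on `source`.
    dependencies: Set[Tuple[int, int]] = field(default_factory=set)

    def add_statement(self, node_id: int, text: str) -> None:
        self.statements[node_id] = text

    def add_dependency(self, source: int, target: int) -> None:
        if source not in self.statements or target not in self.statements:
            raise KeyError("both statements must be added before linking them")
        self.dependencies.add((source, target))

    def dependents(self, node_id: int) -> List[int]:
        """Statements that directly depend on the given statement."""
        return sorted(t for s, t in self.dependencies if s == node_id)


# Hypothetical three-statement function, for illustration only.
pdg = ProgramDependenceGraph("example_fn")
pdg.add_statement(0, "x = read_input()")
pdg.add_statement(1, "y = x + 1")
pdg.add_statement(2, "return y")
pdg.add_dependency(0, 1)  # statement 1 uses x defined by statement 0
pdg.add_dependency(1, 2)  # statement 2 uses y defined by statement 1
print(pdg.dependents(0))
```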
