The LINUX dataset consists of 48,747 Program Dependence Graphs (PDGs) generated from the Linux kernel. Each graph represents a function, where a node represents one statement and an edge represents a dependency between two statements.
https://www.enterpriseappstoday.com/privacy-policy
This guide is an overview of Linux commands for making directories, moving files, listing files (ls), changing directories, and so forth. It is publicly available through the Ubuntu website and is considered general training material.
Dataset Card for linux-man-pages-tldr-summarized
Dataset Summary
This dataset contains Linux man pages downloaded from man7, each prefixed with 'summarize: ', and the corresponding summaries downloaded from tldr-pages.
Supported Tasks
This dataset should be used to fine-tune language models for summarization tasks.
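The input format described above can be sketched as a hypothetical (input, target) pair; the man-page excerpt and summary below are placeholders, not actual records from the dataset:

```python
# Hypothetical example pair; real records come from man7 and tldr-pages
man_page_text = "ls - list directory contents. List information about the FILEs..."
tldr_summary = "List files one per line: ls -1"

# Each model input carries the 'summarize: ' prefix described above
model_input = "summarize: " + man_page_text
training_pair = (model_input, tldr_summary)
```

A sequence-to-sequence model fine-tuned on such pairs learns to emit the tldr-style summary given the prefixed man page.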
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset contains commits, with detailed information about changed files, from about 12 years of the Linux kernel master branch. It covers about 600,000 (filtered) commits, which break down into about 1.4 million file-change records.
Each row represents a changed file in a specific commit, with annotated deletions and additions to that file, as well as the filename and the subject of the commit. I also included anonymized information about the author of each changed file, as well as the time of commit and the timezone of the author.
The columns in detail:
I'm sure with this dataset nice visualizations can be created, let's see what we can come up with!
For everybody interested in how the dataset was created, I've set up a GitHub repo that contains all the required steps to reproduce it here.
If you have any questions, feel free to contact me via PM or discussions here.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Dataset containing measurements of Linux kernel binary size after compilation. The reported size, in the column "perf", is the size in bytes of the vmlinux file. It also contains a column "active_options" reporting the number of activated options (set to "y"). All other columns, listed in the file "Linux_options.json", are Linux kernel options. The sampling was done using randconfig. The version of Linux used is 4.13.3.
Not all available options are present. First, the dataset only contains options for the x86 64-bit version. Then, all non-tristate options have been ignored. Finally, options that do not take more than one value across the whole dataset, due to insufficient variability in the sampling, are ignored. All options are encoded as 0 for the "n" and "m" option values, and 1 for "y".
In Python, importing the dataset with pandas assigns all columns the int64 dtype, which leads to very high memory consumption (~50 GB). We provide the following way to import it using less than 1 GB of memory, by setting the option columns to int8.
import pandas as pd
import json
import numpy

with open("Linux_options.json", "r") as f:
    linux_options = json.load(f)

# Load the CSV, setting option columns to int8 to save a lot of memory
df = pd.read_csv("Linux.csv", dtype={opt: numpy.int8 for opt in linux_options})
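As a rough illustration of the saving, a synthetic stand-in (made-up option names and values, not the real Linux.csv) compares the int64 and int8 footprints:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: 1,000 configurations x 100 binary options (names are made up)
rng = np.random.default_rng(0)
options = [f"OPTION_{i}" for i in range(100)]
df = pd.DataFrame(rng.integers(0, 2, size=(1000, 100)), columns=options)

bytes_int64 = df.astype(np.int64).memory_usage(deep=True).sum()
bytes_int8 = df.astype(np.int8).memory_usage(deep=True).sum()
print(bytes_int64 / bytes_int8)  # close to 8: int8 columns use one byte per value
```

Since every option is encoded as 0 or 1, int8 loses no information while cutting the per-value storage from eight bytes to one.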
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Explore Linux through unique data from multiple sources: key facts, real-time news, interactive charts, detailed maps & open datasets
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
With large-scale and complex configurable systems, it is hard for users to choose the right combination of options (i.e., configurations) in order to obtain the desired trade-off between functionality and performance goals such as speed or size. Machine learning can help in relating these goals to the configurable system options, and thus predict the effect of options on the outcome, typically after a costly training step. However, many configurable systems evolve at such a rapid pace that it is impractical to retrain a new model from scratch for each new version. Taking the extreme case of the Linux kernel with its ≈14,500 configuration options, we investigate how kernel binary size predictions degrade over successive versions (and how transfer learning can be adapted and applied to mitigate this degradation).
We used and are sharing a unique and large dataset consisting of the binary sizes (compressed and non-compressed) of thousands of configurations for different versions of the kernel, spanning three years (4.13, 4.15, 4.20, 5.0, 5.4, 5.7, and 5.8). Overall, around 200K configurations over 10K+ options/features and 6 versions.
This dataset has been used in the Transactions of Software Engineering (TSE) article "Transfer Learning Across Variants and Versions: The Case of Linux Kernel Size" (preprint: https://hal.inria.fr/hal-03358817)
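The paper evaluates several transfer techniques; the following is only a minimal "model shifting"-style sketch on synthetic data (not the article's exact method, and the data is made up): a model trained on an old version is reused on a new version by fitting just a scale and offset on a handful of new-version configurations.

```python
import numpy as np

rng = np.random.default_rng(42)
n_opts = 50

# Synthetic stand-in for two kernel "versions" (made-up data, not the shared dataset)
w_old = rng.normal(size=n_opts)
X_old = rng.integers(0, 2, size=(2000, n_opts)).astype(float)
y_old = X_old @ w_old                        # old-version "binary sizes"

X_new = rng.integers(0, 2, size=(30, n_opts)).astype(float)  # few new-version samples
y_new = 1.3 * (X_new @ w_old) + 100.0        # option effects scaled, baseline shifted

# Train a model on the old version only
coef_old, *_ = np.linalg.lstsq(X_old, y_old, rcond=None)

# Transfer: fit just a scale and offset mapping old predictions to new-version sizes
pred_old = X_new @ coef_old
A = np.column_stack([pred_old, np.ones_like(pred_old)])
(scale, offset), *_ = np.linalg.lstsq(A, y_new, rcond=None)

def predict_new_version(X):
    # Reuses the old model; only two parameters were fit on the new version
    return scale * (X @ coef_old) + offset
```

The appeal is data efficiency: instead of sampling and compiling thousands of configurations of the new version, only a small sample is needed to adapt the old model.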
This data contains 2k log lines from the Linux dataset, derived from the LogPai GitHub repository. The first file contains just the log lines. The second contains the log lines with their categorized fields: namely Month, Date, Time, Level, Component, PID, Content, EventID, and EventTemplate.
1. Understanding the frequency of different Event Types (EventID) that occur in the log set.
2. Identifying anomalies in the logs, if any exist.
3. Named Entity Recognition - To identify different fields of the log set from the set-aside data.
4. Multiclass classification - To identify what Event Type (EventID) the log line belongs to.
5. Giving the variable parts (<*>) a name, and adding them to the entity recognition task. [Boss level!]
Point 5 explanation: In the 3rd file, named Linux_2k.log_templates.csv, there is a template for each of the event types (given by EventIDs). The template consists of a variable portion (given by <*>) and a constant portion (the other words in the template). The value of the variable part can be found by comparing the template against a log line that matches it. A name could be assigned to the variable part and included in the named entity recognition task. Keep in mind that the frequency of a variable part might be limited.
Note: An important idea to have in mind is that one will have to focus on the syntax more than the semantics of a log line.
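Extracting the variable parts by comparing a template against a log line, as point 5 describes, can be sketched with a regular expression: escape the constant portion and turn each <*> into a capture group. The template and log line below are illustrative, not taken from Linux_2k.log_templates.csv:

```python
import re

def template_to_regex(template):
    # Escape the constant words, turn each <*> wildcard into a lazy capture group
    parts = [re.escape(p) for p in template.split("<*>")]
    return re.compile("(.+?)".join(parts) + r"$")

# Hypothetical template/line pair in the style of the templates file
template = "session opened for user <*> by (uid=<*>)"
line = "session opened for user root by (uid=0)"
m = template_to_regex(template).match(line)
print(m.groups())  # ('root', '0')
```

Each captured group is a candidate named entity, which is why the task stays syntactic: the regex only knows where a value sits in the template, not what it means.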
Have fun understanding how to apply NLP concepts to Log Datasets! 😀
Check out my other Datasets here
MIT License
Copyright (c) 2018 LogPAI
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
Dataset Card for "stackoverflow_linux"
Dataset information:
Source: Stack Overflow
Category: Linux
Number of samples: 300
Train/Test split: 270/30
Quality: Data come from the top 1k most upvoted questions
Additional Information
License
All Stack Overflow user contributions are licensed under CC BY-SA 3.0, with attribution required.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Feature code is often scattered across a software system. Scattering is not necessarily bad if used with care, as witnessed by systems with highly scattered features that evolved successfully. Feature scattering, often realized with a pre-processor, circumvents limitations of programming languages and software architectures. Unfortunately, little is known about the principles governing scattering in large and long-living software systems. We present a longitudinal study of feature scattering in the Linux kernel, complemented by a survey of 74 Linux kernel developers and interviews with nine of them. We analyzed almost eight years of the kernel's history, focusing on its largest subsystem: device drivers. We learned that the ratio of scattered features remained nearly constant and that most features were introduced without scattering. Yet, scattering easily crosses subsystem boundaries, and highly scattered outliers exist. Scattering often addresses a performance-maintenance tradeoff (alleviating complicated APIs), works around hardware design limitations, and avoids code duplication. While developers do not consciously enforce scattering limits, they do improve the system design and refactor code, thereby mitigating pre-processor idiosyncrasies or reducing its use.
https://www.gnu.org/copyleft/gpl.html
File versions from the Linux kernel, with vulnerability labels derived from the CVE database. Based on the work by Jimenez et al. [1].
[1] Jimenez, Matthieu, Mike Papadakis, and Yves Le Traon. "Vulnerability Prediction Models: A Case Study on the Linux Kernel." 2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM). IEEE, 2016.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Explore A practical guide to Linux through unique data from multiple sources: key facts, real-time news, interactive charts, detailed maps & open datasets
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
This dataset includes a list of games that have been released on the Linux operating system. Games that are natively playable on Linux are included in this list.
With so many amazing games available to play on Linux, it can be hard to decide which ones to try first. This comprehensive list includes something for everyone, from fast-paced action games to relaxing puzzle games and everything in between. With such a wide variety of genres represented, there's sure to be something here that appeals to you.
So what are you waiting for? Give one (or more) of these Linux games a try today!
The dataset includes the name of the game, the developer, the publisher, the genres, the operating systems, the date released, and the Metacritic score.
- Checking for release dates of Linux games.
- Finding the Metacritic score for a particular game.
- Searching for a specific game by name or genre.
This dataset was compiled by scraping Wikipedia.
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
Files df_4.csv, df_11.csv, df_16.csv, df_17.csv, df_18.csv, df_20.csv, df_21.csv, df_25.csv, df_26.csv, and df_27.csv share the same schema:

| Column name       | Description                                               |
|:------------------|:----------------------------------------------------------|
| Name              | The name of the game. (String)                            |
| Developer         | The game's developer. (String)                            |
| Publisher         | The game's publisher. (String)                            |
| Genres            | The genres the game belongs to. (String)                  |
| Operating Systems | The operating systems the game can be played on. (String) |
| Date Released     | The date the game was released. (Date)                    |
| Metacritic        | The game's Metacritic score. (Integer)                    |

File: df_1.csv

| Column name | Description                |
|:------------|:---------------------------|
| 0           | Name of the game. (String) |

File: df_31.csv

File: df_24.csv
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Explore Linux and the Unix philosophy through unique data from multiple sources: key facts, real-time news, interactive charts, detailed maps & open datasets
https://doi.org/10.4121/resource:terms_of_use
This dataset contains changes performed by developers over 15 releases of the Linux kernel. It covers the feature-oriented change history of the kernel between releases 3.10 and 4.4. The changes are broken down by affected artefacts, and all changes pertaining to the same feature are grouped together. If you want to know things like: How many times was a feature touched in the kernel? How many feature changes came with makefile adjustments? Then this dataset may interest you. To access the data, you can install a Neo4j server via http://neo4j.com/
https://www.ine.es/aviso_legal
Survey on Equipment and Use of Information and Communication Technologies in Households: Computer use, by Autonomous community and knowledge of LINUX operating system. Autonomous Communities.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
I wanted to create a time-series forecasting model for tracking the number of commits on the Linux kernel repository per month. I extracted the data from https://www.phoronix.com/misc/linux-eoy2019/activity.html which already tracks the Linux repository using Gitstat. Having realized that the community might be able to create more robust models, I decided to upload the extracted data here as well. Link to repository: https://github.com/torvalds/linux
The dataset contains the Linux repository's commit / lines-added / lines-deleted counts, collected on a monthly basis. Phoronix's last extraction was performed at the end of 2019.
I would love to see how the community would create time-series models on a dataset that's as limited as this.
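A minimal forecasting baseline on a series of this shape could look like the following sketch; the monthly counts here are synthetic, not the Phoronix numbers:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the monthly commit counts (made-up values)
rng = np.random.default_rng(1)
idx = pd.date_range("2015-01-01", periods=60, freq="MS")
commits = pd.Series(5000 + rng.integers(-500, 500, size=60), index=idx)

# Naive baseline: forecast each month as the trailing 12-month mean
forecast = commits.rolling(12).mean().shift(1)
mae = (forecast - commits).abs().mean()  # pandas skips the initial NaN months
```

Any richer model (seasonal decomposition, ARIMA, etc.) would have to beat this trailing-mean baseline to justify itself on a dataset this small.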
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
A novel dataset, "Linux-APT-Dataset-2024", that includes the Tactics, Techniques and Procedures (TTPs) of APT attacks in a Linux environment. There are two files, 'combine.csv' and 'Processed Version.xlsx'; both cover 17 files ranging from 01 October 2023 to 07 January 2024, and each of the files contains all the essential data fields. The 17 files can be found via the link below.
Karim, S. (2024). Linux-APT-Dataset-2024 [Data set]. Zenodo. https://doi.org/10.5281/zenodo.10685642
1. combine.csv is the raw file, a merger of all 17 files extracted from the SIEM 'WAZUH' after simulating the latest attacks in the environment. Due to WAZUH's limitation of producing files with at most 10,000 records, all of the files were combined so they can be used as input for other analyses.
2. Processed Version.xlsx is the compiled version of combine.csv; the file extension was changed to xlsx because of the support available on most systems, and Tactics and Techniques are separated for the convenience of different researchers. Records are also tagged as General or Malicious: a value of 1 means suspicious/malicious, otherwise 0 for a general/normal log.
The dataset contains both types of activities/logs, general as well as malicious/suspicious, to make it near real-time for better analysis and evaluation. It is most productive to map the TTPs against the MITRE framework. The simulated attacks include privilege-escalation payloads for Linux, recently discovered CVEs, and emulations of key-loggers and APTs such as APT41, APT28, APT29, and Turla. An effective way to decide whether a log/record is general or suspicious is to check whether it is TTP-tagged: if so, it is suspicious/malicious; otherwise it is considered general.
The dataset is also useful for analysing all the critical log resources in the Linux environment that could be considered when performing forensic activity.
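The labeling rule described above (a TTP-tagged record is suspicious/malicious, otherwise general) can be sketched with pandas; the rows and column names below are made up for illustration and may differ from the actual files:

```python
import pandas as pd

# Hypothetical records in the spirit of Processed Version.xlsx (column names assumed)
logs = pd.DataFrame({
    "log": ["sshd session opened", "sudo -l enumeration attempt", "cron job executed"],
    "ttp": [None, "T1548 - Abuse Elevation Control Mechanism", None],
})

# Rule from the description: a TTP-tagged record is malicious (1), otherwise general (0)
logs["label"] = logs["ttp"].notna().astype(int)
print(logs["label"].tolist())  # [0, 1, 0]
```

This reproduces the General/Malicious tagging described for the processed file from the raw TTP column alone.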