This is the Titanic dataset.
Attribute | Definition | Key |
---|---|---|
sex | Sex/gender | male/female |
age | Age in years | |
sibsp | Number of siblings/spouses aboard the Titanic | 0/1/2 ... |
parch | Number of parents/children aboard the Titanic | 0/1/2 ... |
fare | Passenger fare | |
embarked | Port of embarkation | C: Cherbourg, Q: Queenstown, S: Southampton |
class | Ticket class | First / Second / Third |
who | Passenger category | male, female, child |
alone | Whether the passenger was aboard alone | 0/1 |
survived | Whether the passenger survived | 0/1 |
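The derived columns `who` and `alone` can plausibly be computed from the other attributes. The sketch below is one way to do it, under two assumptions that the table does not state: a passenger counts as a child below age 16, and a passenger is alone when both `sibsp` and `parch` are zero. The function names and the cut-off are illustrative only.

```python
from typing import Optional

# Assumed cut-off; the table above does not say where "child" ends.
CHILD_AGE_LIMIT = 16


def who(sex: str, age: Optional[float]) -> str:
    """Return 'child' for known ages under the limit, otherwise the sex."""
    if age is not None and age < CHILD_AGE_LIMIT:
        return "child"
    return sex


def alone(sibsp: int, parch: int) -> int:
    """Return 1 when no siblings, spouses, parents or children are aboard."""
    return int(sibsp + parch == 0)


print(who("male", 8), alone(1, 2))
print(who("female", 30), alone(0, 0))
```

A passenger with an unknown age falls back to the `sex` value here, which is a design choice of this sketch rather than something the dataset documents.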
This is a classic dataset used in many data mining tutorials and demos -- perfect for getting started with exploratory analysis and building binary classification models to predict survival.
Data covers passengers only, not crew.
Dataset describing the survival status of individual passengers on the Titanic. Missing values in the original dataset are represented using ?. Float and int missing values are replaced with -1, string missing values are replaced with 'Unknown'.
To use this dataset:
import tensorflow_datasets as tfds

ds = tfds.load('titanic', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
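Because missing values are stored as `-1` and `'Unknown'` rather than as true nulls, it is usually worth mapping them back before computing statistics. The following is a minimal sketch that works on plain Python dictionaries; it assumes each example has already been converted from tensors to Python values, and the field names in the sample are illustrative rather than the actual feature names.

```python
from typing import Any, Dict

# Sentinels described above: -1 for int/float fields, 'Unknown' for strings.
NUMERIC_SENTINEL = -1
STRING_SENTINEL = "Unknown"


def restore_missing(example: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `example` with sentinel values replaced by None."""
    cleaned = {}
    for key, value in example.items():
        if isinstance(value, bytes):
            # String features may arrive as bytes; decode before comparing.
            value = value.decode("utf-8")
        if isinstance(value, str) and value == STRING_SENTINEL:
            cleaned[key] = None
        elif isinstance(value, (int, float)) and value == NUMERIC_SENTINEL:
            cleaned[key] = None
        else:
            cleaned[key] = value
    return cleaned


# Hypothetical example record, not taken from the dataset.
example = {"age": -1.0, "fare": 7.25, "cabin": "Unknown", "sex": b"male"}
cleaned = restore_missing(example)
print(cleaned)
```

This treats every `-1` as missing, which is only safe for fields where `-1` cannot be a real value (such as age or fare).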
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Detail Description:
The Titanic dataset offers a comprehensive glimpse into the passengers aboard the ill-fated RMS Titanic, which famously sank on its maiden voyage in April 1912 after colliding with an iceberg. This dataset contains a wealth of information about individual passengers, including demographics, ticket class, cabin information, family relationships, fare details, and most notably, survival outcomes.
Key attributes within the dataset include:
Passenger Class (Pclass): This categorical variable indicates the ticket class of each passenger, ranging from 1st class (wealthiest) to 3rd class (lower socioeconomic status).
Name: The names of passengers, providing insight into their identities.
Sex: Gender of passengers, categorized as male or female.
Age: Age of passengers, providing information about the demographic composition of the Titanic's passengers.
SibSp: Number of siblings/spouses aboard the Titanic, offering insight into family relationships.
Parch: Number of parents/children aboard the Titanic, indicating family size and composition.
Ticket: Ticket number, providing additional information about passenger accommodations and fare details.
Fare: Fare paid by each passenger, which can be indicative of their ticket class and economic status.
Cabin: Cabin number or location, offering insights into passenger accommodations.
Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton), providing information about passengers' embarkation points.
Survived: This binary variable indicates whether a passenger survived the disaster (1) or not (0), serving as the primary outcome variable for analyses.
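The attributes listed above map naturally onto a typed record. The sketch below parses CSV text into such records using only the standard library; the `Passenger` class, the helper names, and the sample rows are made up for illustration, and the column headers are assumed to match the attribute names in the list.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Passenger:
    survived: int            # 1 = survived, 0 = did not survive
    pclass: int              # 1, 2 or 3
    name: str
    sex: str
    age: Optional[float]     # None when the age is missing
    sibsp: int
    parch: int
    ticket: str
    fare: Optional[float]
    cabin: Optional[str]
    embarked: Optional[str]  # C, Q or S


def _optional_float(text: str) -> Optional[float]:
    """Convert a CSV cell to float, treating an empty cell as missing."""
    return float(text) if text else None


def parse_passengers(csv_text: str) -> List[Passenger]:
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        Passenger(
            survived=int(row["Survived"]),
            pclass=int(row["Pclass"]),
            name=row["Name"],
            sex=row["Sex"],
            age=_optional_float(row["Age"]),
            sibsp=int(row["SibSp"]),
            parch=int(row["Parch"]),
            ticket=row["Ticket"],
            fare=_optional_float(row["Fare"]),
            cabin=row["Cabin"] or None,
            embarked=row["Embarked"] or None,
        )
        for row in rows
    ]


# Made-up rows for illustration; these are not real passengers.
SAMPLE_CSV = """Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,1,"Doe, Mrs. Jane",female,38,1,0,T-0001,71.5,C10,C
0,3,"Roe, Mr. Richard",male,,0,0,T-0002,8.05,,S
"""

parsed = parse_passengers(SAMPLE_CSV)
for passenger in parsed:
    print(passenger)
```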
Researchers and data analysts frequently use the Titanic dataset for a variety of purposes.
Overall, the Titanic dataset serves as a valuable resource for understanding historical events, exploring data analysis techniques, and teaching machine learning concepts. Its accessibility and rich contextual information make it a popular choice for both educational and research purposes within the data science community.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset is very similar to the one offered by the Titanic competition. Unfortunately, the original Titanic competition dataset was spoiled by many data errors. I discussed this topic in more detail here.
It is provided by the R package stablelearner. For ease of use, I converted it from RDA to CSV format, and I tried to make it error-free.
It lacks the Cabin feature, but it has Country of nationality. Moreover, it has far fewer missing values.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/brendan45774/test-file on 21 November 2021.
--- Dataset description provided by original source is as follows ---
I took the Titanic test file and the gender_submission file and put them together in Excel to make a CSV. This is great for making charts to help you visualize the data. It will also help you see who died or survived. The labels are at least 70% right, but it's up to you to make it 100%. Thanks to the Titanic beginners competition for providing the data. Please upvote my dataset; it will mean a lot to me. Thank you!
--- Original source retains full ownership of the source dataset ---
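The combination step described above (joining the test file with gender_submission) can also be done in code instead of Excel. This is a minimal sketch using the standard library, assuming both files share a `PassengerId` column; the CSV contents below are made-up stand-ins, and the function name is hypothetical. Note that, as the description itself hints, the `Survived` values in gender_submission are baseline predictions rather than confirmed outcomes.

```python
import csv
import io

# Made-up stand-ins for the two files; not real passenger data.
TEST_CSV = """PassengerId,Pclass,Sex,Age
9001,3,male,30
9002,1,female,45
9003,2,male,
"""

SUBMISSION_CSV = """PassengerId,Survived
9001,0
9002,1
9003,0
"""


def merge_on_passenger_id(test_text: str, submission_text: str) -> list:
    """Attach the Survived label to each test row, matched by PassengerId."""
    labels = {
        row["PassengerId"]: row["Survived"]
        for row in csv.DictReader(io.StringIO(submission_text))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(test_text)):
        combined = dict(row)
        # Rows without a matching label get None instead of raising.
        combined["Survived"] = labels.get(row["PassengerId"])
        merged.append(combined)
    return merged


merged = merge_on_passenger_id(TEST_CSV, SUBMISSION_CSV)
for row in merged:
    print(row)
```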
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic DataSet from Kaggle’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/sureshbhusare/titanic-dataset-from-kaggle on 29 August 2021.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic-Dataset (train.csv)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/hesh97/titanicdataset-traincsv on 12 November 2021.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Cleaned Titanic Data Set for EDA’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/jagjeet555/cleaned-titanic-data-set-for-eda on 30 September 2021.
--- Dataset description provided by original source is as follows ---
This is the very famous Titanic dataset, but it has been cleaned using various statistical methods.
This dataset contains details of various Titanic passengers, including their Passenger ID, Survived (0 = did not survive, 1 = survived), Passenger Class (there are 3 passenger classes in the dataset), Sex, Age, SibSp (siblings and spouses), and Parch (parents and children).
I learned to do this from open resources such as Python with Mosh, Data Analysis with Jovian, and Kaggle.
You are welcome to do EDA on this data.
--- Original source retains full ownership of the source dataset ---
Hugo0133/Spaceship-Titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset was created by Disha Agarwal
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic_Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/ozlemilgun/titanic-dataset on 30 September 2021.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic: Machine Learning from Disaster’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/shuofxz/titanic-machine-learning-from-disaster on 30 September 2021.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
titanic dataset
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/prkukunoor/TitanicDataset on 30 September 2021.
--- Dataset description provided by original source is as follows ---
I am planning to compare male and female passengers above 18 years of age across the different passenger classes in the Titanic dataset.
I noticed that more women than men survived, both in raw numbers and as a percentage, and that the opposite is true of 3rd class passengers. Bars are a good choice to show the difference between categories, but you may want to look into a grouped bar chart for an easier comparison of how many survived or didn't in each group. While there were far more men on the boat, fewer of them survived than women. Class seemed to have a direct effect on a passenger's chance of survival. It is also good to see the difference between the numbers of those who survived and those who didn't.
--- Original source retains full ownership of the source dataset ---
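The comparison described above (survival by sex within each class) comes down to a grouped survival rate, which is also the number a grouped bar chart would plot. Here is a minimal sketch using only the standard library; the records are made-up toy values chosen for easy arithmetic, not figures from the dataset, and the function name is hypothetical.

```python
from collections import defaultdict

# Made-up toy records (not real passengers): (pclass, sex, survived)
records = [
    (1, "female", 1),
    (1, "female", 1),
    (1, "male", 0),
    (1, "male", 1),
    (3, "female", 1),
    (3, "female", 0),
    (3, "male", 0),
    (3, "male", 0),
]


def survival_rates(rows):
    """Return {(pclass, sex): fraction survived} for the given rows."""
    counts = defaultdict(lambda: [0, 0])  # key -> [survivors, total]
    for pclass, sex, survived in rows:
        counts[(pclass, sex)][0] += survived
        counts[(pclass, sex)][1] += 1
    return {key: survivors / total for key, (survivors, total) in counts.items()}


rates = survival_rates(records)
for (pclass, sex), rate in sorted(rates.items()):
    print(f"class {pclass}, {sex}: {rate:.0%}")
```

Reporting the rate alongside the raw count avoids the trap mentioned in the description, where a larger group can have more survivors in absolute terms but a lower chance of survival.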
This dataset was created by Ryan Selesnik
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic data for Data Preprocessing’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/akshaysehgal/titanic-data-for-data-preprocessing on 12 November 2021.
--- Dataset description provided by original source is as follows ---
Public "Titanic" dataset for data exploration, preprocessing and benchmarking basic classification/regression models.
Github: https://github.com/mwaskom/seaborn-data/blob/master/titanic.csv
Playground for visualizations, preprocessing, feature engineering, model pipelining, and more.
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic: cleaned data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/jamesleslie/titanic-cleaned-data on 30 September 2021.
--- Dataset description provided by original source is as follows ---
This dataset was created in this notebook as part of a three-part series. The data is in machine-learning-ready format, with all missing values for the Age, Fare and Embarked columns having been imputed.
Age: this column was imputed by using the median age for the passenger's title (Mr, Mrs, Dr etc).
Fare: the single missing value in this column was imputed using the median value for that passenger's class.
Embarked: the two missing values here were imputed using the Pandas backfill method.
This data is used in both the second and third parts of the series.
--- Original source retains full ownership of the source dataset ---
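The three imputation rules described above can be sketched as follows. This is not the notebook's code: it is a standard-library approximation under stated assumptions, with made-up passenger rows, a hypothetical `title_of` helper that reads the title from the text between the comma and the first period in the name, and a manual backward pass standing in for the Pandas backfill method.

```python
import re
from statistics import median

# Made-up rows for illustration; None marks a missing value.
passengers = [
    {"Name": "Doe, Mr. John", "Pclass": 3, "Age": 30.0, "Fare": 8.0, "Embarked": "S"},
    {"Name": "Doe, Mrs. Jane", "Pclass": 1, "Age": 40.0, "Fare": 80.0, "Embarked": None},
    {"Name": "Roe, Mr. Richard", "Pclass": 3, "Age": None, "Fare": None, "Embarked": "C"},
    {"Name": "Poe, Mr. Peter", "Pclass": 3, "Age": 20.0, "Fare": 10.0, "Embarked": "Q"},
    {"Name": "Roe, Mrs. Mary", "Pclass": 1, "Age": None, "Fare": 60.0, "Embarked": "S"},
]


def title_of(name):
    """Extract the title (Mr, Mrs, ...) from 'Surname, Title. Given names'."""
    match = re.search(r",\s*([^.]+)\.", name)
    return match.group(1).strip() if match else None


def group_medians(rows, key_fn, field):
    """Median of the known values of `field`, grouped by key_fn(row)."""
    groups = {}
    for row in rows:
        if row[field] is not None:
            groups.setdefault(key_fn(row), []).append(row[field])
    return {key: median(values) for key, values in groups.items()}


def impute(rows):
    rows = [dict(row) for row in rows]  # work on copies, leave input intact
    age_by_title = group_medians(rows, lambda r: title_of(r["Name"]), "Age")
    fare_by_class = group_medians(rows, lambda r: r["Pclass"], "Fare")
    for row in rows:
        if row["Age"] is None:
            row["Age"] = age_by_title.get(title_of(row["Name"]))
        if row["Fare"] is None:
            row["Fare"] = fare_by_class.get(row["Pclass"])
    # Backfill Embarked: each gap takes the next known value further down.
    next_valid = None
    for row in reversed(rows):
        if row["Embarked"] is None:
            row["Embarked"] = next_valid
        else:
            next_valid = row["Embarked"]
    return rows


imputed = impute(passengers)
for row in imputed:
    print(row)
```

A backfill leaves a gap unfilled when no later row has a value, so a trailing missing entry would stay `None` in this sketch.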