Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The MusicCaps dataset contains 5,521 music examples, each labeled with an English aspect list and a free-text caption written by musicians.
The text focuses solely on describing how the music sounds, not on metadata such as the artist name.
The labeled examples are 10-second music clips from the AudioSet dataset (2,858 from the eval split and 2,663 from the train split).
Please cite the corresponding paper when using this dataset: http://arxiv.org/abs/2301.11325 (DOI: 10.48550/arXiv.2301.11325)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides a list of lyrics from 1950 to 2019, together with music metadata such as sadness, danceability, loudness, and acousticness. We also provide the lyrics themselves, which can be used for natural language processing.
The audio data was scraped using the Echo Nest® API engine integrated with the spotipy Python package. The spotipy API lets the user search for specific genres, artists, songs, release dates, etc. To obtain the lyrics, we used the Lyrics Genius® API as the base URL for requesting data based on the song title and artist name.
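As a rough illustration of the scraping flow just described, the sketch below only builds the kinds of query strings involved; the helper names are hypothetical, real use would require spotipy and Lyrics Genius credentials, and this is not the authors' actual code.

```python
# Illustrative sketch of the query-building side of the scraping flow.
# Not the authors' code; real scraping needs API credentials and HTTP calls.
import urllib.parse


def genius_search_url(title, artist, base_url="https://api.genius.com"):
    """Build a Genius API search request URL from a song title and artist."""
    query = urllib.parse.urlencode({"q": f"{title} {artist}"})
    return f"{base_url}/search?{query}"


def spotify_query(genre=None, artist=None, year=None):
    """Compose a Spotify-style search query string of the kind spotipy accepts."""
    parts = []
    if genre:
        parts.append(f'genre:"{genre}"')
    if artist:
        parts.append(f'artist:"{artist}"')
    if year:
        parts.append(f"year:{year}")
    return " ".join(parts)
```

For example, `spotify_query(genre="rock", year=1975)` yields `genre:"rock" year:1975`, which can be passed as the `q` argument of spotipy's track search.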
Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson, EPFL LTS2.
Source: http://archive.ics.uci.edu/ml/datasets/FMA%3A+A+Dataset+For+Music+Analysis
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MuMu is a Multimodal Music dataset with multi-label genre annotations that combines information from the Amazon Reviews dataset and the Million Song Dataset (MSD). The former contains millions of album customer reviews and album metadata gathered from Amazon.com. The latter is a collection of metadata and precomputed audio features for a million songs.
To map the information from both datasets, we use MusicBrainz. This process yields a final set of 147,295 songs belonging to 31,471 albums. For this mapped set of albums, there are 447,583 customer reviews from the Amazon dataset. The dataset has been used for multi-label music genre classification experiments in the related publication. In addition to genre annotations, the dataset provides further information about each album, such as average rating, selling rank, similar products, and cover image URL. For every text review, it also provides the review's helpfulness score, average rating, and summary.
The mapping between the three datasets (Amazon, MusicBrainz, and MSD), genre annotations, metadata, data splits, text reviews, and links to images are available here. Images and audio files cannot be released due to copyright issues.
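The MusicBrainz-based mapping step can be sketched as a join on a shared identifier; the function and field names below are hypothetical illustrations, not the actual MuMu pipeline.

```python
def map_albums(amazon_by_mbid, msd_by_mbid):
    """Join Amazon album metadata with MSD entries on a shared MusicBrainz ID.

    Illustrative only: field names ("amazon", "msd") and the dict-of-dicts
    layout are assumptions, not the MuMu release format. Albums present in
    only one source are dropped, mirroring how a mapped subset is produced.
    """
    return {
        mbid: {"amazon": amazon_by_mbid[mbid], "msd": msd_by_mbid[mbid]}
        for mbid in amazon_by_mbid.keys() & msd_by_mbid.keys()
    }
```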
These data can be used together with the Tartarus deep learning library https://github.com/sergiooramas/tartarus.
Scientific References
Please cite the following paper when using the MuMu dataset or the Tartarus library.
Oramas S., Nieto O., Barbieri F., & Serra X. (2017). Multi-label Music Genre Classification from audio, text and images using Deep Features. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017). https://arxiv.org/abs/1707.04916
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was studied in the paper "Temporal Analysis and Visualisation of Music", available at the following link:
https://sol.sbc.org.br/index.php/eniac/article/view/12155
This dataset provides a list of lyrics from 1950 to 2019, together with music metadata such as sadness, danceability, loudness, and acousticness. We also provide the lyrics themselves, which can be used for natural language processing.
The audio data was scraped using the Echo Nest® API engine integrated with the spotipy Python package. The spotipy API lets the user search for specific genres, artists, songs, release dates, etc. To obtain the lyrics, we used the Lyrics Genius® API as the base URL for requesting data based on the song title and artist name.
MusicCaps is a dataset composed of 5.5k music-text pairs, with rich text descriptions provided by human experts. For each 10-second music clip, MusicCaps provides:
1) A free-text caption consisting of four sentences on average, describing the music and
2) A list of music aspects, describing genre, mood, tempo, singer voices, instrumentation, dissonances, rhythm, etc.
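Since each MusicCaps example pairs a caption with a list of aspects, a minimal sketch of splitting a comma-separated aspect string into individual aspects might look like this (the comma-separated field format is an assumption based on the description above):

```python
def parse_aspect_list(aspect_list: str) -> list[str]:
    """Split a MusicCaps-style aspect list string into individual aspects.

    Assumes aspects are comma-separated, e.g.
    "pop, tinny wide hi hats, soft female vocal".
    """
    return [a.strip() for a in aspect_list.split(",") if a.strip()]
```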
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Stimulus materials, design matrix, and mean ratings for a music and emotion study using optimal design in the factorial manipulation of musical features (Eerola, Friberg & Bresin, 2013).
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The GTZAN Music/Speech dataset is a collection of audio recordings of music and speech. It consists of 120 tracks, each containing 30 seconds of audio, stored as 22,050 Hz mono 16-bit .wav files. The dataset is split into a training set and a test set of 60 tracks each: the training set is used to train machine learning models to classify music versus speech, and the test set is used to evaluate the performance of these models.
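As an illustration of the kind of low-level feature such music/speech classifiers often use, the sketch below computes the zero-crossing rate; this is a classic discriminator in the literature, not something the dataset itself prescribes.

```python
def zero_crossing_rate(samples):
    """Fraction of consecutive sample pairs that change sign.

    Speech tends to alternate between voiced (low ZCR) and unvoiced
    (high ZCR) segments, which is why ZCR is a classic music/speech cue.
    """
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / max(len(samples) - 1, 1)
```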
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
The raw dataset comprises approximately 1,700 musical pieces in .mp3 format, sourced from NetEase Music. The pieces range from 270 to 300 seconds in length, and all are sampled at 48,000 Hz. Because the website providing the audio includes style labels for the downloaded music, no specific annotators were involved; validation is achieved concurrently with the downloading process. The pieces are categorized into a total of 16 genres.
For the pre-processed version, the audio is cut into 11.4-second segments, resulting in 36,375 files, which are then transformed into Mel, CQT, and Chroma representations. Each data entry has six columns: the first three contain the Mel, CQT, and Chroma spectrogram slices in .jpg format, while the last three contain the labels for the three levels. The first level comprises two categories, the second level nine categories, and the third level 16 categories. The entire dataset is shuffled and split into training, validation, and test sets in an 8:1:1 ratio. This dataset can be used for genre classification.
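A minimal sketch of the segmentation step, assuming non-overlapping segments (the description does not specify a hop size):

```python
SAMPLE_RATE = 48_000      # Hz, as stated for the raw dataset
SEGMENT_SECONDS = 11.4    # length of each pre-processed segment


def segment_audio(samples, sr=SAMPLE_RATE, seg_s=SEGMENT_SECONDS):
    """Cut a 1-D sample sequence into fixed-length segments.

    Non-overlapping slicing and dropping the incomplete tail are
    assumptions; at 48 kHz each 11.4 s segment is 547,200 samples.
    """
    seg_len = int(sr * seg_s)
    n = len(samples) // seg_len
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n)]
```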
OpenRAIL: https://choosealicense.com/licenses/openrail/
The Eddycrack864/Music-Dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
"Festejo" is a meticulously curated AI-generated music dataset that captures the essence of Festejo, encapsulating the spirited percussions, infectious rhythms, and celebratory melodies that define this genre.
With an array of expertly crafted samples, this dataset serves as a captivating playground for machine learning applications, offering a unique opportunity to explore and infuse your compositions with the dynamic essence of Afro-Peruvian heritage.
Each meticulously generated sample allows for engagement with the intricate tapestry of Festejo's rhythms and melodies, inspiring users to create compositions that honor its cultural roots while embracing the limitless possibilities of AI-generated music.
This exceptional AI Music Dataset encompasses an array of vital data categories, contributing to its excellence. It encompasses Machine Learning (ML) Data, serving as the foundation for training intricate algorithms that generate musical pieces. Music Data, offering a rich collection of melodies, harmonies, and rhythms that fuel the AI's creative process. AI & ML Training Data continuously hone the dataset's capabilities through iterative learning. Copyright Data ensures the dataset's compliance with legal standards, while Intellectual Property Data safeguards the innovative techniques embedded within, fostering a harmonious blend of technological advancement and artistic innovation.
This dataset can also be useful as Advertising Data to generate music tailored to resonate with specific target audiences, enhancing the effectiveness of advertisements by evoking emotions and capturing attention. It can be a valuable source of Social Media Data as well. Users can post, share, and interact with the music, leading to increased user engagement and virality. The music's novelty and uniqueness can spark discussions, debates, and trends across social media communities, amplifying its reach and impact.
No license specified: https://academictorrents.com/nolicensespecified
We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections. The community's growing interest in feature and end-to-end learning is, however, restrained by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks by 16,341 artists across 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length, high-quality audio and pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition. Code, data, and usage examples are available online.
Affective correspondence between 3,812 songs and over 85,000 images. To facilitate the study of crossmodal emotion analysis, we constructed a large-scale database, which we call the Image-Music Affective Correspondence (IMAC) database. It consists of more than 85,000 images and 3,812 songs (approximately 270 hours of audio). Each data sample is labeled with one of three emotions: positive, neutral, or negative. The IMAC database was constructed by combining an existing image emotion database (You et al., 2016) with a new music emotion database curated by us (Verma et al., 2019).
"Country Music" is an AI music dataset meticulously crafted to revolutionize the field of music generation. This extensive collection encapsulates a myriad of acoustic guitar strumming patterns, expressive vocal styles, twangy banjo melodies, soulful fiddle tunes, and the iconic sound of the pedal steel guitar, all performed with the genuine essence of country music.
The comprehensive metadata associated with each sample provides an immersive context, encompassing details about the instruments utilized, strumming techniques, chord progressions, tempos, and dynamic levels.
"Country Music" is a transformative resource that empowers machine learning applications to delve into the heart of country music, enabling the generation of highly authentic and emotive compositions that accurately reflect the intricacies of this genre.
Physical albums accounted for less than ten percent of music consumption in the United States in 2021. While fans of jazz or rock music were most likely to listen to artists via physical formats, the vast majority of listeners across all genres enjoyed their favorite tracks through on-demand streaming services.
Stream it, just stream it
A closer look at the distribution of streamed music consumption reveals that R&B and hip-hop is the most streamed music genre in the United States. Roughly 30 percent of all streams came from this genre in 2021, which can at least partially be explained by its popularity among young and digitally savvy listeners. The overall demand for streamed audio content is growing rapidly each year, and in 2021, the number of paid music streaming subscribers in the U.S. surpassed a record 82 million. That same year, nearly one trillion music streams were amassed in the United States alone, signaling a bright future for services like Spotify or Amazon Music. So which artists were audiences listening to?
Most popular songs
In 2021, "Levitating" by Dua Lipa was the most popular song in the U.S. based on audio streams. The track from her second studio album "Future Nostalgia" was streamed a total of 804 million times while also ranking among the top three best-selling digital music singles worldwide. Looking at the artists with the most streams overall that year, Olivia Rodrigo topped the charts with a combined 1.47 billion streams for her songs "drivers license" and "good 4 u".
According to a study on music consumption worldwide in 2022, younger generations tended to find new songs via music apps and social media, while older generations also used the radio as a format to discover new audio content.
Neural Audio Fingerprint Dataset
(c) 2021 by Sungkyun Chang, https://github.com/mimbres/neural-audio-fp. This dataset includes all music sources, background noise, and impulse response (IR) samples that were used in the work "Neural Audio Fingerprint for High-specific Audio Retrieval based on Contrastive Learning" (https://arxiv.org/abs/2010.11910).
Format:
16-bit PCM Mono WAV, Sampling rate 8000 Hz
Description:
/… See the full description on the dataset page: https://huggingface.co/datasets/arch-raven/music-fingerprint-dataset.
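Files in the stated format (16-bit PCM mono WAV at 8,000 Hz) can be read with Python's standard library alone; the sketch below is an illustrative reader, not part of the dataset's tooling.

```python
import struct
import wave


def read_pcm16_mono(path):
    """Read a 16-bit PCM mono WAV file and return (sample_rate, samples).

    Samples come back as signed Python ints in [-32768, 32767].
    """
    with wave.open(path, "rb") as wf:
        assert wf.getnchannels() == 1, "expected mono audio"
        assert wf.getsampwidth() == 2, "expected 16-bit samples"
        sr = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
        samples = list(struct.unpack(f"<{len(raw) // 2}h", raw))
    return sr, samples
```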
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides a list of lyrics from 1950 to 2019, together with music metadata such as sadness, danceability, loudness, and acousticness. We also provide the lyrics themselves, which can be used for natural language processing.
The audio data was scraped using the Echo Nest® API engine integrated with the spotipy Python package. The spotipy API lets the user search for specific genres, artists, songs, release dates, etc. To obtain the lyrics, we used the Lyrics Genius® API as the base URL for requesting data based on the song title and artist name.
Music Information Retrieval (MIR) is the task of extracting higher-level information such as genre, artist, or instrumentation from music [1]. Music genre classification is an important and rapidly evolving area of MIR. To date, little research has been done on automatic genre classification of Nigerian songs. This study therefore presents a new music dataset named ORIN, which mainly covers traditional Nigerian songs of four genres (Fuji, Juju, Highlife, and Apala). The ORIN dataset consists of 208 Nigerian songs downloaded from the internet. Timbral texture features were extracted from one or two 30-second segments of each song using the Librosa [2] Python library, directly from the digital song files. Each song was sampled as 22.5 kHz, 16-bit mono audio. The signal was divided into frames of 1,024 samples with 50% overlap between successive frames, and a Hamming window was applied to each frame without pre-emphasis. Then, 29 averaged spectral energies are obtained from a bank of 29 Mel triangular filters, followed by a DCT, yielding 20 Mel-Frequency Cepstral Coefficients (MFCCs). The mean and standard deviation of the values across frames serve as the representative final feature fed to the model for each of the spectral features. These features consist of the time (FFT) and frequency (MFCC) feature sets of the dataset domains.
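The framing and windowing step described above (1,024-sample frames, 50% overlap, Hamming window, no pre-emphasis) can be sketched in plain Python; this is an illustrative sketch, not the study's actual code.

```python
import math

FRAME_LEN = 1024        # samples per frame, as in the description
HOP = FRAME_LEN // 2    # 50% overlap between successive frames


def hamming(n=FRAME_LEN):
    """Hamming window coefficients: 0.54 - 0.46*cos(2*pi*i/(n-1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(samples, frame_len=FRAME_LEN, hop=HOP):
    """Split a signal into overlapping frames and apply a Hamming window.

    Mirrors the framing step described above (no pre-emphasis); only
    complete frames are kept.
    """
    win = hamming(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, win)])
    return frames
```

From here, each windowed frame would pass through an FFT, the 29-filter Mel bank, and a DCT to yield the 20 MFCCs per frame.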
"Folk Music" is an exceptional AI music dataset meticulously curated to preserve and celebrate the timeless beauty of folk music. This comprehensive collection focuses exclusively on folk music, capturing its essence through a rich assortment of melodies, rhythms, and cultural expressions.
By leveraging the diversity and quality of the folk music samples provided, "Folk Music" empowers machine learning applications to generate authentic and evocative folk compositions.
The detailed metadata associated with each sample in the dataset provides a wealth of contextual information, including instrument types, playing techniques, specific melodies, tempos, and dynamics. This allows for a deeper understanding and exploration of the intricate dynamics that make folk music unique.