100+ datasets found
  1. NIST Statistical Reference Datasets - SRD 140

    • catalog.data.gov
    • data.nist.gov
    • +2 more
    Updated Jul 29, 2022
  2. Billionaires-Statistics-Dataset

    • kaggle.com
    Updated Feb 8, 2024
  3. Statistics

    • figshare.com
    zip
    Updated Jul 25, 2018
  4. Hydrographic and Impairment Statistics Database: LONG

    • catalog.data.gov
    • catalog.gimi9.com
    • +1 more
    Updated Feb 12, 2024
    + more versions
  5. Guinea GN: Children Out of School: Primary: Female

    • ceicdata.com
    + more versions
  6. GLA Code of Statistics - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Nov 12, 2018
    + more versions
  7. Gender

    • paradise.ca
    • wembley.ca
    • +90 more
    Updated Oct 24, 2018
    + more versions
  8. Senegal SN: Government Expenditure per Student: Tertiary: % of GDP per...

    • ceicdata.com
  9. YouTube Creator Statistics 2024 – By Most Popular Channels, Most Viewed...

    • enterpriseappstoday.com
    Updated Jan 12, 2024
  10. 30+ Most Shocking Employment Discrimination Statistics For 2023: The...

    • enterpriseappstoday.com
    Updated Oct 6, 2023
  11. UK overseas trade in goods statistics: June 2022

    • gov.uk
    • s3.amazonaws.com
    Updated Aug 12, 2022
    + more versions
  12. Data from: Historical, Demographic, Economic, and Social Data: The United...

    • icpsr.umich.edu
    • channel234.com
    ascii, sas, spss +1
    Updated May 21, 2010
    + more versions
  13. Postmates Revenue and Usage Statistics (2024)

    • businessofapps.com
    Updated Sep 14, 2020
  14. Dataset of attitude towards statistics

    • data.mendeley.com
    Updated Jul 20, 2022
  15. Coronavirus Job Retention Scheme statistics: December 2020

    • gov.uk
    • s3.amazonaws.com
    Updated Jul 29, 2021
  16. BeReal Statistics

    • searchlogistics.com
    Updated Feb 13, 2024
  17. Angola - Subnational Population Statistics

    • data.humdata.org
    • data.amerigeoss.org
    csv, xlsx
    Updated Apr 14, 2022
  18. Construction statistics annual tables

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Nov 28, 2023
  19. Life expectancy of men at birth in Mexico 2021

    • statista.com
    Updated Aug 10, 2023
  20. Fire service statistics - Datasets - Government of Jersey Open Data

    • opendata.gov.je
    Updated May 16, 2022
Cite
National Institute of Standards and Technology (2022). NIST Statistical Reference Datasets - SRD 140 [Dataset]. https://catalog.data.gov/dataset/nist-statistical-reference-datasets-srd-140-df30c

NIST Statistical Reference Datasets - SRD 140

Dataset updated: Jul 29, 2022
Dataset provided by: National Institute of Standards and Technology (http://www.nist.gov/)
Description

The purpose of this project is to improve the accuracy of statistical software by providing reference datasets with certified computational results that enable the objective evaluation of statistical software. Currently, datasets and certified values are provided for assessing the accuracy of software for univariate statistics, linear regression, nonlinear regression, and analysis of variance.

The collection includes both generated and 'real-world' data of varying levels of difficulty. Generated datasets are designed to challenge specific computations. These include the classic Wampler datasets for testing linear regression algorithms and the Simon & Lesage datasets for testing analysis of variance algorithms. Real-world data include challenging datasets such as the Longley data for linear regression, and more benign datasets such as the Daniel & Wood data for nonlinear regression.

Certified values are 'best-available' solutions. The certification procedure is described in the web pages for each statistical method.

Datasets are ordered by level of difficulty (lower, average, and higher). Strictly speaking, the level of difficulty of a dataset depends on the algorithm, so these levels are provided only as rough guidance for the user. Producing correct results on all datasets of higher difficulty does not imply that your software will pass all datasets of average or even lower difficulty. Similarly, producing correct results for all datasets in this collection does not imply that your software will do the same for your particular dataset. It will, however, provide some degree of assurance, in the sense that your package provides correct results for datasets known to yield incorrect results for some software.

The Statistical Reference Datasets project is also supported by the Standard Reference Data Program.
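
As a rough illustration of what evaluating software against a certified value can look like, the sketch below compares two ways of computing a sample standard deviation with a reference result. It rests on several assumptions: the three-point dataset and its reference value of 1.0 are hypothetical and are not taken from the NIST collection, the function names are invented for this example, and the log relative error (an estimate of the number of agreeing significant digits) is one common accuracy measure rather than a procedure prescribed by the description above.

```python
import math


def log_relative_error(computed, certified):
    """Estimate how many significant digits of `computed` agree with `certified`."""
    if computed == certified:
        return float("inf")  # exact agreement
    if certified == 0:
        # Relative error is undefined at zero, so fall back to absolute error.
        return -math.log10(abs(computed))
    return -math.log10(abs(computed - certified) / abs(certified))


def std_two_pass(xs):
    """Sample standard deviation: compute the mean first, then the deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))


def std_one_pass(xs):
    """Sample standard deviation via the sum-of-squares shortcut.

    This formula subtracts two large, nearly equal numbers and is prone to
    cancellation when the mean is large relative to the spread.
    """
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    variance = (total_sq - total * total / n) / (n - 1)
    return math.sqrt(max(variance, 0.0))  # clamp round-off negatives to zero


# Hypothetical data (NOT from the NIST collection): a large offset with a
# small spread. The exact sample standard deviation is 1.
data = [1e9 + 1, 1e9 + 2, 1e9 + 3]
reference_std = 1.0

for name, func in [("two-pass", std_two_pass), ("one-pass", std_one_pass)]:
    result = func(data)
    digits = log_relative_error(result, reference_std)
    print(f"{name}: computed={result!r}, agreeing digits ~ {digits}")
```

With real reference data, the hypothetical values would be replaced by a dataset and its certified results as published for the relevant statistical method.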
