Public Datasets
Access a wide range of open-source datasets for research and projects.
Common Crawl
- Massive archive of web data for NLP and machine learning.
Data.gov
- Open data from the U.S. government.
Datahub.io
- Platform for finding, sharing, and publishing data.
Earth Data
- NASA's repository of Earth science data.
Eurostat
- European social, economic, and population data.
Freesound
- Collaborative database of audio samples and music data.
Gapminder
- Global development data on health, wealth, and population.
Google Dataset Search
- Search engine for datasets across the web.
Hugging Face Datasets
- NLP and machine learning datasets.
IMDb Datasets
- Movie, TV show, and actor information.
Kaggle Datasets
- Large collection of community-contributed public datasets.
NOAA Climate Data Online
- Weather and climate data.
Our World in Data
- Research and data on global problems such as poverty, health, and climate change.
Quandl
- Financial, economic, and alternative datasets (now part of Nasdaq Data Link).
TensorFlow Datasets
- Ready-to-use datasets for machine learning.
UCI Machine Learning Repository
- Widely used benchmark datasets for machine learning.
UK Data Service
- The UK’s largest collection of social and economic data.
USGS Earth Explorer
- Earth science data including satellite imagery.
World Bank Open Data
- Global development data.
Specialized Datasets
NCI Cancer Data
- Cancer research datasets from the U.S. National Cancer Institute.
Energy Star Data
- Electricity and energy efficiency datasets.
FCC Data
- Telecommunication data from the U.S. Federal Communications Commission.
COCO Dataset
- Image dataset for object detection and segmentation in machine learning.