Consider this a sort of public-facing list of datasets I’ve found interesting, have played with, or want to play with.
List of Datasets
The peer-to-peer credit marketplace Lending Club publishes data on issued and declined loans. https://www.lendingclub.com/info/download-data.action
World Health Organisation
The WHO publishes many interesting datasets at http://www.who.int/research/en/. They don’t, however, do a great job of linking to the raw datasets: http://www.who.int/healthinfo/statistics/mortality_rawdata/en/ is a comprehensive dataset providing mortality rates for all reporting countries, but it is difficult to find from the navigation.
New York Times
The New York Times has a fairly comprehensive open API, documented at http://developer.nytimes.com/docs
The Chicago public cycle hire scheme (akin to New York’s Citi Bike or London’s Barclays Boris Bikes) published data on 750 000 trips for its data challenge. http://divvybikes.com/datachallenge
Outpan aims to provide a single database for turning barcodes into product information. It is not very complete yet. http://www.outpan.com/index.php
In the interest of transparency, the Centers for Medicare & Medicaid Services publishes a dataset on Medicare usage. It could complement some of the other medical datasets available. http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Physician-and-Other-Supplier.html
List of Lists of Datasets
- http://rs.io/2014/05/29/list-of-data-sets.html – 100+ Interesting Data Sets for Statistics. Pretty much all of these are really great!
Looking for interesting data sets? Here’s a list of more than 100 of the best stuff, from dolphin relationships to political campaign donations to death row prisoners.
- http://bagrow.com/dsv/datasets.html – contains links to some datasets and some lists of datasets. Perhaps this should come under a section called List of Lists of Lists of Datasets.