SQLite Data Starter Packs | Public Affairs Data Journalism at Stanford | Fall 2016


This is a collection of public datasets, conveniently packaged as SQLite databases, for you to practice on. You don’t have to worry about cleaning or importing the data: just download the SQLite database files and query them from your favorite SQLite client.
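As a rough sketch of what querying one of these files can look like, the Python snippet below uses the standard-library `sqlite3` module to list the tables in a database. The helper name `list_tables` is hypothetical, and an in-memory database with a made-up `people` table stands in for a downloaded file; to try it on a real starter pack, connect to the path of the file you downloaded instead.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the connected SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Stand-in for a downloaded file. With a real starter pack you would write
# something like sqlite3.connect("path/to/downloaded-file.sqlite") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")  # made-up table

tables = list_tables(conn)
print(tables)
```

Listing the tables first is a quick way to confirm that the table count matches what the page reports for that dataset.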

| Dataset | Size | Tables |
| --- | --- | --- |
| SimpleFolks for Simple SQL | 0.01 MB | 3 |
| American Community Survey 1-Year Data for 2015 | 0.25 MB | 3 |
| M3.0+ Earthquakes in the Contiguous U.S., 1995 through 2015 | 52.3 MB | 1 |
| S.F. Food Inspections (LIVES) | 16.4 MB | 1 |
| Census 2000 Surnames | 23.3 MB | 1 |
| Dallas Police Officer-Involved Shootings | 0.4 MB | 3 |
| Florida Death Row Roster | 0.1 MB | 1 |
| Salaries of City Officials from the California Peninsula | 65.9 MB | 1 |
| SFPD Incidents, 2012 through 2015 | 98.3 MB | 1 |
| San Francisco Restaurant Health Inspections | 9.8 MB | 3 |
| Social Security Administration Baby Names, 1980 through 2015 | 81.0 MB | 1 |
| Social Security Administration Baby Names 2015 for All States | 11.4 MB | 1 |
| California School SAT Performance and Poverty Data | 14.8 MB | 3 |
| Gendered Baby Names 2015 | 19.6 MB | 1 |
| Gender assessment of Hollywood Reporter's 2016 Power 100 Rankings | 1.6 MB | 2 |

About the datasets

SimpleFolks for Simple SQL

To make learning new SQL syntax easier, this is a very simple, very small database of people who go by their first names only and who live in a world in which they own pets and homes.
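A database this small is convenient for trying out joins. The sketch below counts pets per person using Python's `sqlite3` module. The table names (`people`, `pets`), the column names, and the rows are all assumptions made up for illustration; they are not the actual SimpleFolks schema, so check the real tables before adapting the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and made-up rows, for illustration only.
conn.executescript("""
    CREATE TABLE people (name TEXT, age INTEGER);
    CREATE TABLE pets (owner_name TEXT, name TEXT, type TEXT);
    INSERT INTO people VALUES ('Ada', 34), ('Ben', 52);
    INSERT INTO pets VALUES ('Ada', 'Rex', 'dog'), ('Ada', 'Tom', 'cat');
""")

# LEFT JOIN keeps people who own no pets; COUNT(pets.name) skips the NULLs
# produced for them, so their count comes out as 0.
pet_counts = conn.execute("""
    SELECT people.name, COUNT(pets.name) AS pet_count
    FROM people
    LEFT JOIN pets ON pets.owner_name = people.name
    GROUP BY people.name
    ORDER BY people.name
""").fetchall()

print(pet_counts)
```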


Gendered Baby Names 2015

This dataset is a transformation of the 2015 Social Security baby names dataset. Instead of having separate M and F entries for a name such as Leslie, this dataset has one entry for every name, with two additional fields that specify that name’s majority gender and the size of that majority.
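The transformation described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the script used to build the dataset: the function name `collapse_by_name`, the input layout of `(name, sex, count)` rows, the choice to express the majority as a share of all babies with that name, and the tie-breaking rule are all hypothetical, and the counts are made up.

```python
from collections import defaultdict

# Made-up counts for illustration only, not real SSA figures.
raw_rows = [
    ("Leslie", "F", 300),
    ("Leslie", "M", 100),
    ("Jordan", "M", 600),
    ("Jordan", "F", 200),
    ("Emma", "F", 500),
]


def collapse_by_name(rows):
    """Collapse per-sex rows into one (majority, share, total) entry per name."""
    totals = defaultdict(lambda: {"F": 0, "M": 0})
    for name, sex, count in rows:
        totals[name][sex] += count

    result = {}
    for name, by_sex in totals.items():
        total = by_sex["F"] + by_sex["M"]
        # Assumption: ties go to "F"; the real dataset may handle them differently.
        majority = "F" if by_sex["F"] >= by_sex["M"] else "M"
        result[name] = (majority, by_sex[majority] / total, total)
    return result


gendered = collapse_by_name(raw_rows)
for name, entry in gendered.items():
    print(name, entry)
```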

This dataset is useful for joining against other tables that contain first names, so that you can estimate the likely gender of each person listed. It includes name data for each state as well as nationwide.
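As a minimal sketch of that kind of join, the snippet below matches a made-up `staff` table against a simplified stand-in for the gendered names table. The table names, column names, and rows are hypothetical, not the dataset's actual schema. Because the real dataset covers each state as well as the nation, a query against it would presumably also need to filter to a single geography to avoid matching a name more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables and made-up rows, for illustration only.
conn.executescript("""
    CREATE TABLE gendered_names (name TEXT, gender TEXT, ratio REAL);
    CREATE TABLE staff (full_name TEXT, first_name TEXT);
    INSERT INTO gendered_names VALUES ('Leslie', 'F', 0.75), ('Jordan', 'M', 0.75);
    INSERT INTO staff VALUES
        ('Leslie Doe', 'Leslie'),
        ('Jordan Roe', 'Jordan'),
        ('Zyx Poe', 'Zyx');
""")

# LEFT JOIN keeps staff whose first name has no match; their gender and
# ratio come back as NULL (None in Python) rather than dropping the row.
matched = conn.execute("""
    SELECT staff.full_name, gendered_names.gender, gendered_names.ratio
    FROM staff
    LEFT JOIN gendered_names ON gendered_names.name = staff.first_name
    ORDER BY staff.full_name
""").fetchall()

for row in matched:
    print(row)
```

Using a left join rather than an inner join makes unmatched names visible, which helps when deciding how much of a table the gender estimate actually covers.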
