Skip to content
Evan W. Patton edited this page Apr 20, 2014 · 3 revisions

File

Schema

Database Schema

Requirements

  • PostgreSQL 9.3
  • PostGIS 2.1
  • 5.7 GB disk space
  • 1 GB RAM (possibly less)

Release Notes

  • Converted ERSI Shape file into PostGIS geometry
  • Created model_year table to aggregate years into a smaller table
  • Created make table to aggregate car manufacturers into a singlular table
  • Created model table to aggregate car manufacturer/model pairs into a singular table
  • Created vehicle table to differentiate vehicles appearing in more than one Regulation and Emissions records
  • Created municipality table from grid_250m_shape table
  • Added foreign key relationships between different tables
  • Added indices for make/model names, VINs, muni_ids, g250m_ids
  • Installation instructions

Installation

These instructions have not been fully vetted, so use at your own risk.

Windows

  1. Download Postgres 9.3.4 from EnterpriseDB.
  2. Download PostGIS 2.1.1 from OSGeo.
  3. Install Postgres using the installer.
  4. Install PostGIS using the installer.
  5. Launch pgAdmin III from the Start Menu.
  6. Select the PostgreSQL server instance and select "Tools => Connect" from the menu. Enter the password you set during installation.
  7. From the Edit menu, select New Object > New Database...
  8. Enter a name for the database, e.g. massdrivedata.
  9. Set the owner to postgres.
  10. Close pgAdmin.
  11. Launch SQL Shell (psql) from the Start Menu.
  12. Log in to your newly created database.
  13. Load the SQL file using the \i command: \i 'C:/path/to/downloaded/data.sql'. You might need to put the file into C:/Temp or some similar folder to make Windows let you load it. E.g. \i 'C:\\TEMP\\massdrivedata-1.0.sql'
  14. To quit, type \q

Mac OS X

Using macports:

sudo port install postgresql93 postgresql93-server [email protected]+postgresql93
createdb massdrivedata
bunzip2 -c massdrivedata-1.0.sql.bz2 | psql93 -d massdrivedata

You may also want to install pgAdmin for a GUI for accessing the database.

Debian or Ubuntu

sudo /bin/sh -c "echo deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -c`-pgdg main >> /etc/apt/sources.list.d/pgdg.list"
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt-get update
sudo apt-get install postgresql-9.3 postgresql-9.3-postgis-2.1
wget http://www.evanpatton.com/projects/massdrive/massdrivedata-1.0.sql.bz2
sudo -u postgres createdb massdrivedata
bunzip2 -c massdrivedata-1.0.sql.bz2 | sudo -u postgresql-9.3 psql -d massdrivedata

Sample Queries

  • get number of toyotas in dataset
select count(*)
from vehicle join make using (make_id)
where make_name = 'Toyota';
count
678247
  • get sorted list of percentage of vehicles by manufacturer
select make_name, count(vin_id), (100.0*cast(count(vin_id) as real)/(select count(*) from vehicle)) as percentage
from vehicle join make using (make_id)
group by make_name
order by percentage desc;
make_name count percentage
Toyota 678247 15.0526272804304
Ford 590035 13.0949004380539
Honda 440999 9.78728041265576
Chevrolet 405376 8.9966838576975
Nissan 233808 5.18900146876119
Dodge 205374 4.5579534816831
Jeep 168626 3.74238931803585
Subaru 120538 2.67515165880353
Gmc 110818 2.45943151973063
Hyundai 108722 2.41291409056429
Volkswagen 91616 2.0332732779119
Chrysler 85022 1.88692980084948
Volvo 83531 1.85383939680033
Buick 81972 1.81923983951487
Mercury 81585 1.81065098212586
Bmw 81180 1.80166264299782
Lexus 72906 1.6180342036265
Mazda 71363 1.58378974122017
Mercedes 66898 1.48469607651229
Pontiac 64699 1.43589272406154
Acura 60344 1.3392403366477
Unknown 56289 1.24924597821759
Saturn 52733 1.17032614133043
Cadillac 46589 1.0339697077436
Kia 45094 1.00079052997467
Audi 41331 0.917276653088726
Mitsubishi 38574 0.856089366728231
Harley 36968 0.82044671823532
Lincoln 36220 0.803846032635883
Saab 33251 0.737953739126884
Infiniti 32278 0.716359531789647
Oldsmobile 31048 0.689061612956347
Suzuki 14894 0.330548945612337
Plymouth 14640 0.32491181440611
Scion 14026 0.311285048419406
Isuzu 13100 0.290733932289621
Intl 11780 0.261438604761201
Land Rover 10962 0.243284379065559
Freightliner 9714 0.215586978493235
Mini 9263 0.205577741587691
Porsche 8651 0.191995362460879
Jaguar 7220 0.160236564208478
Yamaha 6815 0.15124822508044
Geo 6668 0.147985790878412
Kawasaki 6439 0.14290349542083
Hummer 3409 0.0756574026851387
Triumph 1142 0.0253448969980723
Workhorse 857 0.0190197694635271
Polaris 787 0.017466229367323
Smart 715 0.0158683024112274
Daewoo 685 0.0152024995128542
Eagle 648 0.0143813426048606
Kw 594 0.0131828973877889
Mgb 495 0.0109857478231574
Alfa Romeo 406 0.00901053255798366
Ferrari 399 0.00885517854836326
Mci 364 0.00807840850026122
snip...
  • compute average odometer reading for every municipality for 2011 inspections sorted descending by odometer
select muni_name, avg_odom
from municipality join
  -- this inner table could be replaced by a view
  ( select muni_id, avg(start_odom) as avg_odom
    from rae_public
    where insp_year = 2011
    group by muni_id) as t using (muni_id)
order by avg_odom desc;
muni_name avg_odom
GOSNOLD 156637.000000000000
HANCOCK 109736.098726114650
WHATELY 107538.352830188679
LAWRENCE 102574.580867518494
WENDELL 101330.714975845411
NEW SALEM 98314.463709677419
HATFIELD 97568.341673182174
PLAINFIELD 96087.983146067416
FREETOWN 94097.785403576607
ASHFIELD 93419.001600000000
CHARLEMONT 92678.308080808081
PETERSHAM 92125.079019073569
BROOKFIELD 91760.321407624633
LEYDEN 91558.610972568579
SHELBURNE 91437.688435374150
PLYMPTON 90992.056917688266
HAWLEY 90856.323809523810
WEST BRIDGEWATER 90798.392344497608
WAREHAM 90245.044320342205
ROYALSTON 90236.733333333333
PHILLIPSTON 90054.643973214286
ORANGE 89918.500492287496
SHUTESBURY 89347.190397350993
BUCKLAND 89312.807106598985
WARWICK 89261.572164948454
AYER 88158.032517807371
BROCKTON 88094.455335740995
NEW BRAINTREE 88093.619607843137
TEMPLETON 87865.369844020797
LAKEVILLE 87664.127798507463
NORTHFIELD 87625.521702127660
HOLLAND 87602.257275902212
MIDDLEBOROUGH 87598.072782819660
HEATH 87379.780000000000
WARREN 87235.190740740741
WORTHINGTON 87230.616519174041
WALES 87182.145794392523
CONWAY 87164.664133738602
SUNDERLAND 87087.528594771242
BRIMFIELD 86973.943371943372
COLRAIN 86955.183673469388
GRANVILLE 86720.506566604128
CUMMINGTON 86622.181451612903
WILLIAMSBURG 86476.799342105263
ROCHESTER 86285.817213114754
BERKLEY 86230.205332409972
ERVING 86153.606965174129
PALMER 86087.850258531541
LEVERETT 86083.598601398601
WINCHENDON 86062.786084381939
PEPPERELL 86052.366598430570
SPRINGFIELD 86028.343141592920
BERNARDSTON 85777.680424528302
MONSON 85742.397087378641
MONTAGUE 85641.875882571535
AVON 85631.151141552511
HOLYOKE 85564.579687500000
snip...