Skip to content

Load the Data into PostGIS

atogle edited this page May 2, 2011 · 2 revisions

Setup the Database and Table

Cool, so let's get our database setup while those files are parsing. These directions assume that Postgres is running on same host. You can specify the Postgres host, port, and other things. Typing man psql is very helpful.

  1. Set the user as your Postgres user. This is so that we don't need to specify our user in our scripts over and over again. sudo su myusername
  2. Create a new spatial database. Most Postgres/PostGIS installations will create a template of a spatial database to make it easy to create new ones (like we need to do now). This template is often called template_postgis, but not always so it's always good to check. You can create a new database using the spatial template via the command line by typing createdb -T template_postgis mydbname. More helpful info on createdb can be found here and PostGIS table setup here.
  3. Great. Let's create a new table in our database. We created a handy SQL script to do all of that for you. First go from the ruby directory to the sql directory. cd ../sql
  4. Now run the SQL script by typing psql -d mydbname -f create_census_tables.sql. Nice, now we have a table to populate, complete with indexes and a spatial column.

Section Summary

sudo su myusername
createdb -T template_postgis mydbname
cd ../sql
psql -d mydbname -f create_census_tables.sql

Populate the Table

Now let's populate our table with delicious 2010 census! We're going to do this by doing a \copy of our csv files directly into the database. This may take an hour or so, but believe me that this is much faster than doing individual INSERTs.

  1. We wrote a handle shell script to automatically iterate through your csv files and \copy them into the database. It lives in the sh directory so let's go there. cd ../sh
  2. We also want to make sure we have the right permissions so that we can execute this file. I'm running this on my local machine so the easiest thing is to use chmod +x db_copy_merged_csv.sh.
  3. Cool. Now let's run it. This will probably prompt you for your Postgres user password once per state, which is annoying, but I haven't fixed it yet. But hey, it's open source so YOU can fix it too. =) ./db_copy_merged_csv.sh
  4. Awesome! You've been really patient and all of the raw data has been loaded into the database. The last step is to set the geometry column based on the interpolated latitude and longitude columns (INTPTLAT and INTPTLON). Go back to the sql directory by typing cd ../sql and then run psql -d mydbname -f set_the_geom.sql.

Section Summary

cd ../sh
chmod +x db_copy_merged_csv.sh
./db_copy_merged_csv.sh
cd ../sql
psql -d mydbname -f set_the_geom.sql

It's Over!

Sweet! You're done! You now have a beautifully parsed PostGIS database to query at your pleasure. For example, log in and run this query to find the nearest city to lon/lat (-122.4, 37.788): SELECT *, ST_Distance_Sphere("the_geom", ST_SetSRID(ST_Point(-122.4, 37.788), 4269)) AS "dist" FROM census WHERE "SUMLEV" = '160' ORDER BY "dist" LIMIT 1