Jonathan Sandoe edited this page Nov 26, 2018 · 1 revision

NHANES data portal

This plugin supplies data from NHANES, the National Health and Nutrition Examination Survey. This is a big-deal survey given every year to a carefully chosen sample of Americans. The data here are from 2003, but it would be cool to have more.

As with the ACS data, users get to choose attributes and a sample size. But this time, the attributes are health-y, so they include things like height, weight, blood pressure, arm length, cholesterol, glucose, and how many sex partners you've had.

The data are stored in a MySQL database, so we need PHP to access that. Details appear in our page about that process.

Files

  • nhanes.html: The usual: the layout plus links to the javascript files.
  • nhanes.js: The main controller; it also defines the global nhanes. whence is defined here as well.
  • nhanes.css: Styles for the page.
  • nhanes.constants.js: Constants. These include version and the php paths.
  • nhanes.ui.js: In charge of updating the UI so that visibility and text reflect the state of the user's choices. This also includes creating the html for stuff like lists of checkboxes for the attributes, etc.
  • nhanes.userActions.js: Has the methods invoked by user actions, such as changeAttributeCheckbox() and pressGetCasesButton().
  • nhanes.DBconnect.js: This is the portal to the php. It constructs the commands that will be sent as POST variables, which nhanes.php will process to make queries for the DB and return here with the results (e.g., arrays of objects that represent cases).
  • nhanes.CODAPconnect.js: This file is in charge of sending the data we receive on to CODAP, and of all other CODAP interactions. These include making new attributes in the dataset if the user has selected attributes for which we had no data before, and automatically opening the case table when the first data are received.
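
To make the role of nhanes.DBconnect.js concrete, here is a minimal sketch of packing a "get cases" command into POST variables. It is not the plugin's actual code: the function name buildGetCasesRequest and the variable names command, n, and atts are hypothetical, and the real names that nhanes.php expects may differ.

```javascript
// Hypothetical sketch: pack a command for nhanes.php into POST variables.
// The keys "command", "n", and "atts" are assumptions, not the real protocol.
function buildGetCasesRequest(attributeNames, sampleSize) {
  const body = new URLSearchParams();
  body.set("command", "getCases");         // what we want the php to do
  body.set("n", String(sampleSize));       // how many cases to return
  body.set("atts", attributeNames.join(",")); // which attributes to fetch

  // An options object suitable for fetch(url, options).
  return {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: body.toString(),
  };
}
```

The result could be passed to fetch() along with the php path from nhanes.constants.js, with the response parsed as JSON to recover the array of case objects.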

php

These are in the main folder:

  • nhanes.php: The main php file, where the commands (telling what to do, how many cases to get, what attributes, etc.) are converted into MySQL queries, using the PDO API.
  • nhanes.establishCredentials.php: Contains the path to the secret credentials file. Defines values for the database password, username, etc.

MySQL Tables

This is more complex than the set for acs. There is not just one table holding the actual data; instead, the data are split across several tables, apparently according to which people get tested for what. For example, children are not asked about sexual partners. The demog (demographics) table appears to be a master that contains everybody. The tables are linked by unique person IDs stored in a column called SEQN (presumably "sequence number").

As a consequence, the queries that MySQL ultimately sees are more complicated than the usual ones we make. Here is one that asks for sex, age, diastolic blood pressure, and cholesterol:

SELECT t1.RIAGENDR,t1.RIDAGEYR,t3.BPXDAR,t4.LBXSCH 
     FROM demog as t1,bp as t3,biochem as t4 
     WHERE t1.SEQN = t3.SEQN AND t1.SEQN = t4.SEQN ORDER BY RAND( ) LIMIT 10

The elements of that query are constructed in nhanes.DBconnect.getCasesFromDB().
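
As a rough illustration of how such a query can be assembled, here is a minimal sketch. It is not the code in getCasesFromDB(): the helper name buildCaseQuery and the { name, table } shape of its input are assumptions, and it numbers its aliases consecutively (t1, t2, t3) rather than using the fixed aliases seen in the example above.

```javascript
// Hypothetical sketch, not the actual getCasesFromDB() implementation.
// Each variable is assumed to look like { name: "BPXDAR", table: "bp" }.
function buildCaseQuery(variables, sampleSize) {
  const MASTER = "demog"; // the table that appears to contain everybody
  const IDENT = /^[A-Za-z_][A-Za-z0-9_]*$/; // refuse anything that is not a plain identifier

  const tables = [MASTER];
  for (const v of variables) {
    if (!IDENT.test(v.name) || !IDENT.test(v.table)) {
      throw new Error("Illegal column or table name");
    }
    if (!tables.includes(v.table)) tables.push(v.table);
  }

  const limit = Number(sampleSize);
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("Sample size must be a positive integer");
  }

  // demog is always t1; other tables get t2, t3, ... in order of first use.
  const alias = (table) => "t" + (tables.indexOf(table) + 1);
  const columns = variables.map((v) => alias(v.table) + "." + v.name);
  const from = tables.map((t) => t + " AS " + alias(t));
  // Link every extra table back to demog through SEQN.
  const links = tables.slice(1).map((t) => "t1.SEQN = " + alias(t) + ".SEQN");
  const where = links.length ? " WHERE " + links.join(" AND ") : "";

  return "SELECT " + columns.join(",") + " FROM " + from.join(",") +
    where + " ORDER BY RAND() LIMIT " + limit;
}
```

Because table and column names cannot be bound as PDO parameters, a builder like this should only accept names that are checked against a known list (for example, the contents of varlist) before they reach the query string.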

data tables

  • demog: 10,122 records, perhaps the entire 2003 sample. Has demographic information, encoded. We will speak of decoding soon.
  • biochem: biochemistry, aka bloodwork. 6,990 records, so not everybody got these tests.
  • bmx: Not bikes. Body measurements. 9,041 records, so almost everyone.
  • bp: Blood pressure, and other related quantities learned in an exam rather than in the lab: pulse, whether you've just had coffee, etc.
  • sexactivity: Like it sounds. 2,993 records, apparently only persons between 20 and 60.

tables about the data

  • metatable: a table listing the five tables above.
  • varlist: 154 records. A table with one record for each variable. One field is TABLEID, which links to metatable and tells you which table the data for that variable come from. Records also include units, descriptions from the documentation, and NAMEOUT, which is the name we give to the variable when we display it in CODAP. For example, this table tells you that the variable named RIAGENDR will be displayed as Sex.
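
The way these two tables work together can be sketched as below. Only TABLEID and NAMEOUT are field names taken from the description above; the NAME and ID columns, the TABLEID values, the NAMEOUT for BPXDAR, and both function names are assumptions made for illustration.

```javascript
// Illustrative rows only. NAME, ID, and the TABLEID values are assumptions.
const metatable = [
  { ID: 1, NAME: "demog" },
  { ID: 3, NAME: "bp" },
];
const varlist = [
  { NAME: "RIAGENDR", TABLEID: 1, NAMEOUT: "Sex" },
  { NAME: "BPXDAR", TABLEID: 3, NAMEOUT: "DiastolicBP" }, // this NAMEOUT is made up
];

// Find which data table holds a variable, and what to call it in CODAP.
function describeVariable(name) {
  const v = varlist.find((row) => row.NAME === name);
  if (!v) return null; // not a variable we know about
  const t = metatable.find((row) => row.ID === v.TABLEID);
  return { table: t ? t.NAME : null, displayName: v.NAMEOUT };
}

// Rename the keys of one case from database names to display names.
// Keys that are not in varlist are passed through unchanged.
function renameForDisplay(caseObj) {
  const out = {};
  for (const [name, value] of Object.entries(caseObj)) {
    const info = describeVariable(name);
    out[info ? info.displayName : name] = value;
  }
  return out;
}
```

For instance, describeVariable("RIAGENDR") reports that the variable lives in demog and is shown as Sex.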

decoding table

Categorical values in the data tables are impenetrable: they are coded, mostly as integers. So to display these values we need to decode them.

  • decoder: Three columns: VARNAME, CODE, and RESULT. So you will see, for example, {VARNAME : RIAGENDR, CODE : 1, RESULT : Male}. When we output a value, if the varname/code combination exists in this table, we substitute RESULT for CODE.
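
The substitution rule can be sketched in a few lines. This is an illustration under assumptions, not the plugin's code: makeDecoder and decodeCase are hypothetical names, and the sample row is the one quoted above.

```javascript
// Rows shaped like the decoder table: VARNAME, CODE, RESULT.
const decoderRows = [
  { VARNAME: "RIAGENDR", CODE: 1, RESULT: "Male" },
];

// Build a lookup keyed on "VARNAME|CODE". Using a string key means the
// code 1 and the string "1" (as a DB driver may return it) both match.
function makeDecoder(rows) {
  const lookup = new Map();
  for (const r of rows) {
    lookup.set(r.VARNAME + "|" + r.CODE, r.RESULT);
  }
  return function decode(varname, code) {
    const key = varname + "|" + code;
    // Substitute RESULT when the pair exists; otherwise keep the raw value.
    return lookup.has(key) ? lookup.get(key) : code;
  };
}

// Decode every value in one case object.
function decodeCase(caseObj, decode) {
  const out = {};
  for (const [varname, value] of Object.entries(caseObj)) {
    out[varname] = decode(varname, value);
  }
  return out;
}
```

Numeric variables such as age have no entries in the table, so they pass through untouched.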

And also,

  • searches: One record for every search a user has made. Currently 11,948, but always increasing. Could be fodder for an interesting study!