Pre TSDC Adaptation Paper Edits [Paper reproducibility pt. 1] #118
This script compiles the data treatments performed by Cemal in his analysis notebook, which generated many of the charts used in the paper. The purpose of this script is to facilitate reproducibility of the results in our paper by taking in the raw set of trips as a csv, applying all data treatments, and then saving the results to be loaded into the analysis notebooks. Note that I have not yet had a chance to confirm that this works on the data from TSDC, but it does yield the numbers we quote for participants and trips at the aggregate and program level when run on the raw file Cemal gave me.
Keeping git up to date as I update the paper; changes are messy but need to be kept
Adding files with no outputs, and preserving a file structure similar to that in the PR that includes the changes beyond these
viz_scripts/PaperVizualizations/Abby/CanBikeCO_DataFiltering.ipynb
"\n",
"print(len(data))\n",
"\n",
"a = data[data['AGE']>100]\n",
Please use a more descriptive variable name than `a`. Maybe also add a check alongside the print statement that states whether the number of matching rows is zero or not?
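One way to act on this, as a minimal sketch: the helper name `report_implausible_ages` and its `max_age` parameter are hypothetical and not part of the notebook. Only the `AGE` column and the `> 100` threshold come from the diff above, and the example frame is made up for illustration.

```python
import pandas as pd


def report_implausible_ages(data, max_age=100):
    """Return the rows whose AGE exceeds max_age and print whether any exist.

    Hypothetical helper: the name and the max_age parameter are assumptions.
    """
    implausible_age_rows = data[data['AGE'] > max_age]
    if implausible_age_rows.empty:
        print(f"No rows with AGE > {max_age}")
    else:
        print(f"{len(implausible_age_rows)} rows with AGE > {max_age}")
    return implausible_age_rows


# Tiny made-up frame, for illustration only
example = pd.DataFrame({'AGE': [25, 47, 130]})
flagged = report_implausible_ages(example)
```

Returning the filtered rows keeps the check usable both as a quick printout and as an input to a later filtering step.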
viz_scripts/PaperVizualizations/Abby/CanBikeCO_DataFiltering.ipynb
"data['PINC'] = data['HHINC_NUM'] / data['WORKERS']\n",
"\n",
"# Combine variable categories\n",
"data = data.replace('Gas Car, drove alone', 'Car')\n",
An identical block of code also appears in the `CanBikeCO_Analysis.ipynb` file. Maybe providing a `utility.py` file would be a good idea, so that the block of code common to the different notebooks lives in one place?
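A rough sketch of what such a shared module could look like. The function name `apply_common_treatments`, the `MODE_REPLACEMENTS` dictionary, and the `Mode_confirm` column in the example are hypothetical. Only the `PINC` calculation and the single `'Gas Car, drove alone'` to `'Car'` replacement are taken from the diff above; any further replacements in the notebooks would need to be added to the mapping.

```python
# utility.py (hypothetical shared module imported by both notebooks)
import pandas as pd

# Only the one replacement visible in the diff is listed here.
MODE_REPLACEMENTS = {'Gas Car, drove alone': 'Car'}


def apply_common_treatments(data):
    """Apply the treatments shared across notebooks to a copy of the data."""
    data = data.copy()  # leave the caller's DataFrame untouched
    # Per-worker income, as computed in the notebook
    data['PINC'] = data['HHINC_NUM'] / data['WORKERS']
    # Combine variable categories
    data = data.replace(MODE_REPLACEMENTS)
    return data


# Made-up rows, for illustration only; 'Mode_confirm' is an assumed column name
trips = pd.DataFrame({
    'HHINC_NUM': [60000, 45000],
    'WORKERS': [2, 1],
    'Mode_confirm': ['Gas Car, drove alone', 'Walk'],
})
treated = apply_common_treatments(trips)
```

Each notebook could then call `apply_common_treatments(data)` instead of repeating the block, so a change to the treatments only has to be made once.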
"outputs": [],
"source": [
"# load the data from csv -- useful?\n",
"data = pd.read_csv('trip_program.csv')\n",
Similar to the previous notebook: maybe use a variable name that better reflects the context.
I have been slowly working through these review comments as I find the time. Thank you so much for taking the time to make such thorough suggestions. It's not ready yet, but I think this will be much stronger code once I address the comments -- the changes are being made on the other PR so that I only have to make them once.
replaced by work in e-mission#102
work now in e-mission#102
This PR will now just check in Cemal's code, which I remove in the full PR #102.
To simplify review of #102, I am adding this as a "part 1" to those changes. It includes everything that I did for the paper before I started working with the TSDC data, that is, all commits to #102 that happened before Dec 14th.
I'm not sure that this code would be the most stable on its own, since I kept changing it from here, and most of the work to make sure it was consistent and readable came after I started working with the TSDC data.
I don't think that at this point all the charts in the paper had made it into the "Abby" version of the code, since I was only working on what needed to change in order to submit the paper.
I've also included versions of the code both with and without outputs here. If there's anything else I can do to help streamline review, please let me know!