Script updates for creating panel tables and misc edits #109

Open
2 tasks
quyenkle opened this issue Jul 25, 2024 · 1 comment

Comments

quyenkle commented Jul 25, 2024

Planned panel table updates:

  • Add the missing columns to the tables from 2010 that don't include any information past respondent_rssd.
  • Edit the columns' substring ranges so that values are assigned to the correct columns.
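
As a rough illustration of the substring-range fix, here is a minimal Python sketch of slicing a fixed-width record by column ranges. The layout, the column names other than respondent_id, and the sample record are all made up for the example; the real positions have to come from the panel file specification for each year.

```python
# Hypothetical layout: (column name, start, end), 1-based inclusive positions,
# matching the way a SQL substring(line from start for length) call counts.
LAYOUT = [
    ("activity_year", 1, 4),
    ("respondent_id", 5, 14),
    ("agency_code", 15, 15),
]

def parse_fixed_width(line, layout):
    """Slice one fixed-width record into a dict of column -> value."""
    return {name: line[start - 1:end].strip() for name, start, end in layout}

def overlapping_columns(layout):
    """Return names of columns whose range starts before the previous one ends."""
    problems = []
    previous_end = 0
    for name, start, end in layout:
        if start <= previous_end:
            problems.append(name)
        previous_end = max(previous_end, end)
    return problems

# Made-up record, not real panel data.
record = "2010" + "0000000042" + "9"
parsed = parse_fixed_width(record, LAYOUT)
print(parsed)
print(overlapping_columns(LAYOUT))
```

Checking the ranges for overlaps before loading is one cheap way to catch a value being assigned to the wrong column.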

Misc edits in the ts and lar scripts to correct paths and file types.

quyenkle commented Sep 23, 2024

File errors

ts_2014

  • The original file was named ts_2014 instead of ts_2014.txt.

ts_2015

  • Had to download it from the website and follow the same steps as for ts_2016. The 2015 txt data file itself was fine, so I didn't have to make any changes and it ran normally.

ts_2016

  • Had to download it from the website and edit the create_and_load file associated with this specific file. The default download gave me issues with the \t delimiter, so you'll have to copy the problem line into another editor (for me it was line 2674, where the respondent_id is '0000021122'), delete the extra tab in the data, then paste the line back. If you delete the tab directly in the original text file, it ruins the spacing on the other lines, which is weird.
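
To find a line like that without scrolling through the file, something along these lines might help. It is only a sketch: it assumes the file is plain tab-delimited text and that the most common field count is the correct one. The function name and the sample rows are made up.

```python
from collections import Counter

def find_bad_rows(lines, delimiter="\t"):
    """Return (line_number, field_count) for rows whose field count
    differs from the most common field count in the input."""
    counts = [len(line.rstrip("\r\n").split(delimiter)) for line in lines]
    if not counts:
        return []
    expected = Counter(counts).most_common(1)[0][0]
    return [(number, count)
            for number, count in enumerate(counts, start=1)
            if count != expected]

# Made-up sample rows, not real ts data; the second row has an extra tab.
sample = [
    "2016\t0000000001\tBANK A\n",
    "2016\t0000000002\tBANK\t B\n",
    "2016\t0000000003\tBANK C\n",
]
bad_rows = find_bad_rows(sample)
print(bad_rows)
```

For a real file, the lines can be passed in from an open file handle instead of the sample list.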

The panel code doesn't seem to include the top_holder information; it cuts off after respondent_rssd. Maybe they started collecting top_holder info from 2010 on? Or it didn't show up in the data before 2010, so the SQL never pulled it.

panel_2014

  • I had to download it again; for some reason it was empty in my folder. The formatting is also similar to the 2013 version, so you'll need to figure out the headers for that one again.

panel_2015

  • The substring ranges are a little off.

panel_2016

  • The substring ranges are also a little off.

panel_2021

  • The code referenced arid_2017, which isn't in the data.

lar_2015

  • I need to double-check with someone that the data is right; I'm not entirely sure whether some values should be empty.

lar_2016

  • Same as above; I also had to download it from the website.

lar_2017

  • The script reads the data in oddly. I had to redownload the file and also make the load ignore the headers that were in the original csv data file.
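
For anyone checking the csv outside the load script, here is a small Python sketch of skipping a header row only when one is actually present. The header name and the sample rows are assumptions for illustration, not the real lar_2017 columns.

```python
import csv
import io

def read_rows_skipping_header(handle, header_first_field):
    """Read csv rows, dropping the first row only if it looks like a header."""
    rows = []
    for index, row in enumerate(csv.reader(handle)):
        if index == 0 and row and row[0].strip().lower() == header_first_field:
            continue  # header row: skip it instead of loading it as data
        rows.append(row)
    return rows

# Made-up sample content, not real lar data.
sample_csv = io.StringIO(
    "as_of_year,respondent_id\n"
    "2017,0000000001\n"
    "2017,0000000002\n"
)
rows = read_rows_skipping_header(sample_csv, "as_of_year")
print(rows)
```

Checking the first field rather than always dropping the first line keeps the same code safe for files that ship without headers.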

lar_2018

  • Not really sure why this one didn't work, so I made a separate execution file that only handles the 2018 lar data. I think a '/' is missing from the data path...
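
If the missing '/' really is the cause, it is the kind of bug that plain string concatenation produces. The Python sketch below shows the failure and a join that does not depend on a trailing slash. The directory and file names are hypothetical, not the paths used in the scripts.

```python
from pathlib import PurePosixPath

def data_file(data_dir, name):
    """Join a directory and a file name, with or without a trailing slash."""
    return str(PurePosixPath(data_dir) / name)

# Hypothetical directory and file names.
naive = "data/lar" + "lar_2018.csv"            # separator silently lost
joined = data_file("data/lar", "lar_2018.csv")
joined_with_slash = data_file("data/lar/", "lar_2018.csv")

print(naive)
print(joined)
print(joined_with_slash)
```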
