Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

style(black): format acquisition with black, line-length=100 #1189

Merged
merged 22 commits into from
Jun 26, 2023

Conversation

dshemetov
Copy link
Contributor

@dshemetov dshemetov commented Jun 5, 2023

A version of #1186 with black's line-length=100. Recommend using GitHub's ignore whitespace feature when looking at diffs.

TL;DR: it mostly looks good. The main jank is, as Katie put it below, "repeated calls should use the same wrap mode".

  • Add .pylintrc configs from covidcast-indicators. Don't enforce pylint, just keep the same configs.
  • Apply pyupgrade for a few minor changes.
  • Use flynt to automatically convert many of the % and .format strings into official f-strings. It was pretty straightforward pip install flynt and flynt src/acquisition -a -l 999 --verbose (-a is aggressive, it won't change % strings otherwise, -l 999 is line-length, it won't change long strings otherwise, --verbose is to have it log its changes). With this, it had a 94/98 success rate on %-style expressions, so I had to go fix a few manually.
  • Write the correct commits into ignore blame refs file, once this PR is finalized

src/acquisition/afhsb/afhsb_csv.py Outdated Show resolved Hide resolved
src/acquisition/afhsb/afhsb_csv.py Outdated Show resolved Hide resolved
src/acquisition/covid_hosp/common/database.py Outdated Show resolved Hide resolved
sql = (
f"INSERT INTO `{self.table_name}` (`id`, `{self.publication_col_name}`, {columns}) "
f"VALUES ({value_placeholders})"
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more vertical space waste

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

disagree; in this case highlighting INSERT INTO and VALUES improves scan-ability

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separating those two was already in the original code though. black added the parentheses, doubling the the vertical space of this code :/

src/acquisition/covid_hosp/facility/database.py Outdated Show resolved Hide resolved
src/acquisition/covid_hosp/state_daily/database.py Outdated Show resolved Hide resolved
src/acquisition/covid_hosp/state_timeseries/database.py Outdated Show resolved Hide resolved
src/acquisition/ght/ght_update.py Show resolved Hide resolved
src/acquisition/norostat/norostat_add_history.py Outdated Show resolved Hide resolved
Copy link
Contributor

@nmdefries nmdefries left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this one a lot better than with line-length=200. The formatting is acceptable overall and the wrapping style is more readable.

It's not 100% ideal in several places, but there are too many changes here to futz around with manually fixing and freezing more than a few individual lines.

I'd say the most worthwhile use of time would be deciding how we want to format the Columndef("previous_day_admission_adult_covid_confirmed_20-29_7_day_sum", "previous_day_admission_adult_covid_confirmed_20_29_7_day_sum", int), lines in all the database.py variants, and call it good after that.

Comment on lines 163 to 167
if logger and len(dfs) > 7:
logger.warning(
"expensive operation",
msg="concatenating more than 7 files may result in long running times",
count=len(dfs),
Copy link
Contributor

@nmdefries nmdefries Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better in this version

Comment on lines +197 to +207
"REGION TYPE",
"REGION",
"YEAR",
"WEEK",
"TOTAL SPECIMENS",
"TOTAL A",
"TOTAL B",
"PERCENT POSITIVE",
"PERCENT A",
"PERCENT B",
]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better wrapping here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? I don't notice a change

Copy link
Contributor

@nmdefries nmdefries Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the ambiguity, I was comparing to the line-length: 200 behavior here. Some other comments in this review are also discussing changes compared to the length-200 version, rather than the before-and-after here.

'--test',
action='store_true',
help='do dry run only, do not update the database'
"--test", action="store_true", help="do dry run only, do not update the database"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same with the vertical space comment

Comment on lines +348 to +355
args = [
row["total_specimens"],
row["total_a"],
row["total_b"],
row["percent_positive"],
row["percent_a"],
row["percent_b"],
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not the best use of space but acceptable

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i prefer it

src/acquisition/twtr/healthtweets.py Outdated Show resolved Hide resolved
@dshemetov
Copy link
Contributor Author

dshemetov commented Jun 7, 2023

Absolutely agree with your review comment.

Formatting ColumnDef lines is actually moot, since we have a refactor that removes those lines already.

So I'm just going to exclude covid_hosp from this PR, to avoid conflicts with that PR. We can format that code after those refactors.

Copy link
Contributor

@krivard krivard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Places it falls over:

  • repeated calls should use the same wrap mode
  • same-line comments on too-long lines should be moved to the previous line before applying wrapping rules

otherwise seems pretty good!

zip_path = f"{DELPHI_BASE_DIR}/dropbox_{timestamp}.zip"
print(f"downloading into delphi:{zip_path}")
with ZipFile(zip_path, "w", ZIP_DEFLATED) as zf:
for name in save_list:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is when I wish github let you provide manual alignment markers to make diffs make sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you do "hide whitespace" up in the gear above, it makes these diffs way more readable

src/acquisition/cdcp/cdc_extract.py Outdated Show resolved Hide resolved
src/acquisition/covid_hosp/common/database.py Outdated Show resolved Hide resolved
sql = (
f"INSERT INTO `{self.table_name}` (`id`, `{self.publication_col_name}`, {columns}) "
f"VALUES ({value_placeholders})"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

disagree; in this case highlighting INSERT INTO and VALUES improves scan-ability

Comment on lines 520 to 526
"--test", action="store_true", help="do dry run only, do not update the database"
)
parser.add_argument(
"--file", type=str, help="load an existing zip file (otherwise fetch current data)"
)
parser.add_argument(
"--issue", type=int, help="issue of the file (e.g. 201740); used iff --file is given"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmmm I'm skeptical about these. if it were just one, fine, but with three of them it's helpful to have whitespace highlight argument alignment

src/acquisition/twtr/healthtweets.py Outdated Show resolved Hide resolved
Comment on lines +197 to +207
"REGION TYPE",
"REGION",
"YEAR",
"WEEK",
"TOTAL SPECIMENS",
"TOTAL A",
"TOTAL B",
"PERCENT POSITIVE",
"PERCENT A",
"PERCENT B",
]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? I don't notice a change

src/acquisition/norostat/norostat_add_history.py Outdated Show resolved Hide resolved
src/acquisition/ght/ght_update.py Show resolved Hide resolved
src/acquisition/afhsb/afhsb_csv.py Outdated Show resolved Hide resolved
@dshemetov dshemetov marked this pull request as ready for review June 21, 2023 20:33
@dshemetov
Copy link
Contributor Author

Trying to test and fix the regex in question that SonarCloud doesn't like, but the norostat code doesn't even seem to be run correctly. It breaks in line 74 of that same file, where the output of html_root.xpath(...) is just [] for me 🤷 @krivard this makes me think this source doesn't run. Is it worth fixing?

@dshemetov dshemetov changed the title [test] try black line-length=100 style(black): format acquisition with black, line-length=100 Jun 21, 2023
@krivard
Copy link
Contributor

krivard commented Jun 22, 2023

this makes me think this source doesn't run. Is it worth fixing?

we're still running a norostat cronicle job daily, but it's been failing since November 2020. Checking with R&R whether to keep it or not.

@krivard krivard mentioned this pull request Jun 23, 2023
4 tasks
@krivard
Copy link
Contributor

krivard commented Jun 23, 2023

We can kill norostat acquisition

@dshemetov dshemetov changed the base branch from dev to ds/del-norostat-acquisition June 23, 2023 18:41
@dshemetov
Copy link
Contributor Author

Dropped Norostat changes from this and changed the base to the no-Norostat PR

Base automatically changed from ds/del-norostat-acquisition to dev June 23, 2023 19:51
@sonarcloud
Copy link

sonarcloud bot commented Jun 26, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 42 Code Smells

No Coverage information No Coverage information
2.3% 2.3% Duplication

Copy link
Contributor

@krivard krivard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

thanks for fixing all the fmt: on|off cases!

Copy link
Contributor

@nmdefries nmdefries left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@dshemetov dshemetov merged commit 0609700 into dev Jun 26, 2023
@dshemetov dshemetov deleted the ds/format100 branch June 26, 2023 20:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants