[Issue #1363] Update the Opportunity summary database models based on updated database schema plan #1402
Conversation
estimated_total_program_funding: Mapped[int | None] = mapped_column(BigInteger)
award_floor: Mapped[int | None] = mapped_column(BigInteger)
award_ceiling: Mapped[int | None] = mapped_column(BigInteger)
Did some digging through the data and found several opportunities over the ~2.1 billion limit of an ordinary integer in Postgres.
For example, one for about $14 billion:
https://grants.gov/search-results-detail/349234
I did test that writing very large values to the DB works uneventfully (note that Python itself has no practical limit on integer size).
Looks good, minor suggestions only. Thanks.
LookupColumn(LkOpportunityStatus),
ForeignKey(LkOpportunityStatus.opportunity_status_id),
)
opportunity_id: Mapped[int] = mapped_column(ForeignKey(Opportunity.opportunity_id))
If I remember right, it's good to have index=True on FKs so that joins and lookups on them are faster.
I hadn't yet added any of the indexes - I was going to add them when I set up the filters. I'll make a note of it on that ticket.
forecasted_post_date: Mapped[date | None]
forecasted_close_date: Mapped[date | None]
Not sure if there's any benefit for us, but I found out there's a daterange type in PostgreSQL:
https://docs.sqlalchemy.org/en/20/dialects/postgresql.html#range-and-multirange-types
I could see cases where that might be helpful, but I think we'd generally want them separate going forward. It's possible for one of the dates to be null, which would make a range potentially confusing to interpret.
Summary
Fixes #1363
Time to review: 10 mins
Changes proposed
Updated the DB models for the opportunity tables
Updated the factories to handle creating a current opportunity summary in a few scenarios
Temporarily disabled some validations in the API tests (to be fixed in #1364)
Context for reviewers
https://app.gitbook.com/o/cFcvhi6d0nlLyH2VzVgn/s/v1V0jIH7mb7Yb3jlNrgk/engineering/learnings/opportunity-endpoint-data-model#overview provides most of the details on the updated schema. In short, opportunity summary was adjusted to have its own primary key and allow multiple opportunity summaries per opportunity, along with some shuffling of fields.
There are two migrations in this PR: one to delete the opportunity summary table, and another to add/modify all of the other tables. Because we changed the primary key and how the connections between the tables work, Alembic couldn't automatically detect the changes necessary for creating the tables. While this could be written manually, it's much cleaner and simpler to drop and remake the table entirely. These tables aren't populated in any non-local environment, so this has no impact.
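The drop-and-remake approach can be illustrated with a small sketch (SQLite in place of Postgres, plain MetaData operations in place of Alembic's op.drop_table/op.create_table, and a trimmed-down column set):

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, inspect

engine = create_engine("sqlite://")  # stand-in for the real database
metadata = MetaData()

# Old shape: opportunity_id doubled as the primary key
old = Table(
    "opportunity_summary", metadata,
    Column("opportunity_id", Integer, primary_key=True),
)
metadata.create_all(engine)

# Migration 1: drop the old table outright rather than ALTERing it
old.drop(engine)
metadata.remove(old)

# Migration 2: recreate it with its own surrogate primary key, which
# allows multiple summaries per opportunity
new = Table(
    "opportunity_summary", metadata,
    Column("opportunity_summary_id", Integer, primary_key=True),
    Column("opportunity_id", Integer),
)
new.create(engine)

columns = [c["name"] for c in inspect(engine).get_columns("opportunity_summary")]
```

Since the table was empty outside local environments, the intermediate drop loses nothing.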
Additional information
Running the `db-seed-local` script generates data locally as expected. In the current table, only 25 records are created, as 5 opportunities have no summary.
The factories generate the opportunity summary with varying information, setting certain fields depending on the scenario (e.g., only setting forecasted fields for forecasts).
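The scenario-dependent field logic can be sketched in plain Python (the real code uses factory classes, and all names and dates here are illustrative):

```python
from datetime import date

def build_summary(is_forecast: bool) -> dict:
    """Build a dict of summary fields, only populating forecast-specific
    fields when the record actually represents a forecast."""
    summary = {
        "is_forecast": is_forecast,
        "post_date": date(2024, 1, 1),
    }
    if is_forecast:
        # forecast-only fields are set exclusively for forecast records
        summary["forecasted_post_date"] = date(2024, 2, 1)
        summary["forecasted_close_date"] = date(2024, 4, 1)
    return summary
```

In this simplified sketch, non-forecast summaries simply omit the forecasted fields rather than carrying them as nulls.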