Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Issue #1363] Update the Opportunity summary database models based on updated database schema plan #1402

Merged
merged 10 commits into from
Mar 7, 2024

Conversation

chouinar
Copy link
Collaborator

@chouinar chouinar commented Mar 5, 2024

Summary

Fixes #1363

Time to review: 10 mins

Changes proposed

Updated the DB models for the opportunity tables

Update the factories to handle creating a current opportunity summary in a few scenarios

Temporarily disable some validations in the API tests (To be fixed in #1364 )

Context for reviewers

https://app.gitbook.com/o/cFcvhi6d0nlLyH2VzVgn/s/v1V0jIH7mb7Yb3jlNrgk/engineering/learnings/opportunity-endpoint-data-model#overview provides most of the details regarding the updated schema, but in short, opportunity summary was adjusted to have its own primary key and allow for multiple opportunity summaries per opportunity - as well as some shuffling of fields.

There are two migrations in this PR, one to first delete the opportunity summary table, and the other to add/modify all of the other tables. Because we changed the primary key and how the connections between the tables work, Alembic couldn't automatically detect the changes necessary for creating the tables. While this could be modified manually, it ends up much cleaner / simpler to just drop and remake the table entirely instead. Non-locally we've not populated these tables so this has no impact.

Additional information

Running the db-seed-local script generates data locally as expected:
Screenshot 2024-03-05 at 1 50 00 PM

In the current table, only 25 records are created as 5 opportunities have no summary:
Screenshot 2024-03-05 at 1 50 57 PM

The factories generate the opportunity summary with varying information, setting certain fields depending on the scenario (eg. only setting forecasted fields for forecasts):
Screenshot 2024-03-05 at 1 51 59 PM

@chouinar chouinar requested review from jamesbursa and removed request for sumiat March 5, 2024 18:53
@github-actions github-actions bot added documentation Improvements or additions to documentation python api database labels Mar 5, 2024
@chouinar chouinar removed the request for review from widal001 March 5, 2024 18:53
Comment on lines +95 to +97
estimated_total_program_funding: Mapped[int | None] = mapped_column(BigInteger)
award_floor: Mapped[int | None] = mapped_column(BigInteger)
award_ceiling: Mapped[int | None] = mapped_column(BigInteger)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some digging through the data and found several opportunities over the ~4 billion limit of an ordinary integer in Postgres.

For example, one for about $14 billion:
https://grants.gov/search-results-detail/349234

I did test that writing to the DB with very large values works uneventfully (note Python itself doesn't have a practical limit on integers)

Copy link
Collaborator

@jamesbursa jamesbursa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, minor suggestions only. Thanks.

api/src/db/models/opportunity_models.py Outdated Show resolved Hide resolved
LookupColumn(LkOpportunityStatus),
ForeignKey(LkOpportunityStatus.opportunity_status_id),
)
opportunity_id: Mapped[int] = mapped_column(ForeignKey(Opportunity.opportunity_id))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember right, it's good to have index=True on FKs so that more queries can be faster.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't yet added any of the indexes - was going to add that when I setup the filters. I'll make a note of it on that ticket.

Comment on lines +103 to +104
forecasted_post_date: Mapped[date | None]
forecasted_close_date: Mapped[date | None]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if there's any value or benefit for us, but I found out there's a daterange type in PostgreSQL.

https://docs.sqlalchemy.org/en/20/dialects/postgresql.html#range-and-multirange-types

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see cases when that might be helpful, but I think we'd generally want them separate going forward, its possible one can be null which would potentially make it confusing to interpret as a range.

api/src/db/models/opportunity_models.py Outdated Show resolved Hide resolved
api/src/db/models/opportunity_models.py Outdated Show resolved Hide resolved
@chouinar chouinar merged commit f72d63b into main Mar 7, 2024
8 checks passed
@chouinar chouinar deleted the chouinar/next-iteration-of-db-tables branch March 7, 2024 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api database documentation Improvements or additions to documentation python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Task]: Adjust the database schema of the opportunity tables
3 participants