
feat: add table eligible_deals #485

Merged · 5 commits · Jan 9, 2025

Conversation

@bajtos (Member) commented Jan 8, 2025

We need to enhance the list of deals eligible for retrieval testing with new columns (piece_cid, piece_size). This change involves several components, so it would be tricky to deploy incrementally without breaking spark-api and fil-deal-ingester.

Since the current table retrievable_deals has a confusing name, I decided to create a new table eligible_deals that we can incrementally populate without breaking the existing operations, and switch over when everything is ready.

This pull request adds a DB migration script to set up the new table.

Links:

@bajtos bajtos requested a review from juliangruber January 8, 2025 08:58
Signed-off-by: Miroslav Bajtoš <[email protected]>
@bajtos bajtos force-pushed the feat-eligible-deals-with-piece-info branch from fed4add to 41d8d50 on January 8, 2025 09:00
@bajtos bajtos requested a review from NikolasHaimerl January 8, 2025 09:01
@bajtos bajtos marked this pull request as ready for review January 8, 2025 09:03
@juliangruber (Member)
> This pull request adds a DB migration script to set up the new table.

What is the motivation for not including the logic for populating the table in this PR as well?

@bajtos (Member, Author) commented Jan 8, 2025

> This pull request adds a DB migration script to set up the new table.
>
> What is the motivation for not including the logic for populating the table in this PR as well?

Great question.

First of all, the table is populated by fil-deal-ingester, see filecoin-station/fil-deal-ingester#30

In the past, I always included ~10k deals to seed the table. Eventually, we ended up with more than 5 old schema migration scripts populating 10k deals that were deleted by subsequent migration scripts. Because DB schema migrations are immutable, we have to keep them around forever. I find that rather inefficient.

So this time, I want to try something else:

  • Keep the table empty,
  • If tests need some deals in the DB, populate the deals from a test data builder script.

These test data builders can be updated as we evolve the schema.
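To illustrate the test-data-builder idea, here is a minimal sketch in Python, using an in-memory SQLite database as a stand-in for the real Postgres one. All column names except piece_cid and piece_size are assumptions, and given_eligible_deal is a hypothetical helper, not code from this repo:

```python
import sqlite3

# Hypothetical defaults; only piece_cid and piece_size are named in the PR.
DEFAULTS = {"miner_id": "f0123", "piece_cid": "baga...", "piece_size": 34359738368}

def given_eligible_deal(conn, **overrides):
    """Insert one eligible deal, filling unspecified columns with defaults."""
    row = {**DEFAULTS, **overrides}
    columns = ", ".join(row)
    placeholders = ", ".join("?" for _ in row)
    conn.execute(
        f"INSERT INTO eligible_deals ({columns}) VALUES ({placeholders})",
        tuple(row.values()),
    )

# Usage in a test setup:
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eligible_deals (miner_id TEXT, piece_cid TEXT, piece_size INTEGER)"
)
given_eligible_deal(conn, piece_size=8589934592)
count = conn.execute("SELECT count(*) FROM eligible_deals").fetchone()[0]
print(count)  # prints 1
```

Because every test goes through the builder, a schema change only requires updating the builder's defaults, not a frozen migration script.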

Now that I've written this, I realised a potential flaw:

  • In the current setup with 10k seed deals in the DB, any new migration modifying the table of eligible deals is automatically exercised against real rows just by running the full migration chain from an empty DB to the latest schema version.
  • In the new setup, migrations modifying eligible_deals run against an empty table, so we won't catch certain classes of errors, such as adding a new column with a NOT NULL constraint and no default value.

In that light, I am going to add 10k rows to eligible_deals to seed the new table.
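The failure mode described above can be reproduced in miniature. The sketch below uses Python with SQLite (the real setup is Postgres; column names are assumptions, and the "add a column" step is written as the rebuild-and-copy pattern). The same broken migration passes silently against an empty table, while seeded rows surface the missing default as a constraint violation:

```python
import sqlite3

def run_broken_migration(seed_rows):
    """Rebuild eligible_deals with a new NOT NULL column and no default."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE eligible_deals (miner_id TEXT, piece_cid TEXT)")
    conn.executemany("INSERT INTO eligible_deals VALUES (?, ?)", seed_rows)
    # The buggy migration: the new piece_size column is NOT NULL,
    # but existing rows have no value to put into it.
    conn.execute(
        """CREATE TABLE eligible_deals_v2 (
               miner_id TEXT,
               piece_cid TEXT,
               piece_size INTEGER NOT NULL
           )"""
    )
    conn.execute(
        "INSERT INTO eligible_deals_v2 SELECT miner_id, piece_cid, NULL "
        "FROM eligible_deals"
    )
    return "ok"

print(run_broken_migration([]))  # empty table: prints "ok", bug goes unnoticed
try:
    run_broken_migration([("f0123", "baga...")])
except sqlite3.IntegrityError as err:
    print("seeded table caught the bug:", err)
```

This is exactly the class of error that running the migration chain against ~10k seed rows catches for free.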

@bajtos bajtos force-pushed the feat-eligible-deals-with-piece-info branch from 48a9653 to 7c365eb on January 8, 2025 13:23
@bajtos
Copy link
Member Author

bajtos commented Jan 8, 2025

@juliangruber PTAL again

@juliangruber (Member)
Got it. Again, it was hard to reason about the changes because another repository is responsible for managing the table's data. Can you please add a comment to either migration file, so that a reader can understand where the production data will come from?

@juliangruber (Member) commented Jan 9, 2025

> Got it. Again, it was hard to reason about the changes because another repository is responsible for managing the table's data. Can you please add a comment to either migration file, so that a reader can understand where the production data will come from?

For clarity, I would prefer if the script managing the table's data and the DB schema lived in the same repo. This could, for example, be achieved by having a deals service that takes care of ingestion and exposes the data via a REST API. Or the deal ingester could be another submodule of spark-api.

This comment doesn't affect this PR.

@bajtos (Member, Author) commented Jan 9, 2025

> Got it. Again, it was hard to reason about the changes because another repository is responsible for managing the table's data. Can you please add a comment to either migration file, so that a reader can understand where the production data will come from?
>
> For clarity, I would prefer if the script managing the table's data and the DB schema lived in the same repo. This could, for example, be achieved by having a deals service that takes care of ingestion and exposes the data via a REST API. Or the deal ingester could be another submodule of spark-api.

I agree this is not optimal.

For Spark v2, I envision a new component that will manage the list of eligible deals and provide a REST API to create a sample of active deals. This will allow us to simplify spark-api - the round tracker will call that new REST API to obtain a list of tasks for the newly started round and we won't need to manage the table with eligible deals in spark-api at all.

We will start working on that new component very soon - see Deal Observer: initial implementation (deal activation events).

For now (Spark v1 and v1.5), we need to keep the existing table and fil-deal-ingester service. Having written that, I can imagine bringing fil-deal-ingester into this repository, so that we can make changes to the database schema and the code writing to the tables together in one PR. On the other hand, there were only 6 migration scripts touching the list of eligible deals in the last year, the last one in August 2024.

@juliangruber WDYT? Is it worth investing our time into improving this area?

Signed-off-by: Miroslav Bajtoš <[email protected]>
@bajtos bajtos requested a review from juliangruber January 9, 2025 12:18
@juliangruber (Member)
Sounds great! I think it's fine to keep things separated for now, as Spark v2 will revisit this anyway.

@bajtos bajtos merged commit aaa852b into main Jan 9, 2025
8 checks passed
@bajtos bajtos deleted the feat-eligible-deals-with-piece-info branch January 9, 2025 12:49
Labels: none yet
Projects: Status: ✅ done
2 participants