This repository has been archived by the owner on May 15, 2024. It is now read-only.

Blob status API schema discussion: replicas array has no obvious uniqueness property #216

Open
rvagg opened this issue Nov 7, 2023 · 3 comments · May be fixed by #217

Comments

@rvagg
Member

rvagg commented Nov 7, 2023

Below is an example of what we get back for different "replica" statuses. Note in particular that the same piece with the same SP appears twice even though we don't have two active deals with them: one entry is proposal_expired and the other is active.

Singularity deals with this by giving each "deal" a unique integer ID, but we strip that out and return only { end_epoch, last_verified_at, piece_cid, state }. For an external consumer of this information, it's difficult to track replicas over time without a uniqueness property, or at least a guarantee that the array order is fixed and entries are only ever appended.

I'm thinking that just adding "id": Integer might be the way to address this. Is there a reason we wouldn't want to expose an internal id sequence number?

{
  "id": "3eb42c0c-aa80-4807-96d9-6efe75b3b278",
  "replicas": [
    {
      "provider": "f02401",
      "pieces": [
        {
          "expiration": "2024-10-27T03:40:00Z",
          "lastVerified": "0001-01-01T00:00:00Z",
          "pieceCid": "baga6ea4seaqillixrqejui534uk4vz5seilo3i6jik3qxct2jxpnegzwcwh3ejy",
          "status": "active"
        }
      ]
    },
    {
      "provider": "f0447183",
      "pieces": [
        {
          "expiration": "2024-10-27T00:20:00Z",
          "lastVerified": "0001-01-01T00:00:00Z",
          "pieceCid": "baga6ea4seaqillixrqejui534uk4vz5seilo3i6jik3qxct2jxpnegzwcwh3ejy",
          "status": "active"
        }
      ]
    },
    {
      "provider": "f02837332",
      "pieces": [
        {
          "expiration": "2024-10-27T00:21:00Z",
          "lastVerified": "0001-01-01T00:00:00Z",
          "pieceCid": "baga6ea4seaqillixrqejui534uk4vz5seilo3i6jik3qxct2jxpnegzwcwh3ejy",
          "status": "active"
        }
      ]
    },
    {
      "provider": "f02837332",
      "pieces": [
        {
          "expiration": "2024-10-27T03:40:00Z",
          "lastVerified": "0001-01-01T00:00:00Z",
          "pieceCid": "baga6ea4seaqillixrqejui534uk4vz5seilo3i6jik3qxct2jxpnegzwcwh3ejy",
          "status": "proposal_expired"
        }
      ]
    },
    {
      "provider": "f0447183",
      "pieces": [
        {
          "expiration": "2024-10-27T03:40:00Z",
          "lastVerified": "0001-01-01T00:00:00Z",
          "pieceCid": "baga6ea4seaqillixrqejui534uk4vz5seilo3i6jik3qxct2jxpnegzwcwh3ejy",
          "status": "proposal_expired"
        }
      ]
    }
  ]
}

The Singularity table this comes out of looks like this:

       Column       |           Type           | Collation | Nullable |              Default
--------------------+--------------------------+-----------+----------+-----------------------------------
 id                 | bigint                   |           | not null | nextval('deals_id_seq'::regclass)
 created_at         | timestamp with time zone |           |          |
 updated_at         | timestamp with time zone |           |          |
 last_verified_at   | timestamp with time zone |           |          |
 deal_id            | bigint                   |           |          |
 state              | text                     |           |          |
 provider           | text                     |           |          |
 proposal_id        | text                     |           |          |
 label              | text                     |           |          |
 piece_cid          | bytea                    |           |          |
 piece_size         | bigint                   |           |          |
 start_epoch        | integer                  |           |          |
 end_epoch          | integer                  |           |          |
 sector_start_epoch | integer                  |           |          |
 price              | text                     |           |          |
 verified           | boolean                  |           |          |
 error_message      | text                     |           |          |
 schedule_id        | bigint                   |           |          |
 client_id          | character varying(15)    |           |          |
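
Returning to the suggestion of exposing an id: the sketch below shows, under stated assumptions, how a consumer could track status changes between two polls if each piece carried the deal's integer id. `TrackedPiece`, `Change`, `diff`, and the id values 1 and 2 are all hypothetical; the current API returns no such field.

```go
package main

import "fmt"

// TrackedPiece is a hypothetical piece entry carrying the proposed "id".
type TrackedPiece struct {
	ID       int64
	Provider string
	PieceCid string
	Status   string
}

// Change records a status transition for one id between two polls.
type Change struct {
	ID   int64
	From string
	To   string
}

// diff compares two snapshots by id and returns every entry in curr whose
// status differs from prev. Entries that are new in curr have an empty From.
func diff(prev, curr []TrackedPiece) []Change {
	old := make(map[int64]string, len(prev))
	for _, p := range prev {
		old[p.ID] = p.Status
	}
	var changes []Change
	for _, p := range curr {
		was, ok := old[p.ID]
		if !ok || was != p.Status {
			changes = append(changes, Change{ID: p.ID, From: was, To: p.Status})
		}
	}
	return changes
}

func main() {
	const cid = "baga6ea4seaqillixrqejui534uk4vz5seilo3i6jik3qxct2jxpnegzwcwh3ejy"
	// Two deals for the same piece with the same provider; ids are made up.
	prev := []TrackedPiece{
		{ID: 1, Provider: "f02837332", PieceCid: cid, Status: "proposed"},
		{ID: 2, Provider: "f02837332", PieceCid: cid, Status: "proposed"},
	}
	curr := []TrackedPiece{
		{ID: 1, Provider: "f02837332", PieceCid: cid, Status: "proposal_expired"},
		{ID: 2, Provider: "f02837332", PieceCid: cid, Status: "active"},
	}
	for _, c := range diff(prev, curr) {
		fmt.Printf("id=%d %s -> %s\n", c.ID, c.From, c.To)
	}
}
```

Without the id, the two entries share provider and pieceCid, so the consumer could not say which one expired and which one became active.
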
@rvagg
Member Author

rvagg commented Nov 7, 2023

One discussion point here is the translation of "deal" to "replica": is there a 1:1 mapping? Does the proposal_expired state expose a problem with this mapping? Initially I imagined that replicas would be unique on pieceCid+provider, but of course you could have the same piece multiple times with a provider, and each of these is still a "replica".

I wonder, if for "replicas", we should be filtering out anything that's not an actual replica—probably everything except active.

Here are the states we can have:

const (
	ModelDealStateProposed        ModelDealState = "proposed"
	ModelDealStatePublished       ModelDealState = "published"
	ModelDealStateActive          ModelDealState = "active"
	ModelDealStateExpired         ModelDealState = "expired"
	ModelDealStateProposalExpired ModelDealState = "proposal_expired"
	ModelDealStateRejected        ModelDealState = "rejected"
	ModelDealStateSlashed         ModelDealState = "slashed"
	ModelDealStateError           ModelDealState = "error"
)
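
A minimal sketch of the filtering idea, assuming the strict reading suggested above (only an active deal counts as a replica). `ModelDealState` is assumed to be a string type; `Piece`, `isReplica`, and `filterReplicas` are hypothetical names, not existing Motion or Singularity functions.

```go
package main

import "fmt"

// ModelDealState is assumed to be a string type; values are from the list above.
type ModelDealState string

const (
	ModelDealStateProposed        ModelDealState = "proposed"
	ModelDealStatePublished       ModelDealState = "published"
	ModelDealStateActive          ModelDealState = "active"
	ModelDealStateExpired         ModelDealState = "expired"
	ModelDealStateProposalExpired ModelDealState = "proposal_expired"
	ModelDealStateRejected        ModelDealState = "rejected"
	ModelDealStateSlashed         ModelDealState = "slashed"
	ModelDealStateError           ModelDealState = "error"
)

// Piece is a hypothetical, trimmed-down piece entry.
type Piece struct {
	Provider string
	State    ModelDealState
}

// isReplica encodes the assumed policy: only an active deal is a replica.
// Widening the policy means changing this one function.
func isReplica(s ModelDealState) bool {
	return s == ModelDealStateActive
}

// filterReplicas returns the pieces that pass isReplica, preserving order.
func filterReplicas(pieces []Piece) []Piece {
	var out []Piece
	for _, p := range pieces {
		if isReplica(p.State) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pieces := []Piece{
		{Provider: "f02837332", State: ModelDealStateActive},
		{Provider: "f02837332", State: ModelDealStateProposalExpired},
		{Provider: "f0447183", State: ModelDealStateProposalExpired},
	}
	for _, p := range filterReplicas(pieces) {
		fmt.Println(p.Provider, p.State)
	}
}
```

Keeping the policy in a single predicate makes it cheap to revisit which states should be reported.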

rvagg added a commit that referenced this issue Nov 7, 2023
@rvagg rvagg linked a pull request Nov 7, 2023 that will close this issue
rvagg added a commit to filecoin-project/motion-sp-test that referenced this issue Nov 7, 2023
rvagg added a commit that referenced this issue Nov 7, 2023
@gammazero
Collaborator

Does this deal with the case where a single replica spans multiple deals (because it exceeds the size of a sector)?

@xinaxu
Collaborator

xinaxu commented Nov 13, 2023

I wonder, if for "replicas", we should be filtering out anything that's not an actual replica—probably everything except active.

  • active for sure
  • proposed and published, maybe, because they may eventually reach the active state. It would still be useful to show those statuses, because it would be good to see the status a few minutes after the POST so the user knows that deals are being proposed.
  • all the rest should be filtered out, as they are not considered replicas

Still, there is a small chance that a single piece gets proposed multiple times with the same provider. I don't think that's a big problem. The consumer of this JSON blob should count the unique provider IDs that have state == "active".
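
The counting rule in the last sentence can be sketched in Go as follows. The `Replica` and `Piece` types and the `countActiveProviders` name are hypothetical; the sample data reuses the provider IDs and statuses from the response at the top of the issue.

```go
package main

import "fmt"

// Piece and Replica are hypothetical, trimmed-down versions of the
// objects in the status response.
type Piece struct {
	Status string
}

type Replica struct {
	Provider string
	Pieces   []Piece
}

// countActiveProviders returns the number of distinct provider IDs that
// have at least one piece with status "active". Repeated entries for the
// same provider are counted once.
func countActiveProviders(replicas []Replica) int {
	providers := make(map[string]struct{})
	for _, r := range replicas {
		for _, p := range r.Pieces {
			if p.Status == "active" {
				providers[r.Provider] = struct{}{}
				break
			}
		}
	}
	return len(providers)
}

func main() {
	replicas := []Replica{
		{Provider: "f02401", Pieces: []Piece{{Status: "active"}}},
		{Provider: "f0447183", Pieces: []Piece{{Status: "active"}}},
		{Provider: "f02837332", Pieces: []Piece{{Status: "active"}}},
		{Provider: "f02837332", Pieces: []Piece{{Status: "proposal_expired"}}},
		{Provider: "f0447183", Pieces: []Piece{{Status: "proposal_expired"}}},
	}
	fmt.Println(countActiveProviders(replicas)) // prints 3
}
```

Five array entries collapse to three providers, so the duplicate proposal_expired entries do not inflate the replica count.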
