Support bulk updates using PATCH #2693

Closed
wants to merge 2 commits

Conversation

laurenceisla
Member

@laurenceisla laurenceisla commented Mar 1, 2023

Closes #1959. It allows bulk updates using PATCH only when the header Prefer: params=multiple-objects is applied, as suggested in #1959 (comment).

@laurenceisla laurenceisla marked this pull request as ready for review March 1, 2023 23:26
Comment on lines +661 to +702
it "does not update when no pk is specified in the body" $ do
request methodPatch "/bulk_update_items"
[("Prefer", "tx=commit"), ("Prefer", "params=multiple-objects")]
[json|[
{ "name": "Item 1" }
, { "name": "Item 2" }
]|]
`shouldRespondWith` 400

Member Author

Here, the error is returned by the database and has this form:

{
    "code": "42703",
    "details": null,
    "hint": null,
    "message": "column pgrst_body.id does not exist"
}

It looks a bit confusing; maybe a SchemaCache error could be created to specify that this is an issue with bulk updates where no PK is found in the body. @steve-chavez WDYT?

Member

@laurenceisla That would be much better 👍

Why does the error assume an id column? Is it hardcoded?

Member Author

Is it hardcoded?

No, in this case, the PK for the table is id. The next test example has a composite PK of id,name and only id is specified in the body. In that case, the message would be:

{
    "code": "42703",
    "details": null,
    "hint": null,
    "message": "column pgrst_body.name does not exist"
}

Member

Ah, got it. Q: What happens when specifying ?columns=id and the body doesn't have an id?

Member Author

Unlike a normal PATCH, it does not give an error. That's because the WHERE clause with the primary key is forced in the bulk update:

WHERE "test"."bulk_update_items"."id" = "pgrst_body"."id"

In this case, pgrst_body.id is null, so the query works but it won't update any row.
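
A minimal illustration, in plain SQL rather than the generated query, of why no rows match in that case:

-- NULL never compares equal to anything, so the forced WHERE clause
-- "bulk_update_items"."id" = "pgrst_body"."id" filters out every row
SELECT 1 WHERE 1 = NULL;  -- returns zero rows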

Member

I think that's fine.

In this case, pgrst_body.id is null, so the query works but it won't update any row.

I guess one way to prevent this would be to count the array elements and compare that to the number of updated rows. If they're equal, we can assume a unique key was passed for every row; if not, we could throw an error.

Not sure if it's worth it though, it could be confusing as well because we can't tell the user which array element had the incorrect unique key.
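
A rough sketch of what that check might look like (names are illustrative, taken from the tests above, and this is not the generated query):

WITH updated AS (
  UPDATE test.bulk_update_items t
  SET    name = b.name
  FROM   jsonb_to_recordset($1::jsonb) AS b(id integer, name text)
  WHERE  t.id = b.id
  RETURNING 1
)
-- true only if every array element matched a row
SELECT jsonb_array_length($1::jsonb) = (SELECT count(*) FROM updated);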

Member Author

Not sure if it's worth it though, it could be confusing as well because we can't tell the user which array element had the incorrect unique key.

Yeah, I'm not sure about this one either; maybe it's better to discuss it later in another PR.

It looks a bit confusing; maybe a SchemaCache error could be created to specify that this is an issue with bulk updates where no PK is found in the body.

I mentioned this but, as you said in another chat, using it with ?columns won't allow verifying whether the JSON body has the id. So I'll go with the names you suggested, maybe json_body or bulk_update_body instead of pgrst_body.

Comment on lines 600 to 615
it "updates with limit and offset taking only the first item in the json array body" $ do
baseTable "bulk_update_items" "id" tblDataBeforeBulk
`mutatesWith`
requestMutation methodPatch "/bulk_update_items?limit=2&offset=1&order=id"
[("Prefer", "params=multiple-objects")]
[json|[
{ "name": "item", "observation": "Lost item" }
, { "name": "item-2 - 2nd", "observation": "Damaged item" }
, { "name": "item-3 - 3rd", "observation": null }
]|]
`shouldMutateInto`
[json|[
{ "id": 1, "name": "item-1", "observation": null }
, { "id": 2, "name": "item", "observation": "Lost item" }
, { "id": 3, "name": "item", "observation": "Lost item" }
]|]

Member

A bit confused by this one. So would any value of limit and offset have the same outcome?

Member Author

In other words, it works like a normal update (without bulk): it takes the first element in the body and updates while applying limit/offset. Now that you mention it, it could be confusing if the user expects a bulk update of some kind; maybe a custom error should be returned here, like the one where limit/offset are not allowed for PUT?

Member

Just to be clear, limit/offset are ignored, right? So limit=100&offset=400 would result in the same?

Now that you mention it, it could be confusing if the user expects a bulk update of some kind; maybe a custom error should be returned here, like the one where limit/offset are not allowed for PUT?

Yeah, I agree. I think we can ban limit/offset for bulk updates.

Member Author

Just to be clear, limit/offset are ignored, right? So limit=100&offset=400 would result in the same?

No, it does not ignore limit/offset, and the result would differ depending on the values. In other words, it works as if it were this test:

it "works with views with an explicit order by unique col" $
baseTable "limited_update_items_view" "id" tblDataBefore
`mutatesWith`
requestMutation methodPatch "/limited_update_items_view?order=id&limit=1&offset=1"
[json| {"name": "updated-item"} |]
`shouldMutateInto`
[json|[
{ "id": 1, "name": "item-1" }
, { "id": 2, "name": "updated-item" }
, { "id": 3, "name": "item-3" }
]|]

Member Author
@laurenceisla laurenceisla Mar 15, 2023

I implemented the error handling for this case and noticed that PATCH and DELETE with limits and without order in the query string throw this error:

| method `elem` ["PATCH", "DELETE"] && not (null qsRanges) && null qsOrder = Left LimitNoOrderError

The thing is that using the Range header with order in the query string works as if we were using limit and offset. Also, using Range without order throws a different error that reaches the database, with this message:

"message": "syntax error at or near \"RETURNING\""

@steve-chavez Is this behavior expected? This is unrelated to this PR, btw, and is reproducible in the main branch.

Member

Also, using Range without order throws a different error that reaches the database, with this message:
"message": "syntax error at or near "RETURNING""

Yeah that seems like a bug. Fixing it in another PR would be great.

Comment on lines +647 to +693
it "rejects a json array that has objects with different keys" $
request methodPatch "/bulk_update_items"
[("Prefer", "tx=commit"), ("Prefer", "params=multiple-objects")]
[json|[
{ "id": 1, "name": "Item 1" }
, { "id": 2, "name": "Item 2" }
, { "id": 3, "name": "Item 3", "observation": "New item" }
]|]
`shouldRespondWith`
[json| {"message":"All object keys must match","code":"PGRST102","hint":null,"details":null} |]
{ matchStatus = 400
, matchHeaders = [matchContentTypeJson]
}

Member

Hm, this is definitely undesirable. Users will want to update different columns for each array element. I think we'll need #2672 to solve this.

Member

Maybe it can be solved without #2672. Say you specify ?columns here or in another test. You could do:

UPDATE bulk_update_items SET observation = COALESCE(pgrst_body.observation, observation) WHERE ..

Meaning that a column will either be updated to the new value in the body or kept at its old value.

This will allow updating rows with a changing number of columns. WDYT?
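
A sketch of how the suggested COALESCE approach could look for the whole bulk update (table and column names assumed from the tests above; not the generated query):

UPDATE test.bulk_update_items t
SET    name        = COALESCE(b.name, t.name),
       observation = COALESCE(b.observation, t.observation)
FROM   jsonb_to_recordset($1::jsonb)
         AS b(id integer, name text, observation text)
WHERE  t.id = b.id;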

Member Author
@laurenceisla laurenceisla Mar 8, 2023

After implementing this I noticed that setting a value to null will fail, e.g.:

PATCH /bulk_update_items?columns=id,name,observation
[
  { "id": 1, "name": "Item 1" }
, { "id": 3, "name": "Item 3", "observation": null }
]

The item with id=3 will keep its old observation and won't be set to null. Is there a conditional that could be used here, so that COALESCE is applied only when needed? Although I see that as difficult, since ?columns does not check the body, right?
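
The problem is COALESCE itself: a null body value falls back to the old column value, so an explicit null can never get through. A minimal illustration:

SELECT COALESCE(NULL::text, 'old observation');  -- returns 'old observation'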

Member

Right. Wolfgang suggested a way to do this with the ? operator on jsonb here.

In #2672 I had the same problem as well, but I decided to go for the alternative of using the || operator on jsonb.

Maybe those approaches could help.
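
For reference, the jsonb ? operator tests key presence, so a key explicitly set to null can still be told apart from a missing key (illustrative snippets only):

SELECT '{"id": 3, "observation": null}'::jsonb ? 'observation';  -- true
SELECT '{"id": 3}'::jsonb ? 'observation';                       -- false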

Member Author
@laurenceisla laurenceisla Mar 25, 2023

I ended up using ? instead. It works now, although I know it isn't too performant. When I used ||, I got this query:

UPDATE "test"."bulk_update_items"
SET "id" = "pgrst_body"."id",
    "name" = "pgrst_body"."name",
    "observation" = "pgrst_body"."observation"
FROM
    (SELECT $1::jsonb AS json_data) pgrst_payload
    CROSS JOIN LATERAL
        (SELECT CASE WHEN jsonb_typeof(pgrst_payload.json_data) = 'array' THEN pgrst_payload.json_data ELSE jsonb_build_array(pgrst_payload.json_data) END AS val) pgrst_uniform_json
    CROSS JOIN LATERAL
        (SELECT jsonb_agg(jsonb_build_object('id', "test"."bulk_update_items"."id",'name', "test"."bulk_update_items"."name",'observation', "test"."bulk_update_items"."observation") || elem) AS val
         FROM jsonb_array_elements(pgrst_uniform_json.val) elem) pgrst_json_old_vals
    CROSS JOIN LATERAL
        (SELECT *
         FROM jsonb_to_recordset(pgrst_json_old_vals.val) AS _("id" integer, "name" text, "observation" text)) pgrst_body
WHERE "test"."bulk_update_items"."id" = "pgrst_body"."id" RETURNING 1;

This gives an error when accessing test.bulk_update_items inside the pgrst_json_old_vals subquery due to the CROSS JOIN. I couldn't find a way to make it work that way; maybe I'm missing something.
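
For reference, the effect the || merge was meant to have on each row, overwriting the old values with the body element's keys while keeping explicit nulls (illustrative values only):

SELECT '{"id": 3, "name": "item-3", "observation": "old"}'::jsonb
       || '{"name": "Item 3", "observation": null}'::jsonb;
-- {"id": 3, "name": "Item 3", "observation": null}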

Comment on lines -219 to +221
= ApplyDefaults -- ^ Use the default column value for the unspecified keys.
| IgnoreDefaults -- ^ Inserts: null values / Updates: the keys are not SET to any value
= ApplyDefaults -- ^ Inserts and Updates: Use the default column value for the unspecified keys.
| IgnoreValues -- ^ Inserts: null values | Updates: the keys are not SET to any value
| ApplyNulls -- ^ Inserts: null values | Updates: SET the values to null

Member Author
@laurenceisla laurenceisla Mar 25, 2023

I noticed that the comment I added in PR #2672 is wrong.

| IgnoreDefaults -- ^ Inserts: null values / Updates: the keys are not SET to any value

When no header is sent, it updates by applying null values instead of "ignoring" the key and keeping the old value.

I added another option as a proposal:

  • IgnoreDefaults is now ApplyNulls and is the default behavior when no header is sent; that is, it updates the unspecified columns to null values.
  • IgnoreValues does not set the keys to any value (in reality it sets them to the original value, but the result is the same). A rough SQL equivalent of both behaviors is sketched below.
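
A rough SQL equivalent of the two behaviors for an unspecified observation column (table and column names taken from the tests above; a sketch only, not the generated query):

UPDATE test.bulk_update_items SET observation = NULL        WHERE id = 1;  -- ApplyNulls
UPDATE test.bulk_update_items SET observation = observation WHERE id = 1;  -- IgnoreValues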

@laurenceisla laurenceisla marked this pull request as ready for review March 25, 2023 08:28
@laurenceisla laurenceisla marked this pull request as draft March 25, 2023 08:28
@steve-chavez
Member

Will close the PR because it got too stale. Also, the design of the feature changed in #1959 (comment).
