
Json to recordset #2542
Merged · 9 commits · Jan 8, 2023

Conversation

@aljungberg (Contributor)

Switches from json_populate_recordset to json_to_recordset, which we will need for data representations. See #2523 for details.

The impossible panic we discussed in #2523 has been eliminated by creating a TypedField type which doesn't allow that situation.
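The idea of making the invalid state unrepresentable can be sketched as follows. This is a hypothetical illustration, not PostgREST's actual definitions: the names `TypedField`, `resolveField`, and the `Map`-based schema cache are all stand-ins.

```haskell
-- A field paired with its resolved type at construction time. Because the
-- only way to obtain a TypedField is via a successful schema cache lookup,
-- downstream code never has to handle a "type missing" case, which is the
-- panic this design rules out.
import qualified Data.Map.Strict as M

data TypedField = TypedField
  { tfName   :: String  -- column name
  , tfIRType :: String  -- resolved PostgreSQL type, e.g. "text"
  } deriving (Eq, Show)

-- Lookup against the (hypothetical) schema cache: unknown columns are
-- rejected here, before any SQL is generated.
resolveField :: M.Map String String -> String -> Maybe TypedField
resolveField columns name = TypedField name <$> M.lookup name columns

main :: IO ()
main = do
  let cols = M.fromList [("id", "integer"), ("title", "text")]
  print (resolveField cols "title")
  print (resolveField cols "helicopter")
```

The design choice here is the standard smart-constructor pattern: move the failure case to the boundary so the rest of the pipeline works with values that are correct by construction.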

Resolved review comments on: src/PostgREST/ApiRequest/Types.hs, src/PostgREST/Error.hs, src/PostgREST/Plan.hs
@wolfgangwalther (Member)

@steve-chavez we should certainly wait for the next minor release before merging this or #2523, to reduce the risk of breaking things and to first get out a release which can be upgraded to from previous major versions.

@aljungberg (Contributor, Author)

aljungberg commented Oct 28, 2022

Regarding the code coverage on Ord, what's the best approach there? I could stick a few tests in that do some compare exercises, but I didn't see anywhere else we do that. We could just drop Ord from Table, which seems to be unused, which would mean we don't need it on ColumnMap either. We only need Hashable for the maps. Thoughts?

@wolfgangwalther (Member)

Regarding the code coverage on Ord, what's the best approach there? I could stick a few tests in that do some compare exercises, but I didn't see anywhere else we do that. We could just drop Ord from Table, which seems to be unused, which would mean we don't need it on ColumnMap either. We only need Hashable for the maps. Thoughts?

I'd say: Drop everything we don't need. Let's keep it at the minimum, to have better coverage reports. Then add those instances later, once you need them.

@aljungberg (Contributor, Author)

Not entirely sure what this means:

[Screenshot of a codecov warning, 2022-10-28]

Clearly insCol is tested since it's used by the insertion tests?

@wolfgangwalther (Member)

Clearly insCol is tested since it's used by the insertion tests?

Actually, it's not. The record field accessor insCol is not used, because we only pattern match the MutatePlan, but do not use any of the record syntax.

You'll get reports for lines that were uncovered before your changes, too. So some of those codecov warnings can't really be avoided.

@steve-chavez (Member)

Switches from json_populate_recordset to json_to_recordset

Would it be possible to fallback to json_populate_recordset in case the columns of a relation are not present on the schema cache?

For example, this could happen if a new table is added and the schema cache is not refreshed. We remained operational before for these cases, what would happen now?

@aljungberg (Contributor, Author)

Switches from json_populate_recordset to json_to_recordset

Would it be possible to fallback to json_populate_recordset in case the columns of a relation are not present on the schema cache?

For example, this could happen if a new table is added and the schema cache is not refreshed. We remained operational before for these cases, what would happen now?

The new behaviour is that you get an error when trying to insert data into a table or a column we believe not to exist.

Personally I think that's actually an improvement over the old behaviour. Clear and immediate messaging for a situation that is more likely to be an error than not.

Like how often do you expect to make API calls to tables so fresh off the presses they're not even in the cache? The only time I can imagine is if you're live developing your schema. In that case can't you do NOTIFY pgrst, 'reload schema' at the end of your edit transaction as well?

What's the Venn diagram overlap between people live-editing their schema yet unwilling to NOTIFY? I imagine it's pretty small.

  • If we make it a stated goal to always allow interaction with unknown tables and columns, data representations will have confusing failure modes (e.g. they work some of the time).
  • Even if you don't use data representations, you can have confusing behaviour with regards to relations, embedding and so forth if you are in the habit of changing your schema without reloading the cache.
  • There's more complexity in our codebase having fallbacks.
  • To support json_populate_recordset as a fallback, the constraints on TypedField would have to be relaxed. The record started out relaxed and I made it more strict due to the feedback on the data rep PR. To be clear, I'm happy to do the work, but we might be pulling in two directions at once here.
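The reason the column definitions must come from the schema cache can be seen in the shape of the SQL each function needs: `json_populate_recordset` infers columns from the table's row type on the database side, while `json_to_recordset` requires an explicit column definition list. A minimal sketch of building that list (the function name `recordsetCall` and the string-based representation are illustrative, not PostgREST's actual query builder):

```haskell
-- Build the json_to_recordset fragment from schema-cache-typed columns.
-- Every column name *and* its type must be known up front, which is why
-- an unknown column has to be rejected before the query is generated.
import Data.List (intercalate)

recordsetCall :: [(String, String)] -> String
recordsetCall typedCols =
  "json_to_recordset($1) AS _ (" <> defs <> ")"
  where
    defs    = intercalate ", " [quote n <> " " <> t | (n, t) <- typedCols]
    quote n = "\"" <> n <> "\""

main :: IO ()
main = putStrLn (recordsetCall [("id", "integer"), ("title", "text")])
-- json_to_recordset($1) AS _ ("id" integer, "title" text)
```

By contrast, a `json_populate_recordset(null::articles, $1)` call needs no definition list at all, which is what made the old behaviour tolerant of a stale schema cache.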

All that said, happy to change it. I'm really excited about this feature and I think it'll be a great addition to PostgREST, so keen to get it merged. We're going to be using this probably in every single API on our end.

@wolfgangwalther (Member)

Personally I think that's actually an improvement over the old behaviour.

I absolutely agree. I think we should strive to become more strict. Hopefully this should lead people to learn about reloading the schema cache much earlier, too - and not only when they try to make sense of a rather complicated topic with embeddings. Making a basic request, seeing it fail, reloading the schema cache and then seeing it pass should give users (and us) much more confidence that they at least can successfully reload that cache.

@wolfgangwalther (Member)

I'm really excited about this feature and I think it'll be a great addition to PostgREST, so keen to get it merged. We're going to be using this probably in every single API on our end.

Yeah, same here - just too much on my plate to keep reviewing immediately. But I see use cases for it in all my projects, too :)

Resolved review comment on src/PostgREST/Error.hs
@steve-chavez (Member)

Cool, I agree that becoming more strict is good overall.

Not sure if stuffing schema cache info on the details key of the error message is the right approach. The schema cache is becoming more relevant with this change and maybe we should make it more visible.

How about we add a special header on the endpoints to get their schema cache entry (discussed before in #1421)?

```
GET /projects
Accept: application/vnd.pgrst.schema-cache
```

```json
{
  "columns": [".."],
  "relationships": [".."]
}
```

The above can be done on another PR but I'd consider it a must for a stable release after this feature is merged.

@aljungberg (Contributor, Author)

Not sure if stuffing schema cache info on the details key of the error message is the right approach. The schema cache is becoming more relevant with this change and maybe we should make it more visible.

How about we add a special header on the endpoints to get their schema cache entry (discussed before in #1421)?

```
GET /projects
Accept: application/vnd.pgrst.schema-cache
```

```json
{
  "columns": [".."],
  "relationships": [".."]
}
```

The above can be done on another PR but I'd consider it a must for a stable release after this feature is merged.

Having a per relation info API would be a nice addition.

As you rightfully point out in the referenced PR, the schema cache is already visible through the root endpoint so perhaps such an API would be more of a nice to have?

Maybe there's something we can do there to present more information about the available data representations for each column. Haven't fully thought that through yet, but if we add support for CSV and binary data representations down the line it might be nice to see which column custom formatters and/or parsers are available, and which ones will rely on the base type native PostgreSQL casting.

@wolfgangwalther (Member)

How about we add a special header on the endpoints to get their schema cache entry(discussed before on #1421).

How about making a request to the root endpoint with a different accept header, i.e. the one you mentioned, and then returning the schema cache instead of OpenAPI output? Filtering the root endpoint could then be done the same way for both.

@steve-chavez (Member)

How about making a request to the root endpoint with a different accept header, i.e. the one you mentioned.. and then returning schema cache instead of OpenAPI output?

Yeah, that could be another option. I've branched off the discussion to #1421 (comment) to not make this thread longer.

@wolfgangwalther (Member)

Could you please do the following to make it easier for me:

  • Squash all commits in this PR into one commit, with the proper semantic versioning prefix and a good commit message. I'd say this is a `feat:`, because we're changing the way we're validating the `?columns` for INSERT and UPDATE. Were we throwing the same kind of error messages before? In that case the feature is probably "throwing those errors without hitting the database"?
  • Add a changelog entry for that change.
  • Remove all the changes that are only required in the next PR as mentioned above. Make this self-contained.

I will then have another look. Thanks!

@aljungberg (Contributor, Author)

Are we throwing the same kind of error messages before? In that case the feature is probably "throwing those errors without hitting the database"?

The essence of the error is the same but the error message now has a PostgREST error code and indeed happens without hitting the database.

```
# Before
{"code":"42703","details":null,"hint":null,"message":"column \"helicopter\" of relation \"articles\" does not exist"}

# After
{"code":"PGRST118","details":null,"hint":"If a new column was created in the database with this name, try reloading the schema cache.","message":"Could not find 'helicopter' in the target table"}
```
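The pre-flight check that produces this error can be sketched like so. Names here (`checkColumns`, `ColumnNotFound`) are hypothetical, chosen only to illustrate validating the requested columns against the schema cache before any SQL is generated:

```haskell
-- Validate requested columns against the known (schema-cache) columns.
-- An unknown column yields an application-level error immediately,
-- instead of a PostgreSQL 42703 error after a round-trip to the database.
import Data.List (find)

newtype ApiError = ColumnNotFound String deriving (Eq, Show)

checkColumns :: [String] -> [String] -> Either ApiError ()
checkColumns known requested =
  case find (`notElem` known) requested of
    Just missing -> Left (ColumnNotFound missing)  -- maps to PGRST118-style error
    Nothing      -> Right ()

main :: IO ()
main = do
  print (checkColumns ["id", "title"] ["title"])
  print (checkColumns ["id", "title"] ["helicopter"])
```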

@aljungberg force-pushed the json_to_recordset branch 2 times, most recently from c3c3981 to 39e5120 on November 8, 2022
Resolved review comments on src/PostgREST/Plan.hs
@wolfgangwalther (Member) left a comment
I have most of my suggested changes locally already. If you want to save some effort, you can check the "allow edits of maintainers" box (not sure what the name is exactly). Then I can push those to your branch.

Resolved review comments on: src/PostgREST/Plan.hs, src/PostgREST/Plan/Types.hs, src/PostgREST/Query/QueryBuilder.hs
@aljungberg (Contributor, Author)

I believe all feedback up to this point has been addressed.

This returns an error when trying to update or insert into invalid columns, without hitting the database. This change also switches these operations from `json_populate_recordset` to `json_to_recordset`, which should make no functional difference except allowing future flexibility.

- New: store columns in a map, grab true column types.
- New: `json_populate_recordset` -> `json_to_recordset`. This lays the groundwork for data representations, but should make no functional difference at this stage except that we now have an explicit error for trying to mutate tables or columns that don't exist (according to the schema cache).
- New: test missing column errors.
- Drop unused `Ord` to increase test coverage. The `ColumnMap` `Ord` was only there because `Table` derived `Ord`. It doesn't seem like that's used anywhere for `Table` to begin with, so both were dropped. This had also led to unnecessary type conversions.
- It was only used to put `TypedField` into sets, which we no longer do after the previous commit.
@aljungberg (Contributor, Author)

This PR is now rebased on the latest main.

@steve-chavez (Member) left a comment
LGTM! The error message is now as clear as the one from PostgreSQL!

@steve-chavez (Member)

@wolfgangwalther Do you have more input on this PR? I think it's looking good to merge.

@wolfgangwalther (Member)

Don't have the time right now to dive into it, sorry. It's on my list - as well as a lot of other things ;). Feel free to merge, if you think it's good.

@steve-chavez (Member)

It's on my list - as well as a lot of other things ;)

Oh, no problem!

Feel free to merge, if you think it's good.

I hesitate because I'm a bit lost on #2523 (which IIUC is the main goal of this PR), so I'll let you review and merge this one.

@aljungberg (Contributor, Author)

I hesitate because I'm a bit lost on #2523 (which IIUC is the main goal of this PR), so I'll let you review and merge this one.

Anything I can do to help explain or detail it for you? If you'd like I could discuss it over a video call.

@steve-chavez merged commit 43ad6d6 into PostgREST:main on Jan 8, 2023
@steve-chavez (Member)

@aljungberg Sorry for the delay! Let's move forward with this one - we'll also need it for #2594, so I've just merged it.

(I'll start reviewing #2523 and I'll let you know any doubts I have there)
