feat: allow spread operators in to-many relationships #3640
My approach right now is to generate this query for a to-many request:
curl 'localhost:3000/clients?select=name,...projects(name,id)'
SELECT "test"."clients"."name",
"clients_projects_1"."name",
"clients_projects_1"."id"
FROM "test"."clients"
LEFT JOIN LATERAL (
SELECT json_agg("projects_1"."name") AS "name",
json_agg("projects_1"."id") AS "id"
FROM "test"."projects" AS "projects_1"
WHERE "projects_1"."client_id" = "test"."clients"."id"
) AS "clients_projects_1" ON TRUE
Right now this gives the expected result. But aggregates are not working correctly, because they are designed to be selected in the top query with a
SELECT "test"."clients"."name",
json_agg("clients_projects_1"."name") AS "name",
json_agg("clients_projects_1"."id") AS "id"
FROM "test"."clients"
LEFT JOIN LATERAL (
SELECT "projects_1"."name",
"projects_1"."id"
FROM "test"."projects" AS "projects_1"
WHERE "projects_1"."client_id" = "test"."clients"."id"
) AS "clients_projects_1" ON TRUE
GROUP BY "test"."clients"."name"
Not sure which one is better/easier right now... I'm thinking the latter.
Having the json_agg in the outer query would make the query cleaner, imho.
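To make the two candidate query shapes easier to compare, here is a minimal Python sketch (not PostgREST code) that emulates both on in-memory rows: aggregating inside the per-client subquery versus joining first and aggregating in the outer query with a GROUP BY. The sample rows, the function names and the `project_name`/`project_id` keys are made up for illustration; every client in the sample has at least one project, so the LEFT JOIN subtleties do not show up here.

```python
from collections import defaultdict

# Hypothetical sample data standing in for test.clients and test.projects.
clients = [{"id": 1, "name": "Microsoft"}, {"id": 2, "name": "Apple"}]
projects = [
    {"id": 1, "name": "Windows 7", "client_id": 1},
    {"id": 2, "name": "Windows 10", "client_id": 1},
    {"id": 3, "name": "IOS", "client_id": 2},
]


def agg_in_lateral(clients, projects):
    """Variant 1: aggregate inside the per-client (lateral) subquery."""
    out = []
    for c in clients:
        rows = [p for p in projects if p["client_id"] == c["id"]]
        out.append({
            "name": c["name"],
            "project_name": [p["name"] for p in rows],
            "project_id": [p["id"] for p in rows],
        })
    return out


def agg_in_outer(clients, projects):
    """Variant 2: join first, then group by the client name and aggregate."""
    joined = [
        (c["name"], p["name"], p["id"])
        for c in clients
        for p in projects
        if p["client_id"] == c["id"]
    ]
    groups = defaultdict(lambda: {"project_name": [], "project_id": []})
    for client_name, project_name, project_id in joined:
        groups[client_name]["project_name"].append(project_name)
        groups[client_name]["project_id"].append(project_id)
    return [{"name": name, **cols} for name, cols in groups.items()]
```

For a single, non-nested spread both variants produce the same arrays on this data, so the choice is mostly about how aggregates and GROUP BY compose in the generated SQL.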
Some caveats I encountered:
Repeated values and order
Do we want to keep repeated values in the results? For example (not the best use case, just to illustrate):
curl 'localhost:3000/project?select=name,...tasks(tasks:name,due_dates:due_date)'
[
{
"name": "project 1",
"tasks": ["task 1", "task 2", "task 3", "task 4"],
"due_dates": [null, "2024-08-08", "2024-08-08", null]
}
]
Here we're repeating
Nested To-Many Spreads
I have a doubt on what to expect with nested to-many spreads. For example, on a non-nested to-many spread like this one:
curl 'localhost:3000/entities?select=name,...child_entities(children:name)'
We would expect:
[
{"name": "entity 1", "children": ["child entity 1", "child entity 2"]},
{"name": "entity 2", "children": ["child entity 3"]},
"..."
]
But what if we nest another to-many spread embedding with a new column to aggregate:
curl 'localhost:3000/entities?select=name,...child_entities(children:name,...grandchild_entities(grandchildren:name))'
I understand that we're hoisting all the aggregates to the top level, and not grouping by the intermediate columns (
[
{"name": "entity 1", "children": ["child entity 1", "child entity 2"], "grandchildren": ["grandchild entity 1", "grandchild entity 2", "..."]},
{"name": "entity 2", "children": ["child entity 3"], "grandchildren": []},
"..."
]
This cannot be achieved by a simple
SELECT "api"."entities"."name",
json_agg(DISTINCT "entities_child_entities_1"."children") AS "children",
json_agg(DISTINCT "entities_child_entities_1"."grandchildren") AS "grandchildren"
FROM "api"."entities"
LEFT JOIN LATERAL (
SELECT "child_entities_1"."name" AS "children",
"child_entities_grandchild_entities_2"."grandchildren" AS "grandchildren"
FROM "api"."child_entities" AS "child_entities_1"
LEFT JOIN LATERAL (
SELECT "grandchild_entities_2"."name" AS "grandchildren"
FROM "api"."grandchild_entities" AS "grandchild_entities_2"
WHERE "grandchild_entities_2"."parent_id" = "child_entities_1"."id"
) AS "child_entities_grandchild_entities_2" ON TRUE
WHERE "child_entities_1"."parent_id" = "api"."entities"."id"
) AS "entities_child_entities_1" ON TRUE
GROUP BY "api"."entities"."name";
If there is no sensible interpretation of the query, another option is to prohibit these intermediate columns altogether (aggregates like sum, avg, etc. should still be possible).
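The repetition problem with nested to-many spreads can be reproduced outside the database. The following Python sketch is only an illustration under assumed data (one entity, two children, grandchildren only under the first child); `left_join` and `dedupe` are hypothetical helpers, and `dedupe` is an order-preserving stand-in for an aggregate with DISTINCT, which in PostgreSQL would not promise this order.

```python
# Hypothetical rows: one entity with two children; only the first child has grandchildren.
children = [{"id": 1, "name": "child entity 1"}, {"id": 2, "name": "child entity 2"}]
grandchildren = [
    {"name": "grandchild entity 1", "parent_id": 1},
    {"name": "grandchild entity 2", "parent_id": 1},
]


def left_join(children, grandchildren):
    """Emulate child LEFT JOIN grandchild: one row per (child, grandchild) pair."""
    rows = []
    for c in children:
        matches = [g for g in grandchildren if g["parent_id"] == c["id"]]
        if not matches:
            rows.append((c["name"], None))  # unmatched child keeps a NULL partner
        for g in matches:
            rows.append((c["name"], g["name"]))
    return rows


def dedupe(values):
    """Order-preserving stand-in for an aggregate with DISTINCT."""
    return list(dict.fromkeys(values))


rows = left_join(children, grandchildren)
flat_children = [c for c, _ in rows]       # the child name repeats once per grandchild
flat_grandchildren = [g for _, g in rows]  # contains a NULL for the childless child
```

Aggregating the joined rows directly repeats "child entity 1", and de-duplicating hides the repetition but keeps the NULL and loses any positional link between the two arrays.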
OK, this is what I got implemented so far. For example, using the tables in our spec test:
curl 'localhost:3000/factories?select=name,...processes(processes:name,...supervisors(supervisors:name))'
[
{
"name": "Factory C",
"processes": ["Process C1", "Process C2", "Process XX"],
"supervisors": ["Peter", "Peter", null]
},
{
"name": "Factory B",
"processes": ["Process B1", "Process B1", "Process B2", "Process B2"],
"supervisors": ["Peter", "Sarah", "Mary", "John"]
},
{
"name": "Factory A",
"processes": ["Process A1", "Process A2"],
"supervisors": ["Mary", "John"]
},
{
"name": "Factory D",
"processes": [null],
"supervisors": [null]
}
]
[
{
"name":"Factory C",
"processes":["Process C1", "Process C2", "Process XX"],
"supervisors":[{"name": "Peter"}, {"name": "Peter"}, null]},
{
"name":"Factory B",
"processes":["Process B1", "Process B1", "Process B2", "Process B2"],
"supervisors":[{"name": "Peter"}, {"name": "Sarah"}, {"name": "Mary"}, {"name": "John"}]},
{
"name":"Factory A",
"processes":["Process A1", "Process A2"],
"supervisors":[{"name": "Mary"}, {"name": "John"}]},
{
"name":"Factory D",
"processes":[null],
"supervisors":[null]
}
]
As I mentioned in previous comments, some values will repeat, since we're grouping by the factory
There's a problem when the embeddings have no values, as seen in the
This feature should be ready for review now.
I'm leaving the [...] for DISTINCT and NOT NULL for another PR to keep it cleaner.
Edit: Nvm. I figured that it should be OK to include that feature here too, although in different commits...
Here are some comments on the changes done:
This is awesome guys, I was going to ask about aggregations, but it just works! Sweet!
Terrific work with the test cases, very extensive. My head explodes, though.
Because we just discussed commit message / prefixes in another PR - what's your opinion on docs/feat commits? Should they be split like in this PR or do they belong together, i.e. was the idea to squash this?
I think they should go into the same feat: commit. A feature without docs is not a feature.
It is expected to get ``null`` values in the resulting array.
You can exclude them with :ref:`stripped_nulls`.
I haven't looked at the code, but I assume this is implemented with a special case for the aggregation.
I don't think it's a good idea. This will lead to inconsistent results, because: Assume you have a regular aggregation, some spread embedding aggregation, a regular array and a json array - all with some null values in them. Some of them will be stripped, but others won't.
json(b)_strip_nulls only strips nulls in objects for a reason, I don't think we should change that.
We had the discussion about it here: #3640 (comment)
This will lead to inconsistent results [...] Some of them will be stripped, but others won't.
From the convo above, I was also on the fence about it, but I figured that adding "this only works on to-many spreads" to the docs would clarify some things (I forgot to do that btw). Still, I agree, I think the inconsistency you mention is enough to look for an alternative (maybe another parameter in the header?).
Still, I agree, I think the inconsistency you mention is enough to look for an alternative (maybe another parameter in the header?).
I think filtering NULLs should be very explicit.
I don't remember seeing any tests with filters in the tests (but I didn't look again now).
Is something like this supposed to work?
get "/factories?select=factory:name,...processes(process:name)&processes.process=not.is.null"
And also any other filter on the embedding?
And also any other filter on the embedding?
Yes, filters on the spread embed resource work, there's a couple of tests with them.
Is something like this supposed to work?
Yes, it will work on a single embed resource and won't include the null values. But deeply nested resources could include nulls inside the array when the value is null or when no embedded row is returned. This is a problem with the current implementation.
Hmm... with the fixed implementation (non-flattened arrays) this may not be a problem anymore, since it should return empty arrays instead of null... but I'm not entirely sure, I need to check the new design of the resulting queries to verify.
Hmm... with the fixed implementation (non-flattened arrays) this may not be a problem anymore, since it should return empty arrays instead of null...
Yes, AFAICT this is correct. Since the array_agg is done in the same sub-query selection, it would return null on a failed JOIN. But it returns [null] when the JOIN is successful and the value is null (which is what we want). So just the explicit filter should be needed here, not the header.
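The difference between "no row joined" and "a row joined whose value is null" can be sketched in a few lines of Python. This is only an emulation of the SQL semantics described above, with hypothetical helper names: `array_agg` mimics the aggregate returning NULL for zero input rows, and `spread_column` mimics wrapping it in a COALESCE with an empty array.

```python
def array_agg(values):
    """Mimic SQL array_agg: NULL (None) for zero input rows, else a list that keeps NULLs."""
    values = list(values)
    return values if values else None


def spread_column(rows, column):
    """Aggregate one column of the embedded rows, falling back to [] when no row matched."""
    agg = array_agg(row[column] for row in rows)
    return [] if agg is None else agg


no_rows = spread_column([], "name")                  # failed JOIN
null_value = spread_column([{"name": None}], "name")  # JOIN matched, value is null
```

With this shape an empty array always means "nothing was embedded", while a null inside the array is a real value that only an explicit filter such as not.is.null would remove.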
Makes sense, yes. I'll squash them to avoid problems when merging.
}
]

The order of the values inside the resulting array is unspecified.
Imho, this is unsatisfactory, it would basically make the feature unusable for me. This is because we snapshot test all our api responses and if we can't generate predictable output, then we can't use the feature. So ordering is very important.
Would something like this be hard to do?
get "/factories?select=factory:name,...processes(name)&processes.order=name"
(I hope I got the syntax right, this should be the regular "sort the embedded response" syntax, right?)
For the spread, this could then move the ORDER BY into the aggregate function call. This would only allow to specify a single ORDER BY for multiple spread aggregates - which I consider a good thing, because this would ensure the array items still match between arrays.
Would something like this be hard to do?
For the spread, this could then move the ORDER BY into the aggregate function call. This would only allow to specify a single ORDER BY for multiple spread aggregates - which I consider a good thing, because this would ensure the array items still match between arrays.
I don't think it'd be hard to do. Yes, the syntax is OK, internally it would need to treat every order done inside a to_many spread as an array_agg order for every column, instead of a subquery one.
Right now, the order as you mentioned in your example works: it orders the subquery and the aggregated columns will be sorted. But it's not guaranteed to behave the same way for other more complex cases, as mentioned in this convo. So yes, I'll implement the order by in the aggregate here.
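Why a single ORDER BY shared by all spread aggregates keeps the arrays aligned can be shown with a small Python sketch. It is an emulation only, with a hypothetical `spread_ordered` helper and sample film rows; it assumes the order column has no NULLs, since Python cannot compare `None` with numbers.

```python
def spread_ordered(rows, columns, order_by, descending=False):
    """Sort the embedded rows once, then build one array per selected column."""
    ordered = sorted(rows, key=lambda r: r[order_by], reverse=descending)
    return {col: [r[col] for r in ordered] for col in columns}


# Hypothetical embedded rows for one director.
films = [
    {"title": "Pulp Fiction", "year": 1994},
    {"title": "Reservoir Dogs", "year": 1992},
]
result = spread_ordered(films, ["title", "year"], order_by="year")
```

Because every array is built from the same sorted row list, the element at position i of each array always comes from the same embedded row.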
Some advances I'm making that are ready for reviewing:
- Non-flattened arrays as specified here: feat: allow spread operators in to-many relationships #3640 (comment)
- ORDER BY inside the aggregate (with some caveats mentioned below)
The aggregates on the whole relationship are not yet implemented, e.g. ...to_many(count()).sum().
.. code-block:: json

  [
    {
      "first_name": "Quentin",
      "film_titles": [
        "Reservoir Dogs",
        "Pulp Fiction"
      ],
      "film_years": [
        1992,
        1994
      ]
    }
  ]

Note that the field must be selected in the spread relationship for the order to work.
Since the json_agg(col) aggregate is done outside of the subquery selection (to avoid cases like json_agg(sum(col))), we cannot order the json_agg by columns that are not selected in that subquery. Here's the generated query for this example:
Query
WITH pgrst_source AS
-- Subquery for the current example
(SELECT "public"."directors"."first_name",
"directors_films_1"."film_titles",
"directors_films_1"."film_years"
FROM "public"."directors"
LEFT JOIN LATERAL
(SELECT json_agg("directors_films_1")::jsonb AS "directors_films_1",
COALESCE(
json_agg("directors_films_1"."film_titles" ORDER BY "directors_films_1"."film_years")
,'[]'
)::jsonb AS "film_titles",
COALESCE(
json_agg("directors_films_1"."film_years" ORDER BY "directors_films_1"."film_years")
,'[]'
)::jsonb AS "film_years"
FROM
(SELECT "films_1"."title" AS "film_titles",
"films_1"."year" AS "film_years"
FROM "public"."films" AS "films_1"
WHERE "films_1"."director_id" = "public"."directors"."id") AS "directors_films_1") AS "directors_films_1" ON TRUE
WHERE "public"."directors"."first_name" LIKE $1)
--
SELECT NULL::bigint AS total_result_set,
pg_catalog.count(_postgrest_t) AS page_total,
coalesce(json_agg(_postgrest_t), '[]') AS body,
nullif(current_setting('response.headers', TRUE), '') AS response_headers,
nullif(current_setting('response.status', TRUE), '') AS response_status,
'' AS response_inserted
FROM
(SELECT *
FROM pgrst_source) _postgrest_t
Maybe selecting all the columns in the non-aggregated subquery could be an alternative? (computed columns still won't work, I think).
Just noticed there's also an issue when using aliases in the columns. In the example, order=film_years (the alias) works, but order=year does not. This needs to be fixed.
Any fundamental reason this can't become SELECT jsonb_agg(... ORDER BY ...) FROM public.films WHERE .., i.e. without the subquery in FROM?
Edit: Ah, this, I think:
(to avoid cases like json_agg(col.sum()))
Not sure whether I understand that part, yet.
(to avoid cases like json_agg(col.sum()))
Not sure whether I understand that part, yet.
No, I don't. I'm especially confused by the mixed syntax of SQL and PostgREST-request here. Why exactly did you decide to use the subquery?
Edit: Ah, this, I think:
Yes. For example, ...films(years.max()), would try to do this:
SELECT json_agg(max(years)) FROM public.films WHERE ...
Which returns ERROR: calls to aggregate functions cannot be nested.
Edit: Fix syntax 🤦
For example, ...films(max(years)), would try to do this:
This time the syntax was mixed again, but the other way around :D
So, I guess you mean: ...films(years.max()).
Ok, I see that now, yes. It makes sense to treat the spread as another query layer, so I guess the requirement to have the columns selected for ordering is OK.
I'm not liking the drawbacks of having to include the order inside the select.
- Having the order in films(name)&order=year works, but just by adding ... (the spread operator), like ...films(name)&order=year, it breaks.
- We lose top-level ordering, like: films(name,technical_specs(runtime))&films.order=technical_specs(runtime)
- In general, having an order exception for spreads is bad UX.
The solution I'm working on right now is to include the order in the subquery, and then use it for the ordering, e.g. ...films(title)&order=year:
select json_agg(x.title order by year)
from (
select title,
year
from films
) x;
It needs to take into consideration cases where the order column could already be selected, and be careful not to collide with existing aliases, so it's not so straightforward there. If there's any drawback to this approach, let me know.
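The alias-collision concern can be illustrated with a short Python sketch. It is not the PostgREST implementation; the `pgrst_order` base alias and both helper names are hypothetical. The inner step plays the role of the subquery that carries the order column along, and the outer step plays the role of the aggregate that only emits the selected columns.

```python
def pick_hidden_alias(selected_aliases, base="pgrst_order"):
    """Return an alias for the extra order column that no selected column uses."""
    alias, n = base, 0
    while alias in selected_aliases:
        n += 1
        alias = f"{base}_{n}"
    return alias


def spread_with_hidden_order(rows, select, order_col):
    """select maps output alias -> source column; order_col need not be selected."""
    hidden = pick_hidden_alias(set(select))
    # Inner "subquery": selected columns plus the order column under a safe alias.
    inner = [
        {**{alias: r[src] for alias, src in select.items()}, hidden: r[order_col]}
        for r in rows
    ]
    inner.sort(key=lambda r: r[hidden])
    # Outer aggregate: only the selected aliases make it into the result.
    return {alias: [r[alias] for r in inner] for alias in select}


# Hypothetical embedded rows.
films = [
    {"title": "Pulp Fiction", "year": 1994},
    {"title": "Reservoir Dogs", "year": 1992},
]
titles_by_year = spread_with_hidden_order(films, {"title": "title"}, "year")
```

The order column never appears in the output, and a user alias that happens to equal the hidden name simply pushes the hidden column to the next free suffix.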
The solution I'm working right now is to include the order in the subquery, and then use it for the ordering
This is now complete, I added it in a separate commit for better reviewing.
src/PostgREST/Query/QueryBuilder.hs
Spread{rsSpreadSel, rsAggAlias} ->
  if relSpread == Just ToManySpread then
    let
      selection = selectJsonArray <> (if null rsSpreadSel then mempty else ", ") <> intercalateSnippet ", " (pgFmtSpreadSelectItem True rsAggAlias order <$> rsSpreadSel)
Still need to SELECT json_agg(<subquery_alias>) AS "<subquery_alias>" to use it for not.is.null or !inner conditions. How it's added can be seen in the previous comment's example.
Does it make sense to leave this out for this PR? This seems already complex enough :)
I wanted to include it since it would solve what's mentioned in the original issue #3041 (under Spread on Count). But yes, I would consider it a separate feature; we could leave it for another PR and not let this one close the issue completely.
I looked at the issue again and I think we need to take the following into account:
and
We mostly discussed the second case, in which I argued that I expect an array of counts as a return, matching the array of supervisors. But we didn't really discuss the first case, which seems to be the case in the issue. I think the first case should not return a single item array, but indeed the overall count. The basic idea would be: We use array aggregation for x2m embeddings. But once we aggregate inside this embedding without any GROUP BY columns, then we don't have an x2m embedding anymore, but an x2o. The "relation" we are embedding is guaranteed to return only one row. Taking this into account I wonder whether we actually need the more complex syntax
Yes, that's a nice approach, I agree. There wouldn't be a need for the more complex syntax anymore. I haven't checked yet but there may be some caveats with nested spreads and this implementation. I'll let you know if I find something along the way.
Thinking about the The motivation comes from the comment on #3041 (comment). But I think the main use case is just forming an array of one column and running aggregates on them, for this we wouldn't need to worry about ORDER. Perhaps we could leave multiple columns for later?
The ordering is now complete and ready for review. If everything's OK there would be no need to leave it for later.
I only looked at the docs and the AggregateFunctionsSpec.hs file. Still have a few questions, let's discuss those first, before I continue with the other test file.
Maybe not everything we discussed is implemented or maybe I missed something else, not sure.
it "supports aggregates inside nested to-one spread relationships" $ do
  get "/supervisors?select=name,...processes(...process_costs(cost.sum()))&order=name" `shouldRespondWith`
    [json|[
      {"name":"Jane","sum":[null]},
      {"name":"John","sum":[270.00]},
      {"name":"Mary","sum":[220.00]},
      {"name":"Peter","sum":[290.00]},
      {"name":"Sarah","sum":[180.00]}]|]
    { matchHeaders = [matchContentTypeJson] }
The [null] for Jane here appears odd to me. Jane doesn't supervise any processes, right? This means there are no rows joined - I would expect null (without array!) here, I think.
Or maybe an empty array.
But [null] indicates to me: "Jane has processes to supervise, but none of them have any cost associated.".
Yes, I agree that it should show an empty array if there are no processes. But this seems to be a caveat with to-one spreads.
For example, a normal embedding returns an empty array as expected:
curl 'localhost:3000/supervisors?select=name,processes(process_costs(cost.sum()))&name=eq.Jane'
[{"name":"Jane","processes":[]}]
But the to-one spread does not:
curl 'localhost:3000/supervisors?select=name,processes(...process_costs(cost.sum()))&name=eq.Jane'
[{"name":"Jane","processes":[{"sum": null}]}]
The to-many spread is based on the data returned before, so it kinda makes sense for it to return [null]:
curl 'localhost:3000/supervisors?select=name,...processes(...process_costs(cost.sum()))&name=eq.Jane'
[{"name":"Jane","sum":[null]}]
If no to-one spread is done, then the to-many spread returns an empty array as expected:
curl 'localhost:3000/supervisors?select=name,...processes(process_costs(cost.sum()))&name=eq.Jane'
[{"name":"Jane","process_costs":[]}]
I think the issue is with to-one spreads, it needs to be fixed at that level (not in this PR). For example, if we add a column to GROUP BY in the to-one spread, then we get the expected empty array:
curl 'localhost:3000/supervisors?select=name,processes(...process_costs(process_id,cost.sum()))&name=eq.Jane'
[{"name":"Jane","processes":[]}]
get "/operators?select=name,...processes(id,...factories(...factory_buildings(size.sum())))&order=name" `shouldRespondWith`
  [json|[
    {"name":"Alfred","id":[6, 7],"sum":[[240], [240]]},
    {"name":"Anne","id":[1, 2, 4],"sum":[[350], [350], [170]]},
    {"name":"Jeff","id":[2, 3, 4, 6],"sum":[[350], [170], [170], [240]]},
    {"name":"Liz","id":[],"sum":[]},
    {"name":"Louis","id":[1, 2],"sum":[[350], [350]]}]|]
  { matchHeaders = [matchContentTypeJson] }
The empty arrays for Liz seem to be correct to me, because Liz does not operate any processes.
get "/factories?select=factory:name,...processes(processes_count:count())&order=name" `shouldRespondWith`
  [json|[
    {"factory":"Factory A","processes_count":[2]},
    {"factory":"Factory B","processes_count":[2]},
    {"factory":"Factory C","processes_count":[4]},
    {"factory":"Factory D","processes_count":[0]}]|]
  { matchHeaders = [matchContentTypeJson] }
it "works alongside other columns in the embedded resource" $ do
  get "/factories?select=name,...processes(category_id,count())&order=name" `shouldRespondWith`
    [json|[
      {"name":"Factory A","category_id":[1, 2],"count":[1, 1]},
      {"name":"Factory B","category_id":[1],"count":[2]},
      {"name":"Factory C","category_id":[2],"count":[4]},
      {"name":"Factory D","category_id":[],"count":[]}]|]
    { matchHeaders = [matchContentTypeJson] }
Note the asymmetry here between the counts. I think the first example should also return [] instead of [0], even if possibly slightly unintuitive.
The general idea: I can't spread something that doesn't exist. Since there is no process for factory D, I can't spread anything. The "count" conceptually never happened, so I can't return a 0 for it.
Interesting, this may be related with the issue mentioned above, the aggregates may be the actual problem... probably.
Again, using the example of the non-spread embed:
curl 'localhost:3000/factories?select=factory:name,processes(processes_count:count())&id=eq.4'
[{"factory":"Factory D","processes":[{"processes_count": 0}]}]
The to-many spread takes the above response into consideration, so it returns the [0]. Now, if we add a column to GROUP BY, then we get an expected empty array.
curl 'localhost:3000/factories?select=factory:name,processes(category_id,processes_count:count())&id=eq.4'
[{"factory":"Factory D","processes":[]}]
So, the inconsistency seems to be at the aggregate level and it should be fixed there.
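The [0] versus [] asymmetry follows from how SQL aggregates behave with and without GROUP BY, which this small Python emulation tries to show. `count_rows` is a hypothetical helper, not PostgREST code: without a grouping column it always yields exactly one row, with one it yields one row per group and therefore none for empty input.

```python
from collections import defaultdict


def count_rows(rows, group_by=None):
    """Emulate SELECT [group_by,] count(*) FROM rows [GROUP BY group_by]."""
    if group_by is None:
        # No GROUP BY: exactly one result row, even for empty input.
        return [{"count": len(rows)}]
    counts = defaultdict(int)
    for row in rows:
        counts[row[group_by]] += 1
    # With GROUP BY: one row per group, so empty input yields no rows.
    return [{group_by: key, "count": n} for key, n in counts.items()]
```

For a factory with no processes, `count_rows([])` gives one row with a count of 0, while `count_rows([], "category_id")` gives no rows at all, which is what the embedded responses above reflect.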
it "works by itself in the embedded resource" $ do
  get "/supervisors?select=name,...processes(count())&order=name" `shouldRespondWith`
    [json|[
      {"name":"Jane","count":[0]},
      {"name":"John","count":[2]},
      {"name":"Mary","count":[2]},
      {"name":"Peter","count":[3]},
      {"name":"Sarah","count":[1]}]|]
    { matchHeaders = [matchContentTypeJson] }
Could have probably commented it on any of the aggregates before, too - shouldn't this return a count without array?
(Maybe that even resolves the concerns I had in the comments before about NULL and 0.)
The idea was "if an embedding only has aggregates in it, it's not considered to-many anymore, but to-one", right? So this should just return count: 0, count: 2 etc.
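The rule "only aggregates means to-one" could be sketched as follows. This Python function is an assumption-laden illustration, not the PostgREST logic: `spread_embedding` and its parameters are hypothetical, and it takes the already-computed rows of the embedded subquery as input.

```python
def spread_embedding(rows, columns, group_by_columns):
    """rows: result rows of the embedded subquery; columns: output columns in order."""
    if group_by_columns:
        # To-many: any number of rows, so every column becomes an array.
        return {col: [r[col] for r in rows] for col in columns}
    # Only aggregates: the subquery returns exactly one row, so spread it like a to-one.
    (row,) = rows
    return {col: row[col] for col in columns}


only_count = spread_embedding([{"count": 0}], ["count"], [])
with_group = spread_embedding(
    [{"category_id": 1, "count": 1}, {"category_id": 2, "count": 1}],
    ["category_id", "count"],
    ["category_id"],
)
```

Under this rule a bare count() spreads to a plain number, while adding a GROUP BY column brings back one array per column, including empty arrays when nothing is embedded.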
Yes, WIP.
Done, the new tests are passing now.
it "prevents the use of aggregates on to-many spread embeds" $
  get "/factories?select=...processes(id.count())" `shouldRespondWith`
    [json|{
      "hint":null,
      "details":null,
      "code":"PGRST123",
      "message":"Use of aggregate functions is not allowed"
    }|]
    { matchStatus = 400
    , matchHeaders = [matchContentTypeJson] }
Further up we have examples of this:
get "/factories?select=name,...processes(count())&order=name"
I don't understand what the difference is, so that the case here is forbidden.
Ah, this is for when db-aggregates-enabled is not set to true. The to-one spread used to bypass this restriction before it was fixed, I added the to-many test for completion.
Ah, I guess I missed the fact that you only changed the ordering stuff, not anything else :D
Yup 😄, the "do not wrap into arrays if there's no GROUP BY columns" is still a WIP.
…pped in an array (treated as a to-one spread)
get "/supervisors?select=supervisor:id,...processes(processes:name,...operators(operators_count:count()))&order=id" `shouldRespondWith`
  [json|[
    {"supervisor":1,"processes":["Process A1", "Process B2"],"operators_count":[2, 2]},
    {"supervisor":2,"processes":["Process A2", "Process B2"],"operators_count":[3, 2]},
    {"supervisor":3,"processes":["Process B1", "Process C1", "Process C2"],"operators_count":[1, 0, 2]},
    {"supervisor":4,"processes":["Process B1"],"operators_count":[1]},
    {"supervisor":5,"processes":[],"operators_count":[]}]|]
  { matchHeaders = [matchContentTypeJson] }
get "/supervisors?select=supervisor:id,...processes(processes:name,...operators(operators_count:count()))&processes.order=name.desc&order=id" `shouldRespondWith`
  [json|[
    {"supervisor":1,"processes":["Process B2", "Process A1"],"operators_count":[2, 2]},
    {"supervisor":2,"processes":["Process B2", "Process A2"],"operators_count":[2, 3]},
    {"supervisor":3,"processes":["Process C2", "Process C1", "Process B1"],"operators_count":[2, 0, 1]},
    {"supervisor":4,"processes":["Process B1"],"operators_count":[1]},
    {"supervisor":5,"processes":[],"operators_count":[]}]|]
  { matchHeaders = [matchContentTypeJson] }
This should be expected. Processes and operators have a to-many spread. Since there's only count() in the operators side then it doesn't wrap it in an extra array. In other words, I think it's expected that operators_count is [2,2] instead of [[2],[2]]. If the processes weren't spread, then the results can be seen more clearly:
curl 'localhost:3000/supervisors?select=supervisor:id,processes(processes:name,...operators(operators_count:count()))&order=id'
[{"supervisor":1,"processes":[{"processes": "Process A1", "operators_count": 2}, {"processes": "Process B2", "operators_count": 2}]},
{"supervisor":2,"processes":[{"processes": "Process A2", "operators_count": 3}, {"processes": "Process B2", "operators_count": 2}]},
{"supervisor":3,"processes":[{"processes": "Process B1", "operators_count": 1}, {"processes": "Process C1", "operators_count": 0}, {"processes": "Process C2", "operators_count": 2}]},
{"supervisor":4,"processes":[{"processes": "Process B1", "operators_count": 1}]},
{"supervisor":5,"processes":[]}]
Also added an order test to check if it was working correctly.
NVM, it should be complete now.
Codecov keeps complaining but I don't think I can appease it any further. Almost all the % is due to the new types.
Closes #3041