-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Customize URL structure - Confusion about /rpc/ prefix, move stored procedures to root level #1086
Comments
That's right, and in fact it's the primary motivation. A GET request is considered safe by the HTTP spec, meaning it should not modify server state. Call the RPC via POST if you want to be able to update the database. (The only time that this is kind of annoying is when the RPC wants to simply log the GET request internally in the db. The read-only restriction prevents it from taking that harmless action. As a workaround you can use nginx on top of the API server to do the logging.) |
I understand that it fits and conforms to the HTTP spec defining a GET as safe and idempotent. do we apply the same reasoning on an RPC call? I see a procedure differently than a view/table. A function is an isolated closed-system which can do various things, in fact I am using procedures to do things I am not able to do with only tables and views. and also: a (transactional) procedure in psql provides its own isolation levels and security mechanisms. so it's maybe a question of how we want to see and treat a procedure. I told you my way of how I see it. an example regarding the limitations: the |
I don't see why not, since some RPCs provide information, and others modify things. It's useful to preserve HTTP semantics there as well as anywhere else.
Hm, I can see how that's troublesome. Sucks that the browser user-agent has always been too underpowered to support links with arbitrary verbs. In your email you could link users to a page with javascript inside that makes the API call to the RPC.
...and new problems to appear for others. If we were to remove the safety guarantees from GET I feel like it lowers the standards of the project, allowing people to build APIs that violate valuable conventions. Maybe the best way to think of it is that PostgREST creates APIs for programs to use, and that it's not always appropriate for direct consumption by a user. Sometimes you have to build a UI layer on top of the API. |
maybe I should point out more clearly how I see it: the HTTP spec conformity is a good thing, because views/tables are just bare, naked pieces of elements you can interact with (a sort of gate to the underlying db). by using HTTP in a restful manner we define what a view can do for us. (I hope I could explain myself) a function on the other hand is something isolated. a closed system. like a mathmatical function it expects an input and provides an output. modification is a must have for this to work. now let's say we apply those HTTP conformity to procedures as well. what is the outcome? we treat functions the same way as if they were an element without its own functionality, like a view is kind of. so with all the functionality a procedure in psql comes with: isolation levels, security definers, and volatility, we should treat those functionalities as a second layer on top of the HTTP layer. In fact, when I think about it, we dont apply HTTP in a restful way on procedures like we do it on views/tables. since there is no DELETE or PATCH available for procedures, it is somehow a half baked thing to say that we try to comply with the HTTP specs for one part (a GET should be safe and idempotent) but the other above mentioned two verbs are missing completely because they would make no sense on a function. which is true. so I say, let the function define its own rules (you can make them read only, write only, read writable, isolated, secure, non volatile, etc.) and treat them for what they are: a remote procedure call. |
you mentioned few good points, which really point out the limitations I see in this and why I have such a trouble:
my application has no web UI or website. it is a pure cli application. yes I can use third party tools and eg. let the user put the activation code inside my cli app and use libs to make a POST request. I see no problem in finding workarounds to this. for me it is more about the principle here. your second mentioned "limitations" as I see it:
again I can do workarounds here but I really have a hard time to understand why we think this way. by no means I see this as a wrong decision or as a problem per se. it's just that we somehow hide or handwave away the fact that psql is able to do this for us and we would imo not violate valuable conventions. I am not sure to say this but I think people would welcome this. I guess I see your point and reasoning but I can't understand it. |
we then would reduce the HTTP verbs on procedures to the point where we use two verbs, only for defining how we want to pass arguments, like in old days where you had the choice of either using GET with all your params as query strings, or a POST action on a form. this way we give people the freedom to do whatever they want to do with a procedure and let them choose how they want to pass their arguments to it. |
Except it's not. It can modify the state of the database. If this is acceptable behavior then a user may call it with POST. Otherwise use GET and enjoy the protection elegantly provided by read-only transactions.
Many people resort to RPCs in PostgREST in order to provide views that require more sophisticated queries than are currently possible via the URL query building functionality. The rpc-as-view thing is a common use case.
Sorry, as you mentioned it's impossible to add those verbs for functions so we're doing the best we can in providing the verbs that do make sense, and providing them correctly. IMHO abusing a GET request does not improve the situation. I think maybe you've run into a limitation of href links and/or are conflating a program-level API with a user interface and now seek to introduce a shortcut to HTTP itself that will weaken the assurances enjoyed by other PostgREST users. I hope this doesn't come across as too dogmatic, like I'm hypnotized by RFC 2616, but I see value in honoring the meaning of the protocol. I'm curious what other people have to say in this issue. Edit: I appreciate you debating my position on this though. 😄 Most design decisions on this project have gone through prolonged debate, and perhaps other people have more points to add to this discussion. |
I knew this would come. @begriffs I highly appreciate your effort for making things right and provide us with this great piece of software. by no means I am trying to find a solution to my own problem here. as I've said I can find ways to do a workaround to this. I already have one workaround which works and it's ok and I would stick with it. the thing is just, the more I thought about this (when I was facing this issue) the more it made sense to me to start this discussion because I somehow felt like we would treat functions in psql in a "wrong" way. now to your point:
what I mean by "isolated, closed system" is that a procedure comes with functionalities which makes it behave exactly how you want and what it should do (rather views which are not able to do so, so we outsource that on to the HTTP layer, restful, spec'ed). let me give you an example: one way to do this is eg. let the user decide what he expects the function to do:
now it shouldn't make any difference to the user if it was called with a GET or a POST. I mean, what is a user supposed to expect on a GET? that the function shouldn't delete at all? it is the whole purpose of this function to delete something.
sure, no problem with that. I do it here and there, too. now, let's say it is a function that exactly does this, mimicing a view (with some more features a normal view can't give you). then I say:
I made the transaction read-only. a POST in what ever form it comes to manipulate the data (wonder how, because it is just a |
maybe worth mentioning:
.. that these functions are not normal restful things. The functions can have arbitrary and surprising behavior, not the standard “post creates a resource” thing that users expect from the other routes. so we (you) already realized that an RPC is something different. because it is purely dependent on the user what it does. by saying, we serve you with safetyness on a GET verb is imo restricting a user in his abilities to make things with a procedure. of course a POST to that very same function will make it work, but then I wonder why we restrict it this way. what makes a GET different than a POST. can you give me one example to this? maybe the rpc-as-view you mentioned? because when I look at the different given examples in our issues list, for example:
now how would a GET differ from a POST to all stored procedures you will find in the above listed issues? actually, I just looked into my code (hoped to find a good example for an rpc-as-view misuse of a procedure), and I couldn't find one. so no, I don't use a single one as a view replacement. maybe you can provide one example. |
ok so to summarize my points:
it somehow confuses me as you can see. my proposed solution:
thank you for your input on this @begriffs |
@omani if you want to hack your way out of the RPC in regards to GET, you can easily have the proxy rewrite the request verb and everybody is happy. A function as @begriffs said is not free to do whatever it wants and yet "lie" that it's idempotent/pure. You mentioned the docs in regard to the RPC thing, and indeed there is a conflict between what we are saying here and what the docs say. I think the docs are wrong and the /rpc prefix is a legacy that ideally should be eliminated. The outside world (users of the api) should be "in the dark" about what exactly is the implementation of the endpoint they are calling. Like i start out with a view |
The way I see it is that PostgREST must work with the medium, for exposing a SQL feature we always seek to comply with HTTP semantics, for example, though in SQL you can do a I agree with Ruslan that the "Resources are not storage items (or, at least, they aren’t always equivalent to some storage item on the back-end). The same resource state can be overlayed by multiple resources, just as an XML document can be represented as a sequence of bytes or a tree of individually addressable nodes. Likewise, a single resource can be the equivalent of a database stored procedure, with the power to abstract state changes over any number of storage items". Also, about DELETE/PATCH/PUT not working on stored procedures, from this comment: "HTTP operations are generic: they are allowed or not, per resource, but they are always valid." So as I see it, we can move stored procedures to root level and only allow GET/POST on them. Since a function and view/table can have the same name under the same schema in PostgreSQL, the downside is that we'd have to add some logic to differentiate them. |
@steve-chavez I like your proposal. I think both under root would be a good thing. |
Assume a table and function with the same name. The logic to differentiate them, can not depend on any type of custom request parameter (that makes the api-user aware of whats happening), because that would not be any better than / any different from having the Let's assume the very simple (simple as in function definition, but actually the one with the fewest hints available, so the hardest to make a decision about) case of the following: CREATE TABLE accounts (...);
CREATE FUNCTION accounts () RETURNS SETOF accounts AS $$
SELECT * FROM accounts WHERE my_custom_filter
$$ LANGUAGE SQL STABLE; No function arguments, same return type as the table itself. No hints. Since the decision to choose function or table is not about the intent of the api-user, we must guess the intent of the api-developer. Imho, there is only one valid assumption: The intent must be to replace one of the request methods to the "regular" table endpoint So in this case PostgREST should choose to call the function for a Now imagine a function with the same name, but marked as Now this can be extended to allow replacing only some of the mutating methods, e.g. by using CREATE FUNCTION accounts () RETURNS SETOF accounts AS $$
[...]
$$ LANGUAGE SQL SET request.method.handler_for = 'POST,PUT'; This would only replace Now the only thing remaining is: How do we handle replacing e.g. CREATE TABLE accounts (...);
CREATE FUNCTION accounts () RETURNS SETOF accounts AS $$
SELECT * FROM accounts WHERE my_custom_filter
$$ LANGUAGE SQL STABLE;
CREATE FUNCTION accounts (TEXT) RETURNS SETOF accounts AS $$
[...] -- $1 will hold the request body in TEXT format now
$$ LANGUAGE SQL STABLE SET request.method.handler_for = 'PATCH'; Since this argument is unnamed and unnamed arguments are not used at all so far by PostgREST for RPCs, it can be parsed and identified by the schema cache as the argument to pass the request body on - while at the same time not creating any confusion with query parameters or parsed json keys from the body, which could still co-exist. So far, so good, I guess. Thinking this through I imagine this providing exciting new possibilities in addition to the already built-in postgres features like views, triggers and rules. I think it would be a lot more useful even, with a way to pass in / use the "conditions for horizontal filtering" inside or before the function call. Otherwise calls to A more than vague idea would be to maybe run a Admittedly, this now warrants the "enhancement" label a bit more than the "hygiene" label, I guess ;). |
ran into this whith the GET. My use case is having a protected function and a cached table, where the function first checks the table for return. If there's data to return and within the TTL, it simply returns. If theres no data, then it triggers another function which queries the data and return sthat. Finally, if there is data and it's stale, then we delete the cached data, and update the TTL and return a new query. As far as the client is concerned, this is all invisible and it's just a GET request. The API's i'm confirming to are simply get's for geojson so no real control on those APIs. |
On #1582, we came up with the idea to represent media types as domain aliases, like Since this is about URL paths, I think we can do the same to customize them.
It might even be possible to encode a url parameter inside the name
Which then can be obtained on the function or view as This would allow to override the |
I think the above is pretty cool as we can specify the more "conventional" REST routes in a declarative way like: CREATE VIEW "./projects/:id" AS
SELECT *
FROM projects
WHERE id = current_setting('request.path.id`, false); Notice the We could even fail before the SQL query is ran, at our planning level, since the path in the name would allow us to do a static check. So the above allows us to make a parameter obligatory. On #915 we discussed the idea of enforcing a range for allowing GROUP BY + aggregates. We could define that as: CREATE VIEW "./projects/:start/:end" AS
SELECT *
FROM projects
WHERE id BETWEEN current_setting('request.path.start', false) AND current_setting('request.path.end`, false); And allow doing: GET /projects/1/1000?select=aggregate(..)&group_by=... Also related to #1710. One thing good about this way is that we don't need to provide all the query parameters as GUCs, only the path parameters defined in the name of the db object. One drawback of this approach is that pg identifiers have a max character length: create or replace function "/projects/:id/clients/:client_id/users/:user_id/subscribers/:subscriber_id"() returns text as $$ select ''::text; $$ language sql;
NOTICE: identifier "/projects/:id/clients/:client_id/users/:user_id/subscribers/:subscriber_id" will be truncated to "/projects/:id/clients/:client_id/users/:user_id/subscribers/:su"
CREATE FUNCTION Still useful though and I recall this was an issue pg hackers were interested in solving. |
That's neat. We gotta be careful, though: If one transaction / request sets postgres=# SELECT current_setting('request.path.id', false);
ERROR: unrecognized configuration parameter "request.path.id"
postgres=# BEGIN;
BEGIN
postgres=*# SET LOCAL request.path.id TO 'test';
SET
postgres=*# SELECT current_setting('request.path.id', false);
current_setting
-----------------
test
(1 row)
postgres=*# ROLLBACK;
ROLLBACK
postgres=# SELECT current_setting('request.path.id', false);
current_setting
-----------------
(1 row) If the So we should not encourage the use of CREATE VIEW "./projects/:id" AS
SELECT *
FROM projects
WHERE id = nullif(current_setting('request.path.id`, true), ''); This would not throw errors without an id, but at least return consistent results. |
So apparently #561 was made due to the issue of not being able to treat read-only dbs (replica slaves) in a proper way.
another reported issue is #547. OP demonstrates a simple curl to localhost on a replicated psql server. it fails due to having the transaction mode set to read-write. which makes sense, since it was requested on a read-only slave.
impact
the above merge practically makes any rpc which wants to modify the db unusable with a GET request even if it is volatile by default (non stable, non immutable). as seen in https://github.com/begriffs/postgrest/blob/108f3cd651a448fad065faf81a2090cee8a8d9fb/src/PostgREST/ApiRequest.hs#L169.
my proposed solution
if false, use read-write transaction. this enabled the GET verb on rpc calls which are volatile.
request for comments.
The text was updated successfully, but these errors were encountered: