Allow strings in `field/2` #4384

greg-rychlewski · 2024-02-24T04:34:20Z

I'm worried I'm missing something obvious. But I think with these 2 simple changes to the schema functions we can allow string names in field/2.

The `__schema__(:type, field)` functions will now catch string names and get the right type. And the `__schema__(:field_source, field)` functions will ensure the field is converted back into an atom during normalization so that the adapters don't have to change.
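To make the idea concrete, here is a minimal sketch of reflection clauses that accept either an atom or a string name. `FieldSketch.Post` and its field list are hypothetical stand-ins written by hand; this is not the code Ecto generates, only an illustration of the approach described above.

```elixir
defmodule FieldSketch.Post do
  # Hypothetical field list; a real schema would declare these with `field/3`.
  @types [title: :string, visits: :integer]

  for {name, type} <- @types do
    # Existing behaviour: look the type up by atom name.
    def __schema__(:type, unquote(name)), do: unquote(type)
    # Extra clause: the same lookup by string name.
    def __schema__(:type, unquote(Atom.to_string(name))), do: unquote(type)

    # The field source always comes back as an atom, so code that only
    # understands atoms does not have to change.
    def __schema__(:field_source, unquote(name)), do: unquote(name)
    def __schema__(:field_source, unquote(Atom.to_string(name))), do: unquote(name)
  end

  # Unknown names fall through to nil.
  def __schema__(:type, _), do: nil
  def __schema__(:field_source, _), do: nil
end
```

Because the string clauses are generated at compile time from the known field list, no atom is ever created from a runtime string in this sketch.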

josevalim · 2024-02-24T08:36:22Z

Interesting. What happens if there is no schema?

greg-rychlewski · 2024-02-24T08:42:08Z

Ah good catch, I knew I forgot something. Then it fails but it can be fixed by changing the guard in the adapter here: https://github.com/elixir-ecto/ecto_sql/blob/master/lib/ecto/adapters/postgres/connection.ex#L835 and here https://github.com/elixir-ecto/ecto_sql/blob/master/lib/ecto/adapters/postgres/connection.ex#L841

josevalim · 2024-02-24T08:49:04Z

I think if we want to go down this route, we should rather go in the other direction. Have field_source return strings and then send it all the way down to the adapter as strings. If there is no field source, we normalize it to strings on the spot?

As long as it is a small change on the adapter side, we should be good. :)

josevalim · 2024-02-24T08:49:39Z

Although converting to strings on adapters such as etso and ecto_mnesia will likely cause other problems.

greg-rychlewski · 2024-02-24T08:58:56Z

It makes the inspection a bit weird too if it's normalized to string. I had to make a special condition to make the inspect look normal. I just pushed a change

josevalim · 2024-02-24T15:22:54Z

Thanks @greg-rychlewski. I am honestly a bit unsure. It feels we are at the worst of both worlds: sometimes they are atoms, sometimes they are strings. What do you think about:

1. If there is a schema, we always normalize them to atoms in the planner (as long as the schema module has been loaded because we called a previous operation in it, String.to_existing_atom should be enough).
2. For schemaless queries, we always keep them as strings.

Perhaps we can start with step 2 for now (and raise if a string is given to a schema source)?

greg-rychlewski · 2024-02-24T16:29:33Z

That sounds like a good compromise. I'll update the PR.

There was one thing I was having trouble figuring out for inspect. I believe the dot notation for the fields does not play nice with strings. For example, atoms produce this

    query = from p in Post, select: field(p, :visit)

    IO.inspect query
    # "from p0 in Inspect.Post, select: p0.visit"

But strings produce this

    query = from p in Post, select: field(p, "visit")

    IO.inspect query
    # "from p0 in Inspect.Post, select: p0 . \"visit\"()"

I was previously handling it by transforming it to field inside of the inspect. But if we will normalize to string then it could be confusing if there is an error during planning and the user didn't use field/2. The only other thing I could think is to not use the dot notation, but then the inconsistency is probably confusing.

Do you know if there is anything I can do to make the string play nice with the dot notation?

josevalim · 2024-02-24T17:51:12Z

My only suggestion is to indeed convert it back to field(..., ...) when pretty printing/inspecting.

greg-rychlewski · 2024-02-24T18:23:48Z

I had one idea. Would you be against tagging the field/2 stuff all the way down to the adapter and then have the source resolved in the adapter? That way we could normalize everything given to field/2 to string and all the non field/2 stuff will stay as atom.

The reason I'm suggesting this is because when I tried normalizing schemaless sources to string it started to affect a lot of things I didn't expect like subqueries and joins. Doing it this way would minimize the blast radius and still have some kind of internal consistency. And either way we are asking adapters to change something.

josevalim · 2024-02-24T19:58:15Z

To be clear, you are saying this:

field/2 with a string arrives as a field in the AST all the way down to the adapter? That sounds good to me.

But we still need to decide what to do with strings and schemas. We have two options:

1. We say string fields are not checked, typed, or converted in any way (akin to a fragment)
2. We say string fields are validated and normalized accordingly

greg-rychlewski · 2024-02-25T02:32:09Z

I'm starting to feel like I completely misunderstood one of your earlier replies. When you said this

> I am honestly a bit unsure. It feels we are at the worst of both worlds: sometimes they are atoms, sometimes they are strings. What do you think about:
>
> 1. If there is a schema, we always normalize them to atoms in the planner (as long as the schema module has been loaded because we called a previous operation in it, String.to_existing_atom should be enough).
> 2. For schemaless queries, we always keep them as strings.

Were you talking about field/2 specifically? I read it as any field notation, for example p.title as well but now I think I was mistaken.

josevalim · 2024-02-25T09:14:21Z

At the time I did not mean field/2 but later on I certainly thought it should be only about field/2. :)

greg-rychlewski · 2024-02-27T18:58:12Z

Let me know what you think of this:

field/2 with string name

* always normalizes its name to an atom when dealing with schemas
* keeps its name as a string when not dealing with schemas
* has its type validated when dealing with schemas. It's handled by adding guards to the reflection functions here:
  * ecto/lib/ecto/schema.ex, line 698 in 4876779: `{args, when_expr, body} ->`
  * ecto/lib/ecto/schema.ex, line 2327 in 4876779: `types_quoted =`

field/2 with atom name

* no changes, though I'm not sure if you'd prefer to normalize its name to a string when dealing with schemaless sources

greg-rychlewski · 2024-02-27T19:37:33Z

we could also make it so that field/2 with a string always emits a string and field/2 with an atom always emits an atom. for both schema/schemaless.

josevalim · 2024-02-28T10:46:45Z

My thinking would be that, if you pass a string, then it is kept as is and it would always be the field source. We never transform or manipulate it in any case.

greg-rychlewski · 2024-02-28T11:00:35Z

Do you think it would still be ok to check its type against the schema? For instance, if someone wanted to do this

    field = get_user_input_as_string(...)
    query = from p in Post, where: field(p, ^field) == ^"some_string"

To work properly the parameter would have to be cast to the type of field(p, ^field).

Or alternatively they could wrap field(p, ^field) inside of type/2 but then it's kind of the same problem where they have to dynamically find the type they want given a string field name.

josevalim · 2024-02-28T11:03:41Z

It really depends which problem we want to solve. Do we want to make it easier for people receiving parameters to create custom queries or do we want to make it possible for you to dynamically query databases without generating tons of atoms?
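The atom concern can be seen directly on the BEAM: atoms are never garbage collected, so every new name converted with `String.to_atom/1` permanently grows the atom table. The snippet below is a small sketch of that effect; the `dynamic_column_` prefix is only an illustrative name.

```elixir
# Number of atoms currently in the VM's atom table.
count_before = :erlang.system_info(:atom_count)

# A column name that has not been seen before, e.g. one taken from user input.
name = "dynamic_column_#{System.unique_integer([:positive])}"

# Converting it creates a brand new atom that is never reclaimed.
_atom = String.to_atom(name)

count_after = :erlang.system_info(:atom_count)

IO.puts("atoms added: #{count_after - count_before}")
```

Keeping such names as strings all the way down to the adapter avoids this growth, which is the second of the two goals in the question above.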

greg-rychlewski · 2024-02-28T11:21:39Z

I would say this is the bigger issue right now:

> make it possible for you to dynamically query databases without generating tons of atoms

In which case I think you are saying this is not an issue currently for schemas. Because if you are using schemas then you know up front which strings are allowed and can handle it without this change?

If my above statement is not mistaken, then I think even though it's a bigger problem, it might be a bit confusing why we are not solving this problem at the same time if it's not too hard to do so:

> make it easier for people receiving parameters to create custom queries

greg-rychlewski · 2024-02-28T11:28:08Z

I think I get it though. Basically it's about staying consistent using atoms when referring to schema fields right? If so it's ok to me. I could change the PR to raise if string is used for schemas.

josevalim · 2024-02-28T11:28:49Z

The problem I think is that we have two acceptable semantics. One is to make it a direct field source access and the other is to make it automatically cast. When we pick one, we exclude the other. For schemaless fields they are the same, however.

josevalim · 2024-02-28T11:31:11Z

> In which case I think you are saying this is not an issue currently for schemas. Because if you are using schemas then you know up front which strings are allowed and can handle it without this change?

Yes, you could do String.to_existing_atom, although we can argue that we could have a more convenient API. The question is if it should be field or something else.
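For readers following along, a minimal sketch of that workaround might look like the following. `FieldNameSketch` and its allowlist are hypothetical; with a real schema the allowlist could plausibly come from `Post.__schema__(:fields)`.

```elixir
defmodule FieldNameSketch do
  # Hypothetical allowlist of schema fields, assumed for illustration.
  @allowed [:title, :visits]

  # Converts a string to a field atom without ever creating a new atom.
  def to_field(name) when is_binary(name) do
    atom = String.to_existing_atom(name)
    if atom in @allowed, do: {:ok, atom}, else: :error
  rescue
    # Raised by String.to_existing_atom/1 when no such atom exists yet.
    ArgumentError -> :error
  end
end
```

The returned atom can then be passed to `field/2` as usual, for example `field(p, ^atom)`, while unknown names are rejected before they reach the query.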

josevalim · 2024-02-28T11:43:59Z

For example, just for brainstorming purposes, it could be called field! or cast_field or something, given the field may not exist and the query may fail.

greg-rychlewski · 2024-02-28T13:02:28Z

> The question is if it should be field or something else.

One reason I can think to keep it as field/2 is because I'm seeing other areas that users might want opened up. For example, if someone has a need for field/2 with string then probably someone would also have a need for map/2 with a string list.

If we go down the route of making a new function for each one it might be cumbersome for us to maintain and for users to use. So it might be worthwhile to relax the definition to include strings in the existing functions and make them have the same semantics as atoms.

greg-rychlewski mentioned this pull request Feb 24, 2024

allow string names in field/2 elixir-ecto/ecto_sql#596

Open

greg-rychlewski added 4 commits February 27, 2024 13:44

allow string field names

b257434

add test

4bbc837

inspect

5a73206

review comment

4876779

greg-rychlewski force-pushed the string_field_name branch from afa4369 to 4876779 Compare February 27, 2024 18:53
