Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewriting variables in where and by #1002

Open
Tracked by #967
LeaVerou opened this issue Nov 28, 2023 · 9 comments
Open
Tracked by #967

Rewriting variables in where and by #1002

LeaVerou opened this issue Nov 28, 2023 · 9 comments

Comments

@LeaVerou
Copy link
Member

This is part of #967 and depends on #975.

When we started this effort, I did say that where is a little trickier to rewrite. With everything else we can just extract top-level variables, distinguish globals, Mavo functions, and variables that come from properties, prepend them with globalThis, $fn, or $data accordingly and we're done.

Rewriting where and by involves an additional challenge: When we write something like foo where bar = 3, we don't know until runtime if bar is a property of foo, or a property in that scope. In the functional form of the filter, this would be clear: filter(foo, foo.bar = 3) for the former and filter(foo, bar = 3) for the latter.

Right now this is done by adding a scope() function, which can be used independently, implemented entirely in mavoscript.js as a serializer . To implement that, an overcomplicated workaround is used involving another with, quite possibly the most complicated piece of code in the entire codebase (competing only with the stuff in data.js and primitive.js). It is probably not a good use of your time @adamjanicki2 to try and understand how this code works, since it's written that way to overcome challenges the new approach doesn't have in the first place.

Instead, I think it's better to write an algorithm from scratch, keeping in mind that:

  • While it's common for the first operand to be a plain identifier, it could be a whole expression. We don't want our rewriting to cause it to have to be evaluated multiple times, since it could be a pretty expensive expression.
  • Any (TL) variable in the second operand could be either an implied descendant of the first operand, or a regular variable.
  • $this should not be touched.

Basically, for every top-level variable in the second operand of where that is NOT $this, we want to rewrite it in a way that looks on operand 1 first, then falls back to regular resolution. Essentially foo where bar = 3 would become filter(foo, firstDefined(foo.bar, bar) = 3) where firstDefined() would be something like (...args) => args.find(a => a !== undefined), but that won't work because foo and bar are proxies and never return undefined. So it will need some digging to properly tell these cases apart. Also, we don't want to be evaluating foo twice, especially if it's an expression, so we'd need to assign it to a variable.

Actually, I think we should do this before #976.

  1. It's lower stakes, so it allows us to take vastly out for a spin
  2. It removes a huge wart from the codebase
  3. This rewriting needs to happen before the rewriting in Rewrite expression variables to start with $data. #976 anyway (i.e. the result of the rewrite will still go through the regular rewriting pipeline, and that is something to keep in mind)

While we're at it, we could also fix existing longstanding where bugs like #740

by has similar issues, though I think right now we don't treat it the same way and we just assume that every variable in the 2nd operand is implicitly scoped to the first operand.

It's unclear if we still need a scope() function after this, but it probably wouldn't be too hard to implement if so.

@karger
Copy link
Collaborator

karger commented Nov 29, 2023

If you're beginning rewrites that affect scoping, perhaps it is worth opening the discussion of what the scoping rules for mavo should be? In particular, Mavo's procedure for resolving x.foo when x does not have a foo property is to resolve foo in the scope of x which I think is generally wrong and is certainly a big source of bugs.

@adamjanicki2
Copy link
Collaborator

Yeah taking a closer look at scoping is definitely a good idea. @karger in the example you provided above, what does it mean that mavo tries to resolve foo in the scope of x?

@karger
Copy link
Collaborator

karger commented Nov 29, 2023

@adamjanicki2 it isn't clear it is a good idea because it will be messy and time consuming. but an example can be found here:
https://codepen.io/karger/pen/jOdvZXe?editors=1100

@adamjanicki2
Copy link
Collaborator

Hmm that's interesting, perhaps we can try to address this if we have time after taking care of the main issue which is moving mavo to ESM

@LeaVerou
Copy link
Member Author

If you're beginning rewrites that affect scoping, perhaps it is worth opening the discussion of what the scoping rules for mavo should be? In particular, Mavo's procedure for resolving x.foo when x does not have a foo property is to resolve foo in the scope of x which I think is generally wrong and is certainly a big source of bugs.

I agree, but this will be far easier to resolve after these changes (some of it may even resolve itself simply because its root cause is that in withwe can't tell the difference between foo.bar and "bar evaluated within foo").

@adamjanicki2
Copy link
Collaborator

adamjanicki2 commented Nov 29, 2023

Rewriting where and by involves an additional challenge: When we write something like foo where bar = 3, we don't know until runtime if bar is a property of foo, or a property in that scope. In the functional form of the filter, this would be clear: filter(foo, foo.bar = 3) for the former and filter(foo, bar = 3) for the latter.

What's an example when the second clause of the where is not a property of the first? I read the mavo docs and am still confused. I think what I'm confused about is why, if bar is not a property of foo, it even checks elsewhere for resolution; what's a use case where this would be useful? I'm sure there is one, I just don't see it currently 😅

Nevermind I think I get it; there could be cases where unrelated variables are used in the second clause, for example, this expression: usedLetters where (state = correct and !stateHidden)

My brain is definitely tired today 😆

@LeaVerou
Copy link
Member Author

LeaVerou commented Nov 29, 2023

Rewriting where and by involves an additional challenge: When we write something like foo where bar = 3, we don't know until runtime if bar is a property of foo, or a property in that scope. In the functional form of the filter, this would be clear: filter(foo, foo.bar = 3) for the former and filter(foo, bar = 3) for the latter.

What's an example when the second clause of the where is not a property of the first? I read the mavo docs and am still confused. I think what I'm confused about is why, if bar is not a property of foo, it even checks elsewhere for resolution; what's a use case where this would be useful? I'm sure there is one, I just don't see it currently 😅

Oh it's super common. E.g. consider person where age > 30. Now think that instead of 30 we could have a property so that the end user can customize the filter: person where age > min_age.

@karger
Copy link
Collaborator

karger commented Nov 29, 2023

I frequently end up with really confusing where clauses like item where label=label . I can never remember later which of the two labels is evaluated in the context/scope of item and which refers to some other property to which I'm comparing it. I eveentually figure out that item where item.label=label is clearer.

@adamjanicki2
Copy link
Collaborator

Yeah I think in general where is far more confusing than a filter since there's no ambiguity about whether something is a property of the item being filtered or its own thing, but for a person with limited coding experience, I could see how using a simple where clause would be more intuitive than using filter.

I'm going to think about this tonight and tomorrow before rushing into implementing it; this is certainly a messy thing to fix

@LeaVerou LeaVerou added Status: 4. Ready to implement We know how to implement this, but who will do it? Topic: Formulas labels Jan 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants