De-functionalize query internals #989
This just makes the intended use clearer.
This makes it clearer it's a boolean.
This is in line with what we do elsewhere.
This is the primary change. Now `query/2` takes the dataframe as an explicit argument instead of an implicit, unhygienic variable.
There is another concern here related to exposing Then we change If so, I'd do the following changes:
@josevalim I think you're on the right track. Let me clarify my goal. I want to make functionality like Polars expressions more 1st class. I want users to be able to directly create/manipulate an expression-like data structure much like you can do with an We kind of have expressions already in the form of I like the idea of a Here's my understanding of the concepts at play:
All that said, I'm pretty open to other suggestions on how to make this work! As I found out in the other PR, it's a bit of a tricky needle to thread.
I think the table is missing one entry, which is that we need a QueryFrame that, when accessed, returns LazySeries. That’s different from a lazy frame (the result of `DF.to_lazy`).
Ok thanks this is good food for thought. I think I need to play with a few options. I want to make sure Once I have something I'll write it up. I'll also maybe move this to an issue. I'm realizing I'm still too much in the designing phase. |
I think this PR is almost there. To get started, you could:
And if you change nothing else, it should be what you want. Everything else I mentioned is cleanup/refactoring. It is just that I tend to think bottom-up. :)
I did the naive thing but then I realized that this is now possible:
alias Explorer.{DataFrame, Query, Series}
df1 = DataFrame.new(a: [1, 2, 3])
df2 = DataFrame.new(b: [4, 5, 6])
qf1 = Query.new(df1)
qf2 = Query.new(df2)
c_lazy = Series.add(qf1["a"], qf2["b"])
DataFrame.mutate_with(df1, c: c_lazy)
I'm not sure this is what we want. We'll probably at least want better error handling.
Co-authored-by: José Valim <[email protected]>
Beautiful! We only need docs and this is good to go IMO!
This is a nice improvement! 💯
Sorry for the delay! Notes:
* Remove an extra "the"
* Reference the rewritten section in the `new/1` docs
Great @billylanchantin! I have added some feedback to the docs. In a nutshell, the implementation-detail docs go on for too long and we have no documentation on how and why to use the
Ok I think that's better. Thanks for the review :) Do we still like the |
This is great. The name QueryFrame is also much better to me. My only remaining question is whether we want to rename LazySeries to QuerySeries, but I think they are not equivalent to QueryFrame (as in, you can actually use a LazySeries in most series operations, but you can't do that with a QueryFrame).
Yes that's my take too. "Query" is good in |
Co-authored-by: José Valim <[email protected]>
Description
This is step 1 of implementing the ideas from this PR: `%LazySeries{}` #944
From @josevalim:
Before, if you wanted to use `filter_with/2` and friends, you had to write a callback. You can still do that. But now you can also do:
Changes
* `_with` functions now accept the outputs of their callbacks too
* `Explorer.Query.new/1`
* `Backend.LazyFrame` as `Backend.QueryFrame`
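
To illustrate the first change, here is a minimal sketch contrasting the callback style with the direct style. It reuses only calls that appear in the conversation above (`Query.new/1`, access on the query frame) plus `filter_with/2` and `Series.greater/2`. It assumes an Explorer build that includes this PR; the variable names are made up for illustration and the final API may differ.

```elixir
# Sketch only: assumes an Explorer build that includes this PR.
alias Explorer.{DataFrame, Query, Series}

df = DataFrame.new(a: [1, 2, 3])

# Callback style (still supported): the function receives a frame
# whose columns are lazy series.
filtered_cb =
  DataFrame.filter_with(df, fn ldf -> Series.greater(ldf["a"], 1) end)

# Direct style (per this PR's description): build the query frame
# explicitly, then pass the lazy series straight to the `_with` function.
qf = Query.new(df)
filtered_direct = DataFrame.filter_with(df, Series.greater(qf["a"], 1))

# Both forms are expected to select the same rows.
IO.inspect(DataFrame.to_columns(filtered_cb))
IO.inspect(DataFrame.to_columns(filtered_direct))
```

As noted in the conversation, a lazy series built from one dataframe's query frame can currently be passed to a `_with` call on a different dataframe, so the direct style is best kept to a single dataframe until error handling is settled.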