Improve workflows with filtered DataFrame #2354
Comments
Now given #2211 (comment) and #2211 (comment) my question is if we should not also add
This is essentially as adding
reads better than:
we do not save typing here, but maybe you will find it more readable? |
I think I would prefer a kwarg. More generally, I think that the decision (and also the default of the kwarg option) should be taken in conjunction with the discussion for transform/select etc... #2314 |
A good point. But I would not alter the defaults (sorry for this position here but I do want to avoid breaking changes of this kind till 1.0). |
The only issue is that |
Yes. Personally, I prefer an argument that always has the same name (skipmissing), even if it has slightly different meanings in different contexts, rather than a different keyword argument every time. I am not a fan of how R has na.rm, na.exclude, na.omit etc. |
(ok - and I would appreciate a comment in #2314 what you think should be the results on the cases I have listed there 😄) |
I was thinking about these things and I would close #2211, #2314 and #2258 in favour of this issue, in which I would aim to resolve all the issues raised with one consistent design. So my proposal would be the following:
So the only thing that does not go as you want is "sticky behavior", which I think we should have for two reasons:
The decision to be made is if we want automatic column promotion for such A side benefit is that it will be much easier to implement it than doing all new stuff for Now how it addresses the issues I mention:
What do you think? |
Makes sense. But I'm not sure it would really fix #2258 and #2314, as repeating |
I agree with @nalimilan. I think I have been confusing two distinct issues in my comments, which I apologize for.
|
Ah - OK. So for #2258 and #2314 it would be a "partial fix" (via a more general mechanism, which has it as a special case, but I agree we can think of more convenient patterns). So do you think what I propose is OK for resolving #2211 (and having a partial solution for the other issues does not hurt, but of course let us discuss more convenient options)? |
If the goal is to make it easier to replace certain rows of a column when a condition is satisfied, I am not sure it is worth it. That is because I do not think that the pattern parent(transform!(filter(:x => >(1), df, view = true), :x => 1)) is simpler than transform!(df, :x => x-> ifelse.(x .>= 1, 1, x)). |
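For reference, the two patterns being compared can be sketched side by side. This is only an illustration on made-up data: the column values are hypothetical, the view is updated with broadcast assignment rather than `transform!`, both variants use the same `x > 1` condition, and the explicit `=> :x` target is added so the column is overwritten rather than a new one created.

```julia
using DataFrames

# Hypothetical toy data, not taken from the thread.
df_view = DataFrame(x = [0, 1, 2, 3])
df_ifelse = copy(df_view)

# Pattern 1: filter to a view, then assign through the view into the parent.
sdf = filter(:x => >(1), df_view, view = true)  # SubDataFrame of rows with x > 1
sdf[:, :x] .= 1                                  # in-place write, visible in df_view

# Pattern 2: a single elementwise ifelse over the whole column.
# The explicit `=> :x` target overwrites the existing column.
transform!(df_ifelse, :x => (x -> ifelse.(x .> 1, 1, x)) => :x)

println(df_view.x)
println(df_ifelse.x)
```

Both data frames end up with the same contents here, which is the point of the comparison: for a simple elementwise replacement the view adds little.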
Yes, but all that was asked for in #2211 can be handled by @pdeffebach - given the discussion we have what use cases of #2211 do you see? (even if we do not add #2211 functionality, still we can add |
I agree with you both. I don't think this is that important. I do like the way Stata reads Therefore I agree that we should continue thinking about better ways to make |
OK - so do we close #2211, or keep it open to get back to it in the future? (for me it is easier not to have #2211 as it will complicate codebase because it requires special handling in many functions, but if you feel we can go back to it some day then let us keep it open)
Agreed - and thank you for spending your time on this. |
Let's close it if it's easiest. I think DataFramesMeta can handle a lot of its points. |
Now thinking of it I realized that the issue also is that Also
|
Yes, that is one problem. I've had that before and been frustrated by it. We really need a lazy version of this in Base to be honest... Another issue is that "generate a standardized income index, among women, for all women" is tough.
The Then again, I also am not sure what
does in Stata. I don't know how the variables are subsetted. |
Right, I did not think about that. The ifelse pattern can only be used in transform for functions that are elementwise, not for functions that compute reductions. |
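The "standardized income among women" case mentioned above shows why a reduction does not fit the ifelse pattern. A minimal sketch, assuming hypothetical column names (`female`, `income`, `income_std`) and made-up values; the target column is pre-allocated so the view only writes into existing storage:

```julia
using DataFrames, Statistics

# Hypothetical toy data, not taken from the thread.
df = DataFrame(female = [true, false, true, true],
               income = [10.0, 50.0, 20.0, 30.0])

# Pre-allocate the result column; rows outside the subset stay missing.
df.income_std = Vector{Union{Missing, Float64}}(missing, nrow(df))

# View of the rows for women; reductions are computed on this subset only.
women = filter(:female => identity, df, view = true)
m, s = mean(women.income), std(women.income)

# Write the standardized values through the view into the parent.
women[:, :income_std] .= (women.income .- m) ./ s

# By contrast, mean(df.income) inside an ifelse over the full column would
# also include the rows for men, so the reduction would be taken over the
# wrong set of rows.
println(df)
```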
But ultimately I agree that this kind of ifelse pattern could be handled by a macro or some sort of struct with lazy broadcasting. It's a tool that has general convenience beyond dataframes and thus doesn't have to live here. |
Yes, but it should not be |
Given the decision in #2314 do we need the functionality described in #2211 or not? Or maybe it is enough if we have this functionality for (i.e. do we need mutating |
This has been discussed in several places. I am creating a separate issue to keep track of it, as I think it is important functionality. What we want is
filter(predicate, df, view=true)
to return a view (so then we can conveniently update this view for example).
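A small sketch of the requested behavior, assuming a DataFrames.jl version in which `filter` accepts the `view` keyword; the data are made up for illustration:

```julia
using DataFrames

df = DataFrame(x = [1, 2, 3])   # hypothetical toy data

copied = filter(:x => >(1), df)               # default: a new DataFrame
viewed = filter(:x => >(1), df, view = true)  # a SubDataFrame backed by df

copied[1, :x] = -1    # changes only the copy
viewed[1, :x] = 100   # writes through to row 2 of df

println(df.x)
```

The copy and the view select the same rows, but only the assignment through the view changes `df`.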