Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit test optimization toggles from LazyFrame #405

Merged
merged 13 commits into from
Oct 10, 2023
Merged

Conversation

Sicheng-Pan
Copy link
Collaborator

#325

Update nix flake and write the unit test for LazyFrame optimization toggles.

@Sicheng-Pan
Copy link
Collaborator Author

Currently the unit test is doing nothing because there is no way to extract optimization toggles after setting them in functions like collect

@sorhawell
Copy link
Collaborator

Hi @Sicheng-Pan thx for looking into this :)

how about setting optstate on a lazyframe with internal functions and then check if actually set with this new getter ?

@Sicheng-Pan
Copy link
Collaborator Author

Hi @Sicheng-Pan thx for looking into this :)

how about setting optstate on a lazyframe with internal functions and then check if actually set with this new getter ?

@sorhawell Sounds good to me. I've already exported this getter function to R and I will write an internal setter as well. May be we could also make the setter public and allow the users to keep existing optimization settings in functions like collect?

@sorhawell
Copy link
Collaborator

How about having some global option defining if optstate should be set via an external setter or vi collect etc? Then the option could be used in collect.

We should let default behavior match py-polars when there is no clear need otherwise.

Some edge casesnto consider:
What if opt state params is set both by external setter and via collect?

What if two lazyframes are joined, what shouldhappen to optstate?

How does collect know if a lazyframe settings should not be altered, because a user has already set them?

@Sicheng-Pan Sicheng-Pan marked this pull request as ready for review October 2, 2023 18:26
Copy link
Collaborator

@sorhawell sorhawell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a handful of bugs and error handling in flow in LazyFrame__lazy.R
there could be more unit tests to tease out these errors

@@ -766,3 +766,18 @@ test_that("unnest", {
to_data_frame()
)
})

test_that("opt_toggles", {
lf = pl$LazyFrame(mtcars)$select(pl$col("mpg") * 0.42)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try add some more unit tests for sink_parquet etc to confirm no errors

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a few more tests.

self |>
.pr$LazyFrame$optimization_toggle(
if (isTRUE(inherit_optimization)) {
self
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this self is not being fetched

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

if (isTRUE(inherit_optimization)) {
self
} else {
self$set_optimization_toggle(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any error in set_optimization_toggle will be raised and escaped the final unwrap

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider using .pr$LazyFrame$set_opt... or wrap entire block in result() to catch any errors

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx I will review it soon

@sorhawell sorhawell self-assigned this Oct 9, 2023
Copy link
Collaborator

@sorhawell sorhawell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks @Sicheng-Pan !

@sorhawell sorhawell merged commit b64a700 into main Oct 10, 2023
11 checks passed
@sorhawell sorhawell deleted the test_opt_toggle branch October 10, 2023 22:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants