Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert dt.isna() to FExpr #3444

Open
wants to merge 54 commits into
base: main
Choose a base branch
from
Open

Convert dt.isna() to FExpr #3444

wants to merge 54 commits into from

Conversation

samukweku
Copy link
Contributor

@samukweku samukweku commented Mar 13, 2023

WIP for #2562

@samukweku samukweku self-assigned this Mar 13, 2023
docs/api/math/isna.rst Outdated Show resolved Hide resolved
docs/api/math/isna.rst Outdated Show resolved Hide resolved
docs/api/math/isna.rst Outdated Show resolved Hide resolved
src/core/expr/funary/isna/fexpr_isna.cc Outdated Show resolved Hide resolved
src/core/expr/funary/isna/fexpr_isna.cc Outdated Show resolved Hide resolved
src/core/expr/funary/isna/fexpr_isna.cc Outdated Show resolved Hide resolved
src/datatable/math.py Outdated Show resolved Hide resolved
src/datatable/math.py Outdated Show resolved Hide resolved
tests/test-f.py Outdated Show resolved Hide resolved
docs/api/fexpr.rst Outdated Show resolved Hide resolved
@oleksiyskononenko
Copy link
Contributor

The reason this build is failing may have nothing to do with this PR

[2023-03-15T18:45:15.448Z] Cancelling nested steps due to timeout

[Pipeline] // node

[Pipeline] }

[2023-03-15T18:45:15.469Z] Failed in branch Build on x86_64-macos

When you push more updates, the build will be re-triggered and we will see if this was just a temp glitch on Jenkins.

docs/api/math/isna.rst Outdated Show resolved Hide resolved
tests/test-f.py Outdated Show resolved Hide resolved
src/core/expr/fbinary/fexpr__eq__.cc Outdated Show resolved Hide resolved
Comment on lines 58 to 60
out += ", reverse=";
out += reverse_? "True" : "False";
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this indent?

Copy link
Contributor Author

@samukweku samukweku Apr 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this section is under the else section, hence the indent so the } aligns under else. I have restored it to the previous form, however, i thought that was the idea behind indent, so any reader (me inclusive) can better understand the code (which section refers to if, else, etc, readability).

the if statement is indented, with the for loop shifted two spaces, so I only continued the form. As always, open to feedback on why this shouldnt be so. thanks @oleksiyskononenko

Comment on lines 107 to 111
for (size_t i = i1; i < i2; ++i) {
is_valid = col.get_element(i, &value);
fill_id = is_valid? i : fill_id;
indices[i] = static_cast<int32_t>(fill_id);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here as well, aligning the } bracket under the for

}

};
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange indent too... Is it possible to set up your editor, so that it doesn't introduce these indentation changes?

src/core/column/isna.h Outdated Show resolved Hide resolved
src/core/expr/fnary/rowcount.cc Outdated Show resolved Hide resolved
src/core/expr/funary/fexpr_isna.cc Outdated Show resolved Hide resolved
src/core/expr/fexpr_fillna.cc Outdated Show resolved Hide resolved
src/core/expr/fexpr_fillna.cc Outdated Show resolved Hide resolved
src/core/column/isna.h Outdated Show resolved Hide resolved
Comment on lines 44 to 60
static Column make_isna_col(Column&& col) {
switch (col.stype()) {
case SType::VOID: return Const_ColumnImpl::make_bool_column(col.nrows(), true);
case SType::BOOL:
case SType::INT8: return Column(new Isna_ColumnImpl<int8_t>(std::move(col)));
case SType::INT16: return Column(new Isna_ColumnImpl<int16_t>(std::move(col)));
case SType::DATE32:
case SType::INT32: return Column(new Isna_ColumnImpl<int32_t>(std::move(col)));
case SType::TIME64:
case SType::INT64: return Column(new Isna_ColumnImpl<int64_t>(std::move(col)));
case SType::FLOAT32: return Column(new Isna_ColumnImpl<float>(std::move(col)));
case SType::FLOAT64: return Column(new Isna_ColumnImpl<double>(std::move(col)));
case SType::STR32:
case SType::STR64: return Column(new Isna_ColumnImpl<CString>(std::move(col)));
default: throw RuntimeError();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to repeat the same code here and in rowcount.cc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do I go about making it generic? is there a directory to store such generic function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have proposed Isna_ColumnImpl::make_isna_column(Column&& col) here: #3444 (comment) Seems like a relevant place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh ... i missed it ... lemme have a look

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll defer to U on this one @oleksiyskononenko I don't know how to implement a static method in the isna.h file and still pass the template. How would you write it?

Copy link
Contributor Author

@samukweku samukweku Apr 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oleksiyskononenko I made a change to isna.h ... does that suffice? open to see how it can be improved, or done in the right way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's good for the moment. I will review the PR once it is green.

@oleksiyskononenko
Copy link
Contributor

oleksiyskononenko commented Apr 27, 2023

@samukweku As a part of this PR, let's not deprecate dt.isna() because it seems that it was deprecated by mistake. For the moment, let's keep both dt.isna() and dt.math.isna(). Hence, the corresponding f-method could be simply called .isna().

@oleksiyskononenko
Copy link
Contributor

Thanks for the changes, is this PR now ready for a review?

@samukweku
Copy link
Contributor Author

@oleksiyskononenko yet it is ready for a review

docs/api/frame/replace.rst Outdated Show resolved Hide resolved
docs/api/dt/isna.rst Outdated Show resolved Hide resolved
docs/api/fexpr.rst Outdated Show resolved Hide resolved
docs/api/fexpr/isna.rst Outdated Show resolved Hide resolved
docs/api/math/isna.rst Outdated Show resolved Hide resolved
tests/math/test-isna.py Outdated Show resolved Hide resolved
@@ -74,8 +74,10 @@ def test_isna_joined():
dt.internal.frame_integrity_check(RES)
assert RES.to_list() == [[True, True, False, True, False]] * 4


@pytest.mark.xfail(reason="scalar NA check not supported")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, was it ever supported? If yes, we probably need to restore it. I didn't notice we had this test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I cant see the python fallback for this though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we need to restore this functionality, who knows may be it is used by some projects.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems all the unary functions can take a python scalar as the argument: https://github.com/h2oai/datatable/blob/main/src/core/expr/funary/pyfn.cc#L66-L93

We need to see how to implement that with the new FExprs.

tests/munging/test-delete.py Outdated Show resolved Hide resolved
tests/munging/test-dt-rows.py Outdated Show resolved Hide resolved
tests/test-import-all.py Outdated Show resolved Hide resolved
@oleksiyskononenko
Copy link
Contributor

So I propose we keep dt.isna() as a main function, and for others (dt.math.isna() and f.isna()) say they are "the same as dt.isna()".

@oleksiyskononenko
Copy link
Contributor

Not sure how you merge main, but it seems that you're just adding more commits on top of this branch. What you need to do is

git checkout main
git pull
git checkout samukweku/fexpr_isna
git merge main

This will create only one merge commit incorporating all the changes from main.

@samukweku
Copy link
Contributor Author

i'm using the code from the guide:

git checkout main
git fetch upstream
git merge upstream/main
git checkout samukweku/fexpr_isna
git merge main

@oleksiyskononenko
Copy link
Contributor

I guess the guide explains how to start a new branch and not how to merge existing changes onto the existing branch.

@samukweku
Copy link
Contributor Author

@oleksiyskononenko I have made the changes. Do let me know if there are other parts that need fixing. thanks

@oleksiyskononenko oleksiyskononenko changed the title [ENH] Implement FExpr for ISNA Convert dt.isna() to FExpr May 4, 2023
src/core/expr/fnary/rowcount.cc Show resolved Hide resolved
src/core/column/isna.h Outdated Show resolved Hide resolved
@@ -74,8 +74,10 @@ def test_isna_joined():
dt.internal.frame_integrity_check(RES)
assert RES.to_list() == [[True, True, False, True, False]] * 4


@pytest.mark.xfail(reason="scalar NA check not supported")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems all the unary functions can take a python scalar as the argument: https://github.com/h2oai/datatable/blob/main/src/core/expr/funary/pyfn.cc#L66-L93

We need to see how to implement that with the new FExprs.

@samukweku
Copy link
Contributor Author

@oleksiyskononenko made edits to cover for non FExprs for isna. kindly have a look when you can.

@samukweku
Copy link
Contributor Author

@oleksiyskononenko just checking in on your feedback

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants