Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add visitor for converting kernel expressions to engine expressions #363

Merged
merged 87 commits into from
Oct 28, 2024

Conversation

OussamaSaoudi-db
Copy link
Collaborator

@OussamaSaoudi-db OussamaSaoudi-db commented Sep 27, 2024

Rationale for the change

Engines may need to apply expressions that originate in the kernel. This can include filter predicates such as filters for protocol and metadata actions in a log file. Another example is logical to physical transformations that the engine must apply, for instance to materialize partition columns. This PR introduces a visitor framework for kernel expressions to be converted to engine expressions so that engines can apply their own evaluators, predicate pushdowns, and optimizations on these expressions.

Closes: #358

What changes are included in this PR?

5 major changes are made in this PR:

  1. A new module ffi/src/expressions/kernel.rs is added. This module is responsible for converting kernel Expressions to the engine's representation of expressions through an ffi-compatible visitor. The visitor supports binary operations, scalars, struct expressions, variadic and unary operators, and columns. All the engine has to do is implement an EngineExpressionVisitor to support the conversion.
  2. The EnginePredicate to kernel Expressionvisitor code is moved to ffi/src/expressions/engine.rs. This is done to keep all ffi expression code under the ffi/src/expressions module.
  3. This PR also allows engines to get SharedExpressions, which are handles to the kernel's Expression type. These are given back to the kernel to specify which expression the engine would like to visit.
  4. A new ffi example is added to showcase how the EngineExpressionVisitor can be used to construct expressions. This is a C implementation that can be found in ffi/examples/visit-expression/. The example only prints an expression that it receives from the kernel and asserts that it matches an expected output.
  5. A new testing feature flag test-ffi is added to the ffi crate. This feature flag is used to expose testing functions found in ffi/src/test_ffi. This module is currently used to return a kernel expression get_testing_kernel_expression used in the example c implementation.

Are these changes tested?

Two tests are added:

  • I introduced a new test to validate the expression visitor functions correctly and converts data as expected. This simply prints out the expression tree structure and compares it to an expected result. The kernel expression can be found in ffi/src/test_ffi.rs, and the expected output is in ffi/tests/test-expression-visitor/expected.txt
  • There is also a test that checks for memory leaks in the expression visitor using valgrind.

Are there any user-facing changes?

This PR introduces a public facing API for constructing expressions and visiting them. It also adds delta_kernel_derive as a public export of kernel crate. This PR does not break existing APIs.

Copy link

codecov bot commented Sep 27, 2024

Codecov Report

Attention: Patch coverage is 2.07469% with 236 lines in your changes missing coverage. Please review.

Project coverage is 77.25%. Comparing base (f5d0a42) to head (60049ab).

Files with missing lines Patch % Lines
ffi/src/expressions/kernel.rs 0.00% 160 Missing ⚠️
ffi/src/test_ffi.rs 0.00% 76 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #363      +/-   ##
==========================================
- Coverage   78.94%   77.25%   -1.70%     
==========================================
  Files          52       54       +2     
  Lines       10756    10994     +238     
  Branches    10756    10994     +238     
==========================================
+ Hits         8491     8493       +2     
- Misses       1796     2032     +236     
  Partials      469      469              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ffi/src/expressions.rs Outdated Show resolved Hide resolved
@github-actions github-actions bot added the breaking-change Change that will require a version bump label Oct 9, 2024
@OussamaSaoudi-db OussamaSaoudi-db changed the title [WIP] Add visitor for converting kernel expressions to engine expressions Add visitor for converting kernel expressions to engine expressions Oct 10, 2024
Copy link
Collaborator

@nicklan nicklan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Pretty much there, just a couple of small things.

}
call!(visitor, visit_array_literal, sibling_list_id, child_list_id);
}
fn visit_struct_literal(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hrmm, this is a tricky one. I don't love converting the field names into string scalars and maintaining two lists.

We already have a way of doing iterators, so what if we visit all the values, and then visit_struct_literal could take an iterator over KernelStringSlice that will return field names, and the list id where the engine can find the value for each field.

I don't love that solution either, but it feels more clean than the engine having to unwrap all these string scalars into just strings. The downside is that you're pairing an iterator to a list, so the engine has to deal with potentially two different ways of iterating.

I realize this might all change since we're potentially doing away with/seriously changing StructData, so if you think it's not worth tackling until then I'm okay with this in the short term.

also aside: can we rename this internal function so it doesn't have the same name as the visitor one, it's confusing :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was intended as a quick placeholder from the beginning because StructField would've been better to pass. On top of that we're planning a major rework of scalars/structs/arrays, so I think this is okay as a first draft.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed kernel visitor functions to visit_expression_* 👍

kernel/src/expressions/scalars.rs Show resolved Hide resolved
ffi/tests/test_expression_visitor/run_test.sh Outdated Show resolved Hide resolved
@nicklan
Copy link
Collaborator

nicklan commented Oct 23, 2024

Also, I think the coverage is listed as so low here because the testing happens in c. Could we update our codecov runner to somehow also capture that coverage? cc @zachschuermann

Fine to take that as a follow-up. Please create an issue if that's the plan.

@zachschuermann
Copy link
Collaborator

Also, I think the coverage is listed as so low here because the testing happens in c. Could we update our codecov runner to somehow also capture that coverage? cc @zachschuermann

Fine to take that as a follow-up. Please create an issue if that's the plan.

Yup looks like it - created #425

Copy link
Collaborator

@zachschuermann zachschuermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok whew that was a lot haha - haven't reviewed C in a while.. LGTM but left a few nits/follow-ups/etc.

ffi/src/test_ffi.rs Show resolved Hide resolved
run: |
pushd ffi/examples/read-table
mkdir build
pushd build
cmake ..
make
make test
- name: build and run visit-expression test
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this run in parallel with the first (low prio)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, this is sequential.

ffi/src/expressions/kernel_expr_visitor.rs Outdated Show resolved Hide resolved
ffi/examples/visit-expression/CMakeLists.txt Outdated Show resolved Hide resolved
ffi/examples/visit-expression/expression.h Outdated Show resolved Hide resolved
ffi/examples/visit-expression/expression.h Outdated Show resolved Hide resolved
}

void free_expression_list(ExpressionItemList list);
void free_expression_item(ExpressionItem ref) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember you said you were battling memory leak - did you valgrind to validate it now? Wonder if we should include valgrind tests somehow (maybe just do fastest/easiest thing for now - like docs that say "I ran valgrind with blah blah)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I considered that. There's some platform stuff that I'd have to do I think. For example, macos doesn't have valgrind, it uses leaks. Linux has valgrind. I'll give it a shot to run it through valgrind and leaks depending on os.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: Added a linux-only valgrind test. Macos (and i think windows) don't have valgrind, and I didn't think the cross platform test is necessary.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea just linux I think would certainly suffice :)

ffi/src/expressions/kernel_expr_visitor.rs Outdated Show resolved Hide resolved
ffi/src/expressions/kernel_expr_visitor.rs Outdated Show resolved Hide resolved
Comment on lines +36 to +37
#[cfg(feature = "test-ffi")]
pub mod test_ffi;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also instead of making new feature flag we could just put behind usual test cfg?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into this. There's no easy way to do this as far as I know. cbindgen ignores all code under #[cfg(test)].

One idea is to have a separate build.rs that's exclusively for testing. This can include a test crate using the bulider.

Either way we'll have to rethink how we handle our current ffi tests imo

@zachschuermann
Copy link
Collaborator

Also @OussamaSaoudi-db can we update the PR description? There are a number of changes here and I think it would help if we have a PR desc that says something to the effect of a list (1) moves old expression visitor to engine_expr_visitor (2) adds kernel_expr_visitor mod which defines EngineExpressionVisitor (3) new expression visitor example implementation in C in ffi/examples/visit-expression/ (4) etc. etc.

Copy link
Collaborator

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking really good. A few items to address before merge.

ffi/examples/read-table/expression.h Outdated Show resolved Hide resolved
kernel/src/schema.rs Outdated Show resolved Hide resolved
ffi/src/test_ffi.rs Outdated Show resolved Hide resolved
@@ -20,7 +20,8 @@ pub struct ArrayData {
}

impl ArrayData {
pub fn new(tpe: ArrayType, elements: Vec<Scalar>) -> Self {
pub fn new(tpe: ArrayType, elements: impl IntoIterator<Item = impl Into<Scalar>>) -> Self {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about this idiom, and how it gets annoying for passing empty lists (see above, where vec![] no longer compiles and we have to instead do Vec::<Scalar>::new().

I couldn't find a way to infer a "reasonable" type from [] or vec![], so the only other option is to define an empty method. But that's not great, either.

In the end, it may not matter much, because these new-empty cases are almost entirely confined to test code. But on the other hand most expression-using code is actually test code, and there's an argument for making tests easier to write and maintain.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's definitely annoying 😔 I personally want to see macros to easily put together expressions, struct data, etc. Perhaps such a macro could handle this case. but we're a long way from that point.

ffi/src/test_ffi.rs Outdated Show resolved Hide resolved
/// ([`usize`]) the engine wants, and will be passed back to the engine to identify the list in the
/// future.
///
/// Every expression the kernel visits belongs to some list of "sibling" elements. The schema
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aside: "sibling" has always seemed a confusing/unhelpful name to me (which is bad, given that I named it). Maybe it's better termed an "owner" or "parent" list? (as in, the list that owns me -- different from the child list that I own)?

Open to other ideas, and certainly doesn't need to be addressed in this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it took me a while to understand it. I think parent and owner might help.

Bit of a curiosity question: Any prior work on this style of visitor pattern? I've seen examples of visitor pattern where we simply get back the item. Here we're providing a list to put the item into.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not seen this "double trampoline" pattern previously. It is motivated by kernel's particular constraints when it comes to memory management (engine and kernel have separate memory and they don't mix nicely).

ffi/src/expressions/kernel.rs Outdated Show resolved Hide resolved
ffi/src/expressions/kernel.rs Outdated Show resolved Hide resolved
ffi/src/test_ffi.rs Outdated Show resolved Hide resolved
@@ -75,6 +75,7 @@ pub mod snapshot;
pub mod table;
pub(crate) mod utils;

pub use delta_kernel_derive;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scovich I wanted to flag this. I had to do this to do this because the column_name macro (used by column_expr) uses the macro from delta_kernel_derive.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems ok? But I'll admit I don't know the implications of making that crate public?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a separate crate so I think we can tackle this two ways?

  1. (this approach) we can export it via kernel since it's a dependency of kernel or
  2. we can just import it as a separate crate into FFI crate right?

seems like better separation of concerns to just to (2) but since kernel will always depend on delta_kernel_derive, I could also be okay with option (1).

curious to hear from @nicklan

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems ok? But I'll admit I don't know the implications of making that crate public?

AFAIK the crate is already public, and this is just re-exporting it as a public sub-crate from kernel.

Copy link
Collaborator Author

@OussamaSaoudi-db OussamaSaoudi-db Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Laying out the 2 options was rly helpful!

It feels like it also makes sense to do 1 from a usability standpoint. Suppose delta-rs wants to create columns using column_expr! or column_name!. Without the re-export, they'd get a compilation error coming from the macro, and they'd have to go through a couple layers of macro before finding why it fails. This was my experience using the macros

@OussamaSaoudi-db
Copy link
Collaborator Author

Posting for some visibility: We only do memory leak testing for linux. This is because macos does not support valgrind. It has leaks, but it somehow doesn't play nicely with github actions and times out. Similar problem encountered by others.

I think it's sufficient to just test memleaks on linux.

target_compile_options(visit_expression PRIVATE -Wall -Wextra -Wpedantic -Werror -Wno-strict-prototypes)

# Get info on the OS and platform
if (${CMAKE_SYSTEM_NAME} MATCHES "Darwin")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zachschuermann just wanted to have a second look. Slightly yucky, but it's the way I found to target linux for valgrind. Note that macos does not support valgrind.

@scovich
Copy link
Collaborator

scovich commented Oct 28, 2024

Posting for some visibility: We only do memory leak testing for linux. This is because macos does not support valgrind. It has leaks, but it somehow doesn't play nicely with github actions and times out. Similar problem encountered by others.

I think it's sufficient to just test memleaks on linux.

That seems reasonable -- I don't see why leak behavior should be platform-dependent. A leak is a leak is a leak, most likely.

Copy link
Collaborator

@nicklan nicklan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! thanks

@OussamaSaoudi-db OussamaSaoudi-db merged commit 5e5c299 into delta-io:main Oct 28, 2024
17 checks passed
@OussamaSaoudi-db OussamaSaoudi-db deleted the expresions_visitor2 branch October 28, 2024 19:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change Change that will require a version bump
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add visitor for converting kernel expressions to engine expressions
4 participants