
Resolves #3013: Support UDF #2995

Open
wants to merge 57 commits into
base: main

Conversation

pengpeng-lu
Contributor

@pengpeng-lu pengpeng-lu commented Dec 5, 2024

No description provided.

This reverts commit 3a5a44b.
@@ -38,7 +38,6 @@
import com.apple.foundationdb.record.metadata.expressions.InvertibleFunctionKeyExpression;
import com.apple.foundationdb.record.metadata.expressions.TupleFieldsHelper;
import com.apple.foundationdb.record.planprotos.PComparison;
import com.apple.foundationdb.record.planprotos.PComparison.PComparisonType;
Contributor Author

just changing package in this file

Contributor

It looks like you just removed the import and then had to add package qualifiers to all the references. Unless I am getting it wrong, please undo as the code was more concise.

Contributor Author

fixed.

@@ -40,7 +40,6 @@
import com.apple.foundationdb.record.planprotos.PIndexKeyValueToPartialRecord.PFieldCopier;
import com.apple.foundationdb.record.planprotos.PIndexKeyValueToPartialRecord.PFieldWithValueCopier;
import com.apple.foundationdb.record.planprotos.PIndexKeyValueToPartialRecord.PMessageCopier;
import com.apple.foundationdb.record.planprotos.PIndexKeyValueToPartialRecord.PTupleSource;
Contributor Author

just changing package.

Contributor

Same here. Please undo!

Contributor Author

done

@@ -63,7 +63,7 @@ public abstract class BuiltInFunction<T extends Typed> {
* @param parameterTypes The type of the parameter(s).
* @param encapsulationFunction An encapsulation of the function's runtime computation.
*/
- protected BuiltInFunction(@Nonnull final String functionName, @Nonnull final List<Type> parameterTypes, @Nonnull final EncapsulationFunction<T> encapsulationFunction) {
+ public BuiltInFunction(@Nonnull final String functionName, @Nonnull final List<Type> parameterTypes, @Nonnull final EncapsulationFunction<T> encapsulationFunction) {
Contributor Author

To let other repos use this constructor.

Contributor

This constructor can only be called from subclasses, as this class is abstract. Every subclass is allowed to call this constructor, regardless of whether it lives downstream or in this module. Please undo unless I misunderstood.
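
For illustration, a minimal sketch of the access rule described above. The class names are hypothetical stand-ins, not the real `BuiltInFunction`, and the parameter types are simplified to strings. A single file cannot show two packages, but Java allows a subclass in any package to reach a `protected` superclass constructor through `super(...)`.

```java
import java.util.List;

// Hypothetical, simplified stand-in for the abstract class under review.
abstract class SketchFunction {
    private final String functionName;
    private final List<String> parameterTypes;

    // protected is sufficient: an abstract class cannot be instantiated directly,
    // so this constructor is only ever reached from a subclass via super(...).
    protected SketchFunction(String functionName, List<String> parameterTypes) {
        this.functionName = functionName;
        this.parameterTypes = List.copyOf(parameterTypes);
    }

    String getFunctionName() {
        return functionName;
    }

    List<String> getParameterTypes() {
        return parameterTypes;
    }
}

// A subclass; it could equally live in a downstream module or another package.
class SketchAddFunction extends SketchFunction {
    SketchAddFunction() {
        super("add", List.of("LONG", "LONG"));
    }
}
```

Writing `new SketchFunction(...)` directly does not compile whether the constructor is `protected` or `public`, so widening the visibility gains nothing here.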

Contributor Author

Problem solved after MacroFunction was introduced.

@@ -21,7 +21,7 @@
package com.apple.foundationdb.record.query.plan.cascades;

import com.apple.foundationdb.record.RecordCoreException;
- import com.apple.foundationdb.record.RecordMetaDataProto;
+ import com.apple.foundationdb.record.RecordKeyExpressionProto;
Contributor Author

just changing package.

@@ -20,7 +20,7 @@

package com.apple.foundationdb.record.query.plan.cascades;

- import com.apple.foundationdb.record.RecordMetaDataProto;
+ import com.apple.foundationdb.record.RecordKeyExpressionProto;
Contributor Author

just changing package.

@@ -22,6 +22,7 @@

import com.apple.foundationdb.async.RankedSet;
import com.apple.foundationdb.record.RecordCoreException;
import com.apple.foundationdb.record.RecordKeyExpressionProto;
Contributor Author

Just changing the package in this folder (metadata).

@@ -20,7 +20,7 @@

package com.apple.foundationdb.record.metadata.expressions;

- import com.apple.foundationdb.record.RecordMetaDataProto;
+ import com.apple.foundationdb.record.RecordKeyExpressionProto;
Contributor Author

just changing package in this file

@@ -26,6 +26,7 @@
import com.apple.foundationdb.record.RecordMetaData;
import com.apple.foundationdb.record.RecordMetaDataBuilder;
import com.apple.foundationdb.record.RecordMetaDataOptionsProto;
import com.apple.foundationdb.record.RecordKeyExpressionProto;
Contributor Author

just changing package from here on.

@pengpeng-lu pengpeng-lu changed the title Support UDF Resolves #3013: Support UDF Jan 3, 2025
Contributor

@normen662 normen662 left a comment

Hi, I left a number of comments. I am unable to find any logic that calls MacroFunctionValue.call(), meaning that the code is untested at this level. Please write test cases to test the expansion.

@@ -222,6 +229,11 @@ private void loadProtoExceptRecords(@Nonnull RecordMetaDataProto.MetaData metaDa
typeBuilder.setRecordTypeKey(LiteralKeyExpression.fromProtoValue(typeProto.getExplicitKey()));
}
}
PlanSerializationContext serializationContext = new PlanSerializationContext(DefaultPlanSerializationRegistry.INSTANCE,
PlanHashable.CURRENT_FOR_CONTINUATION);
Contributor

I think this is fine.

udfMap.put(udf.getFunctionName(), udf);
}

public void addUdfs(@Nonnull Collection<Udf> udfs) {
Contributor

Make this an Iterable<? extends Udf>. That gives you the most generic type you actually need.
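
A minimal sketch of what the wider parameter type buys, assuming simplified stand-in classes. `UdfSketch`, `ScalarUdfSketch`, and `UdfRegistrySketch` are hypothetical names, and the real `Udf` carries a `Value` rather than just a name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Udf class in the PR.
class UdfSketch {
    private final String functionName;

    UdfSketch(String functionName) {
        this.functionName = functionName;
    }

    String getFunctionName() {
        return functionName;
    }
}

// Hypothetical subtype, used to show that collections of subtypes are accepted.
class ScalarUdfSketch extends UdfSketch {
    ScalarUdfSketch(String functionName) {
        super(functionName);
    }
}

class UdfRegistrySketch {
    private final Map<String, UdfSketch> udfMap = new HashMap<>();

    void addUdf(UdfSketch udf) {
        udfMap.put(udf.getFunctionName(), udf);
    }

    // Iterable<? extends UdfSketch> accepts a List, a Set, or any other Iterable
    // whose element type is UdfSketch or one of its subtypes.
    void addUdfs(Iterable<? extends UdfSketch> udfs) {
        for (UdfSketch udf : udfs) {
            addUdf(udf);
        }
    }

    int size() {
        return udfMap.size();
    }
}
```

With `Collection<Udf>` as the parameter type, a `List<ScalarUdfSketch>` would be rejected by the compiler because generics are invariant; the bounded wildcard removes that restriction while the method body stays the same.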

Contributor Author

done.

@@ -22,7 +22,7 @@

import com.apple.foundationdb.annotation.API;
import com.apple.foundationdb.annotation.SpotBugsSuppressWarnings;
- import com.apple.foundationdb.record.RecordMetaDataProto;
+ import com.apple.foundationdb.record.RecordKeyExpressionProto;
import com.apple.foundationdb.record.logging.LogMessageKeys;
Contributor

Can you ask Alec to review the proto stuff as well?

*
* This source file is part of the FoundationDB open source project
*
* Copyright 2015-2024 Apple Inc. and the FoundationDB project authors
Contributor

Suggested change
- * Copyright 2015-2024 Apple Inc. and the FoundationDB project authors
+ * Copyright 2015-2025 Apple Inc. and the FoundationDB project authors

Contributor Author

done

PlanSerializationContext serializationContext = new PlanSerializationContext(DefaultPlanSerializationRegistry.INSTANCE,
PlanHashable.CURRENT_FOR_CONTINUATION);
for (RecordMetaDataProto.Udf udf: metaDataProto.getUdfsList()) {
udfMap.put(udf.getFunctionName(), new Udf(udf.getFunctionName(), Value.fromValueProto(serializationContext, udf.getFunctionValue())));
Contributor

Your Udf class should get a fromProto(...) static method that you hand the proto to make an instance of Udf from it. That's more idiomatic.
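
A rough sketch of the static-factory shape suggested here. `UdfProtoSketch` is a hypothetical stand-in for the generated `RecordMetaDataProto.Udf` message, modeling only the two accessors visible in the diff, and the function value is simplified to a `String`. The real factory would presumably also take the `PlanSerializationContext` needed by `Value.fromValueProto`; that is an assumption.

```java
// Hypothetical stand-in for the generated protobuf message.
class UdfProtoSketch {
    private final String functionName;
    private final String functionValue;

    UdfProtoSketch(String functionName, String functionValue) {
        this.functionName = functionName;
        this.functionValue = functionValue;
    }

    String getFunctionName() {
        return functionName;
    }

    String getFunctionValue() {
        return functionValue;
    }
}

// Hypothetical stand-in for the Udf class, with the value simplified to a String.
class UdfFromProtoSketch {
    private final String functionName;
    private final String functionValue;

    UdfFromProtoSketch(String functionName, String functionValue) {
        this.functionName = functionName;
        this.functionValue = functionValue;
    }

    // Static factory: the knowledge of how to read the proto stays inside the
    // class instead of being repeated at every call site.
    static UdfFromProtoSketch fromProto(UdfProtoSketch proto) {
        return new UdfFromProtoSketch(proto.getFunctionName(), proto.getFunctionValue());
    }

    String getFunctionName() {
        return functionName;
    }

    String getFunctionValue() {
        return functionValue;
    }
}
```

The loading loop then reduces to something like `udfMap.put(udf.getFunctionName(), UdfFromProtoSketch.fromProto(udf))`.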

Contributor Author

added

*
* This source file is part of the FoundationDB open source project
*
* Copyright 2015-2024 Apple Inc. and the FoundationDB project authors
Contributor

Suggested change
- * Copyright 2015-2024 Apple Inc. and the FoundationDB project authors
+ * Copyright 2015-2025 Apple Inc. and the FoundationDB project authors

Contributor Author

done

public class MacroFunctionValue extends AbstractValue {

@Nonnull
private final List<Value> argList;
Contributor

The style (in the planner) is to name a list, unless that would be ambiguous, with just the plural of the things it holds, and to add their type to the name. Also, avoid contractions and abbreviations. So this one should be something like argumentValues.

Contributor Author

this one becomes "parameterTypes" (same as BuiltInFunction), and "parameterIdentifiers" in the new MacroFunction class.

private final List<Value> argList;

@Nonnull
private final Value body;
Contributor

This one should be bodyValue

Contributor Author

done

@Nonnull
private final Value body;

private MacroFunctionValue(@Nonnull final List<Value> argList, @Nonnull final Value underlying) {
Contributor

Please make the parameter names line up with the field names. Also, lists are defensively copied into an ImmutableList using copyOf(), which does not make a copy if the list handed in is already an ImmutableList.
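
A small sketch of the defensive-copy pattern from this comment. To stay dependency-free it uses the JDK's `List.copyOf` (Java 10+) in place of Guava's `ImmutableList.copyOf`; that swap, the class name `MacroArgumentsSketch`, and the `String` element types are assumptions made for illustration.

```java
import java.util.List;

// Hypothetical stand-in for the constructor under review; values are simplified to Strings.
class MacroArgumentsSketch {
    private final List<String> argumentValues;
    private final String bodyValue;

    // Parameter names line up with the field names they initialize.
    MacroArgumentsSketch(List<String> argumentValues, String bodyValue) {
        // Take an unmodifiable snapshot so later changes to the caller's list
        // cannot leak into this object.
        this.argumentValues = List.copyOf(argumentValues);
        this.bodyValue = bodyValue;
    }

    List<String> getArgumentValues() {
        return argumentValues;
    }

    String getBodyValue() {
        return bodyValue;
    }
}
```

The caller can keep mutating its own list afterwards without affecting the stored arguments, and the stored list itself rejects modification.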

Contributor Author

Done


option java_outer_classname = "RecordQueryRuntimeProto";
option java_multiple_files = true;

message PEnumLightValue {
Contributor

I am sure there is a reason, but why did this move back to the plan protos?

Contributor Author

Because some "PValue" messages depend on "PComparableObject". If we keep "PComparableObject" in record_query_runtime.proto, then:
record_query_plan.proto depends on record_query_runtime.proto,
record_query_runtime.proto depends on record_metadata.proto,
record_metadata.proto depends on record_query_plan.proto (because Udf uses PValue),
so there is a circular dependency.

So I moved everything related to "PComparableObject" into record_query_plan.proto.
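
The dependency chain above can be checked mechanically. The following is a minimal sketch, not part of the PR: a depth-first search over a map from each proto file to the files it imports. The class name `ProtoImportCycleSketch` is hypothetical, and the edges a caller passes in are whatever the caller believes the imports to be.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ProtoImportCycleSketch {
    // Returns true if the import graph contains a cycle.
    static boolean hasCycle(Map<String, List<String>> imports) {
        Set<String> onPath = new HashSet<>();
        Set<String> done = new HashSet<>();
        for (String file : imports.keySet()) {
            if (visit(file, imports, onPath, done)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String file, Map<String, List<String>> imports,
                                 Set<String> onPath, Set<String> done) {
        if (onPath.contains(file)) {
            return true; // reached a file that is already on the current path
        }
        if (done.contains(file)) {
            return false; // fully explored earlier, no cycle through it
        }
        onPath.add(file);
        for (String dependency : imports.getOrDefault(file, List.of())) {
            if (visit(dependency, imports, onPath, done)) {
                return true;
            }
        }
        onPath.remove(file);
        done.add(file);
        return false;
    }
}
```

Feeding it the three edges listed in the comment reports a cycle. Dropping the edge from record_query_plan.proto to record_query_runtime.proto, which is the assumed effect of the move, makes the graph acyclic.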

* The function takes in a list of arguments as QuantifiedObjectValue, of which alias = uniqueId, resultType = argument type.
* and a body value, executing the function is by replacing the QuantifiedObjectValue in the body with the actual argument.
*/
public class MacroFunctionValue extends AbstractValue {
Contributor

@normen662 normen662 Jan 9, 2025

This is a comment I wrote after I actually already had submitted the review. Something did not sit right here.

I think you do need to have some sort of helper structure to hold the argument list and the body of the macro, but it does not need to be, and actually should not be, a Value itself. This Value is never a real Value that exists at runtime that you can evaluate, etc. In concept this class is exactly like a BuiltinFunction except:

  1. BuiltinFunctions cannot be serialized/deserialized
  2. Macros like what you have in mind are not builtin.

What can be done with relative ease is to create a class Function that gets everything that is currently in BuiltinFunction. Then make an empty BuiltinFunction that extends from it and represents the superclass for all actually builtin functions. Then also have a class, let's call it MacroFunction also extend Function. MacroFunction adds protobuf serializability. Notice that the encapsulationFunction in Function is exactly what you named call(...).

MacroFunction is in effect a factory for Values. MacroFunction is the thing that you need to keep in the udf map you have in the metadata. Since MacroFunction still needs to know the body and that's a Value all of the refactoring you did for the protos is still very much needed.
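
To make the proposed shape concrete, here is a rough sketch under heavy simplification. All names ending in `Sketch` and the tiny `SketchValue` tree are hypothetical stand-ins for the planner classes; `SketchParameter` plays the role of QuantifiedObjectValue, and protobuf serialization is left out entirely.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal expression tree standing in for the planner's Value class (hypothetical).
interface SketchValue {
    // Returns this value with parameter placeholders replaced by their bound arguments.
    SketchValue replace(Map<String, SketchValue> bindings);

    String render();
}

final class SketchLiteral implements SketchValue {
    private final String text;

    SketchLiteral(String text) { this.text = text; }

    public SketchValue replace(Map<String, SketchValue> bindings) { return this; }

    public String render() { return text; }
}

// Plays the role of QuantifiedObjectValue: a placeholder with a unique identifier.
final class SketchParameter implements SketchValue {
    private final String identifier;

    SketchParameter(String identifier) { this.identifier = identifier; }

    public SketchValue replace(Map<String, SketchValue> bindings) {
        return bindings.getOrDefault(identifier, this);
    }

    public String render() { return "$" + identifier; }
}

final class SketchSum implements SketchValue {
    private final SketchValue left;
    private final SketchValue right;

    SketchSum(SketchValue left, SketchValue right) { this.left = left; this.right = right; }

    public SketchValue replace(Map<String, SketchValue> bindings) {
        return new SketchSum(left.replace(bindings), right.replace(bindings));
    }

    public String render() { return "(" + left.render() + " + " + right.render() + ")"; }
}

// Holds what currently lives in BuiltInFunction (the comment calls this class Function).
abstract class CatalogFunctionSketch {
    private final String functionName;

    protected CatalogFunctionSketch(String functionName) { this.functionName = functionName; }

    String getFunctionName() { return functionName; }

    // Corresponds to the encapsulationFunction, which the PR named call(...).
    abstract SketchValue encapsulate(List<SketchValue> arguments);
}

// Empty superclass for functions that really are built in.
abstract class BuiltInFunctionSketch extends CatalogFunctionSketch {
    protected BuiltInFunctionSketch(String functionName) { super(functionName); }
}

// A macro is a factory for values, not a value itself.
final class MacroFunctionSketch extends CatalogFunctionSketch {
    private final List<String> parameterIdentifiers;
    private final SketchValue bodyValue;

    MacroFunctionSketch(String functionName, List<String> parameterIdentifiers, SketchValue bodyValue) {
        super(functionName);
        this.parameterIdentifiers = List.copyOf(parameterIdentifiers);
        this.bodyValue = bodyValue;
    }

    @Override
    SketchValue encapsulate(List<SketchValue> arguments) {
        if (arguments.size() != parameterIdentifiers.size()) {
            throw new IllegalArgumentException("expected " + parameterIdentifiers.size() + " arguments");
        }
        Map<String, SketchValue> bindings = new HashMap<>();
        for (int i = 0; i < arguments.size(); i++) {
            bindings.put(parameterIdentifiers.get(i), arguments.get(i));
        }
        return bodyValue.replace(bindings);
    }
}
```

Expanding a macro leaves the stored body untouched and returns a fresh tree, which is also the kind of behavior the requested expansion tests could assert on.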

@foundationdb-ci
Contributor

Result of fdb-record-layer-pr on Linux CentOS 7

  • Commit ID: d058c09
  • Duration 0:12:33
  • Result: ❌ FAILED
  • Error: Error while executing command: ./gradlew --no-daemon --console=plain -b ./build.gradle build destructiveTest -PcoreNotStrict -PreleaseBuild=false -PpublishBuild=false -PspotbugsEnableHtmlReport. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of fdb-record-layer-pr on Linux CentOS 7

  • Commit ID: e11b315
  • Duration 0:13:01
  • Result: ❌ FAILED
  • Error: Error while executing command: ./gradlew --no-daemon --console=plain -b ./build.gradle build destructiveTest -PcoreNotStrict -PreleaseBuild=false -PpublishBuild=false -PspotbugsEnableHtmlReport. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@pengpeng-lu pengpeng-lu requested a review from normen662 January 13, 2025 07:48
@foundationdb-ci
Contributor

Result of fdb-record-layer-pr on Linux CentOS 7

  • Commit ID: 29b9cb2
  • Duration 0:46:10
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Contributor

@alecgrieser alecgrieser left a comment

I focused on the protobuf changes in this review, and they mostly look good to me, with a few suggestions. I could take a closer look at the rest of the PR if desired.

*/
syntax = "proto2";

package com.apple.foundationdb.record;
Contributor

Hm, there's a bit of a decision to be made here. By leaving the package as com.apple.foundationdb.record, that means that this file shares a package with record_metadata.proto (etc). That might not be a bad thing, but I could also see us wanting to update this to something unique like com.apple.foundationdb.record.expressions. I don't have too strong an opinion actually, and maybe leaving it scoped to the same level as the other files is fine. It certainly decreases the diff.

Contributor Author

Yeah, I think it makes sense to use the package com.apple.foundationdb.record.expressions; I updated the PR. Luckily, it doesn't add much to the diff :)

@@ -97,99 +99,12 @@ message MetaData {
optional bool uses_subspace_key_counter = 11;
repeated JoinedRecordType joined_record_types = 12;
repeated UnnestedRecordType unnested_record_types = 13;
repeated Udf udfs = 14;
Contributor

I'm trying to think a little bit about how we'd evolve this forward in the future. The feature implemented here is for scalar valued functions, which works pretty well with our Value abstraction. If we wanted to implement table-valued functions I think we'd need slightly different machinery, and I'm trying to think if we'd want to have one "UDF" message that had a oneof over the different function types, or if we'd want to have different ScalarValuedFunction and TableValuedFunction messages at this level.

My inclination is that you'd want them to be separate, which makes me think that we'd want to call the Udf message here something like ScalarValuedFunction in anticipation of some future function types.

Contributor Author

@pengpeng-lu pengpeng-lu Jan 14, 2025

Great point! I agree, I changed the Udf to be ScalarValuedFunction.

}
}

message PMacroFunctionValue {
optional string function_name = 1;
repeated PValue arguments = 2;
Contributor

Should this be a repeated string instead of a repeated PValue? Does using a PValue here mean we record type signature information, and that's what we want to preserve?

Contributor Author

The arguments are preserved as QuantifiedObjectValue instances, each of which holds a Type resultType and a unique identifier (a UUID).

@foundationdb-ci
Contributor

Result of fdb-record-layer-pr on Linux CentOS 7

  • Commit ID: 52b3ba4
  • Duration 0:47:56
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of fdb-record-layer-pr on Linux CentOS 7

  • Commit ID: 91e6117
  • Duration 0:07:51
  • Result: ❌ FAILED
  • Error: Error while executing command: ./gradlew --no-daemon --console=plain -b ./build.gradle build destructiveTest -PcoreNotStrict -PreleaseBuild=false -PpublishBuild=false -PspotbugsEnableHtmlReport. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of fdb-record-layer-pr on Linux CentOS 7

  • Commit ID: 262f03b
  • Duration 0:47:16
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

4 participants