feat: add support for emit in aggregate relations #122

Open
wants to merge 4 commits into
base: main

EpsilonPrime
Member

@EpsilonPrime EpsilonPrime commented Sep 12, 2024

This adds support for emit in aggregations where there is no more than one grouping section.

This addresses #121.

Comment on lines +75 to +79
if (symbol->subtype.type() == typeid(::substrait::proto::Rel::RelTypeCase) &&
    ANY_CAST(::substrait::proto::Rel::RelTypeCase, symbol->subtype) ==
        ::substrait::proto::Rel::kAggregate) {
  return true;
}
Contributor

For pattern matching like this, I recommend adding a conditional cast macro:

// Fully parenthesized so the expansion composes with == and other operators.
#define ANY_CAST_IF(ValueType, value) \
  ((value).type() != typeid(ValueType) \
       ? ::std::nullopt \
       : ::std::make_optional(ANY_CAST(ValueType, (value))))

Then you can write

Suggested change
- if (symbol->subtype.type() == typeid(::substrait::proto::Rel::RelTypeCase) &&
-     ANY_CAST(::substrait::proto::Rel::RelTypeCase, symbol->subtype) ==
-         ::substrait::proto::Rel::kAggregate) {
-   return true;
- }
+ if (auto type_case = ANY_CAST_IF(::substrait::proto::Rel::RelTypeCase, symbol->subtype)) {
+   return type_case == ::substrait::proto::Rel::kAggregate;
+ }

If you want to be even fancier, I can provide a match() helper:

bool isAggregate(const SymbolInfo* symbol) {
  return match(symbol->subtype,
    [](::substrait::proto::Rel::RelTypeCase type_case) {
      return type_case == ::substrait::proto::Rel::kAggregate;
    },
    [](RelationType rel_type) {
      return rel_type == RelationType::kAggregate;
    },
    [](...) {
      return false;
    });
}
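
For reference, one way such a match() helper could look. This is only a sketch under assumptions: it dispatches on the exact type stored in a std::any, treats a `[](...)` handler as the catch-all, and requires every handler to return the same type. The names `detail::ArgOf`, `RelTypeCase` (a stand-in for ::substrait::proto::Rel::RelTypeCase), and the stand-in `RelationType` enum are hypothetical and exist only to keep the example self-contained.

```cpp
#include <any>
#include <type_traits>
#include <typeinfo>
#include <utility>

namespace detail {
// Extracts the parameter type of a single-argument, non-generic lambda.
template <typename T>
struct ArgOf;
template <typename C, typename R, typename A>
struct ArgOf<R (C::*)(A) const> {
  using type = std::decay_t<A>;
};
// A C-variadic lambda such as [](...) is the catch-all; tag it with void.
template <typename C, typename R>
struct ArgOf<R (C::*)(...) const> {
  using type = void;
};
template <typename F>
using ArgOfT = typename ArgOf<decltype(&std::decay_t<F>::operator())>::type;
}  // namespace detail

// Tries each handler in order and calls the first one whose parameter type
// equals the type stored in `value`. All handlers must return the same type.
template <typename F, typename... Rest>
auto match(const std::any& value, F&& first, Rest&&... rest) {
  using Arg = detail::ArgOfT<F>;
  if constexpr (std::is_void_v<Arg>) {
    // Catch-all: call with no arguments so the std::any is never passed
    // through the C-variadic ellipsis.
    return first();
  } else {
    if (value.type() == typeid(Arg)) {
      return first(std::any_cast<Arg>(value));
    }
    if constexpr (sizeof...(rest) > 0) {
      return match(value, std::forward<Rest>(rest)...);
    } else {
      // No handler matched and no catch-all was given: value-initialize.
      using Result = decltype(first(std::declval<Arg>()));
      return Result{};
    }
  }
}

// Hypothetical stand-ins for the two enums that `subtype` may hold.
enum class RelTypeCase { kRead, kAggregate };
enum class RelationType { kUnknown, kAggregate };

bool isAggregate(const std::any& subtype) {
  return match(
      subtype,
      [](RelTypeCase type_case) { return type_case == RelTypeCase::kAggregate; },
      [](RelationType rel_type) { return rel_type == RelationType::kAggregate; },
      [](...) { return false; });
}
```

Because dispatch compares against typeid of the handler's decayed parameter type, a handler taking `const T&` matches a stored `T`, but there is no implicit conversion between types.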

Comment on lines +227 to +243
if (isAggregate(currentScope) &&
    fieldReference < relationData->generatedFieldReferences.size()) {
  symbol = relationData->generatedFieldReferences[fieldReference];
} else if (fieldReference < fieldReferencesSize) {
  symbol = relationData->fieldReferences[fieldReference];
} else if (
    fieldReference <
    fieldReferencesSize + relationData->generatedFieldReferences.size()) {
  symbol =
      relationData
          ->generatedFieldReferences[fieldReference - fieldReferencesSize];
} else {
  errorListener_->addError(
      "Encountered field reference out of range: " +
      std::to_string(fieldReference));
  return "field#" + std::to_string(fieldReference);
}
Contributor

This seems odd to me because I'd expect a clean outer branch on the type of currentScope before we look at the value of fieldReference. If currentScope is an aggregate but fieldReference is out of range, is it really reasonable to use the same code we would have used if currentScope were not an aggregate? I'd expect something like:

Suggested change
- if (isAggregate(currentScope) &&
-     fieldReference < relationData->generatedFieldReferences.size()) {
-   symbol = relationData->generatedFieldReferences[fieldReference];
- } else if (fieldReference < fieldReferencesSize) {
-   symbol = relationData->fieldReferences[fieldReference];
- } else if (
-     fieldReference <
-     fieldReferencesSize + relationData->generatedFieldReferences.size()) {
-   symbol =
-       relationData
-           ->generatedFieldReferences[fieldReference - fieldReferencesSize];
- } else {
-   errorListener_->addError(
-       "Encountered field reference out of range: " +
-       std::to_string(fieldReference));
-   return "field#" + std::to_string(fieldReference);
- }
+ if (isAggregate(currentScope)) {
+   if (fieldReference < relationData->generatedFieldReferences.size()) {
+     symbol = relationData->generatedFieldReferences[fieldReference];
+   }
+ } else {
+   auto size = relationData->fieldReferences.size();
+   if (fieldReference < size) {
+     symbol = relationData->fieldReferences[fieldReference];
+   } else if (
+       fieldReference <
+       size + relationData->generatedFieldReferences.size()) {
+     symbol =
+         relationData
+             ->generatedFieldReferences[fieldReference - size];
+   }
+ }
+ if (symbol == nullptr) {
+   errorListener_->addError(
+       "Encountered field reference out of range: " +
+       std::to_string(fieldReference));
+   return "field#" + std::to_string(fieldReference);
+ }

In any case, please clarify the branching here.
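
Another option is to pull the lookup into a small function so the caller reports the error in exactly one place. A rough sketch, with assumptions: `resolveFieldReference` is a hypothetical name, `SymbolInfo` and `RelationData` are reduced stand-ins for the real types, and the scope check is passed in as a plain bool rather than calling isAggregate().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, reduced stand-ins for the project's types.
struct SymbolInfo {
  int id;
};
struct RelationData {
  std::vector<const SymbolInfo*> fieldReferences;
  std::vector<const SymbolInfo*> generatedFieldReferences;
};

// Returns the symbol for `fieldReference`, or nullptr when it is out of range.
// Aggregate scopes index only the generated fields; every other scope indexes
// the incoming fields first and the generated fields after them.
const SymbolInfo* resolveFieldReference(
    const RelationData& data,
    bool isAggregateScope,
    std::size_t fieldReference) {
  const auto& generated = data.generatedFieldReferences;
  if (isAggregateScope) {
    return fieldReference < generated.size() ? generated[fieldReference]
                                             : nullptr;
  }
  const auto& fields = data.fieldReferences;
  if (fieldReference < fields.size()) {
    return fields[fieldReference];
  }
  const std::size_t offset = fieldReference - fields.size();
  return offset < generated.size() ? generated[offset] : nullptr;
}
```

The caller would then check for nullptr once and emit the "out of range" error there, which keeps an out-of-range aggregate reference from falling through to the non-aggregate branches.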

// TODO -- Add support for grouping fields (needs text syntax).
errorListener_->addError(
    "Asked to emit a field (" + std::to_string(item) +
    " beyond what the aggregate produced.");
Contributor

Suggested change
-     " beyond what the aggregate produced.");
+     ") beyond what the aggregate produced.");

Comment on lines +376 to +377
return symbol->subtype.type() == typeid(RelationType) &&
ANY_CAST(RelationType, symbol->subtype) == RelationType::kAggregate;
Contributor

Suggested change
- return symbol->subtype.type() == typeid(RelationType) &&
-     ANY_CAST(RelationType, symbol->subtype) == RelationType::kAggregate;
+ return ANY_CAST_IF(RelationType, symbol->subtype) == RelationType::kAggregate;
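
Note that this one-line form only behaves as intended if ANY_CAST_IF expands to a fully parenthesized expression; without the outer parentheses, `==` would bind to the last operand of the conditional instead of the whole result. A minimal self-contained sketch, assuming ANY_CAST simply wraps std::any_cast (the project's actual macro may differ) and using a stand-in `RelationType` enum; `holdsAggregate` is a hypothetical name:

```cpp
#include <any>
#include <optional>
#include <typeinfo>

// Assumption: ANY_CAST wraps std::any_cast; the real project macro may differ.
#define ANY_CAST(ValueType, value) ::std::any_cast<ValueType>(value)

// Yields an empty optional when `value` does not hold exactly ValueType.
// The outer parentheses let the expansion be used as an operand of ==.
#define ANY_CAST_IF(ValueType, value)  \
  ((value).type() != typeid(ValueType) \
       ? ::std::nullopt                \
       : ::std::make_optional(ANY_CAST(ValueType, (value))))

// Hypothetical stand-in for the project's RelationType enum.
enum class RelationType { kUnknown, kRead, kAggregate };

// True only when `subtype` holds RelationType::kAggregate. An empty optional
// never compares equal to a value, so a type mismatch yields false.
bool holdsAggregate(const std::any& subtype) {
  return ANY_CAST_IF(RelationType, subtype) == RelationType::kAggregate;
}
```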
