Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

null filling and coalescing for mv agg query #6088

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

null filling and coalescing for mv agg query #6088

wants to merge 12 commits into from

Conversation

pjain1
Copy link
Member

@pjain1 pjain1 commented Nov 13, 2024

These PR adds supports for two features-

  1. fill_missing - This is a metrics view aggregation query level config and when a computed time dimension is used, it fills in the missing time buckets in the response.
  2. treat_nulls_as - This is a measure level config used to configure what value to fill in for missing time buckets. This also works generally as COALESCING over non empty time buckets.

Closes https://github.com/rilldata/rill-private-issues/issues/786

@pjain1 pjain1 marked this pull request as draft November 13, 2024 11:11
@nishantmonu51 nishantmonu51 added blocker A release blocker issue that should be resolved before a new release and removed blocker A release blocker issue that should be resolved before a new release labels Nov 15, 2024
@pjain1 pjain1 changed the title null filling for mv agg query on duckdb null filling and coalescing for mv agg query Nov 25, 2024
@pjain1 pjain1 marked this pull request as ready for review November 25, 2024 07:06
@@ -205,6 +205,7 @@ message MetricsViewSpec {
string format_d3 = 7;
google.protobuf.Struct format_d3_locale = 13;
bool valid_percent_of_total = 6;
string treat_nulls_as = 14; // TODO what should the type, using string values will not work when coalescing numeric cols
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just consider this to be a SQL expression to be templated into the query literally. For example:

treat_nulls_as: 0
treat_nulls_as: CAST(0 AS HUGEINT)
treat_nulls_as: "'Not available'"

@@ -83,7 +91,7 @@ func (c *sqlConnection) QueryContext(ctx context.Context, query string, args []d
context.AfterFunc(ctx, func() {
tctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
r, err := http.NewRequestWithContext(tctx, http.MethodDelete, c.dsn+"/"+dr.Context.SQLQueryID, http.NoBody)
r, err := http.NewRequestWithContext(tctx, http.MethodDelete, urlutil.MustJoinURL(c.dsn, dr.Context[sqlQueryIDContextKey].(string)), http.NoBody)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to avoid the cast here. Could the query ID be stored in *DruidRequest directly so it can be accessed like dr.queryID?

Comment on lines +446 to +450
for _, arg := range args {
if strings.HasPrefix(arg.Name, drivers.DialectDruid.ContextKeyArgPrefix()) {
queryCtx[strings.TrimPrefix(arg.Name, drivers.DialectDruid.ContextKeyArgPrefix())] = arg.Value
continue
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on https://pkg.go.dev/database/sql/driver (try searching for ErrRemoveArgument on this page for details):

NamedValueChecker also allows queries to accept per-query options as a parameter by returning ErrRemoveArgument from CheckNamedValue.

So two observations here:

  1. I think this logic should be adapted to use ErrRemoveArgument
  2. Instead of relying on name prefixes, I think the recommended approach is relying on type checks. E.g. check if v, ok := arg.(*queryContextParam); ok { ... } and expose a function like func QueryContextParam(key string, val any) any { return queryContextParam{...} } for use outside the package.

Comment on lines +658 to +660
if sn == nil {
return n, nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels weird because:

  1. The called function buildSpineSelect does not sound like a function name that would have side effects (and it didn't before, but now it does by creating olapContext).
  2. The early return seems unintuitive – if someone adds more functionality to this function later, they are likely to add it after the spine case handling, and might miss this early return.

@@ -689,7 +699,87 @@ func (a *AST) buildSpineSelect(alias string, spine *Spine, tr *TimeRange) (*Sele
}

if spine.TimeRange != nil {
return nil, errors.New("time_range not yet supported in spine")
if a.dialect == drivers.DialectDruid {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nowhere else in ast.go or astsql.go is there a hard-coded reference to a dialect. It would be very nice to avoid adding that now. Look at the various places that a.dialect are used – it always pushes the dialect-specific handling into the dialect implementation. I think something similar should be possible here.

Comment on lines +1085 to +1104
for _, cjs := range cpy.CrossJoinSelects {
for _, f := range cjs.DimFields {
s.DimFields = append(s.DimFields, FieldNode{
Name: f.Name,
DisplayName: f.DisplayName,
Expr: a.sqlForMember(cpy.Alias, f.Name),
})
}

if len(cjs.UnionAllSelects) > 0 {
// All dimensions will be same across UNION ALL SELECTS so we can just pick the first one
for _, f := range cjs.UnionAllSelects[0].DimFields {
s.DimFields = append(s.DimFields, FieldNode{
Name: f.Name,
DisplayName: f.DisplayName,
Expr: a.sqlForMember(cpy.Alias, f.Name),
})
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this behavior different than the handling of LeftJoinSelects and JoinComparisonSelect? I think this might relate to the previously mentioned issue of not setting FromSelect?

Ideally after the wrap, these can just be set to nil since they are handled in the nested select.

@@ -214,6 +215,10 @@ func (e *Executor) Query(ctx context.Context, qry *Query, executionTime *time.Ti
return nil, err
}

for k, v := range ast.olapContext {
args = append(args, sql2.NamedArg{Name: drivers.DialectDruid.ContextKeyArgPrefix() + k, Value: v})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we avoid a Druid dialect hard-coding here by having something like additionalArgs instead of olapContext?

Then when processing the spine in ast.go, the dialect might either return extra args or a range SELECT query.

@@ -124,6 +124,7 @@ type TimeSpine struct {
Start time.Time `mapstructure:"start"`
End time.Time `mapstructure:"end"`
Grain TimeGrain `mapstructure:"grain"`
Alias string `mapstructure:"alias"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do? Ideally it should apply only to the time dimension if it's requested, and in that case, the time dimension already has an alias provided in Dimension

Comment on lines +322 to +324
if q.TimeRange == nil || q.TimeRange.Start == nil || q.TimeRange.End == nil {
return nil, fmt.Errorf("time range is required for null fill")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might still work if there's a relative time range (like q.TimeRange.IsoDuration), and this might be needed e.g. for alerts/reports that use a time spine.

It would be resolved to a fixed time range before the query is turned into an AST. It happens here:

func (e *Executor) rewriteQueryTimeRanges(ctx context.Context, qry *Query, executionTime *time.Time) error {

Start: s,
End: e,
Grain: timeDim.Compute.TimeFloor.Grain,
Alias: timeDim.Name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be handled inside the metricsview package by applying it to any dimension that has a TimeFloor applied (and erroring if zero or multiple dimensions have a TimeFloor)

Copy link
Contributor

@begelundmuller begelundmuller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a separate comment, can we add some tests for this feature for all of DuckDB, Druid and ClickHouse in runtime/resolvers/testdata?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants