Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pre-aggregations give incorrect result for rolling_window measure #8745

Open
igorlukanin opened this issue Sep 25, 2024 · 0 comments
Open

Pre-aggregations give incorrect result for rolling_window measure #8745

igorlukanin opened this issue Sep 25, 2024 · 0 comments
Assignees
Labels
backend:server Issues relating to Cube Core's Server bug Something isn't working cube store Issues relating to Cube Store

Comments

@igorlukanin
Copy link
Member

igorlukanin commented Sep 25, 2024

Describe the bug
The same query yields different results when executed with and without pre-aggregations. Looks like incorrect SQL is generated for Cube Store.

To Reproduce

  1. Use the data model shown below.
  2. Run the following query:
{
  "measures": [
    "Order.rollingCount"
  ],
  "limit": 5000,
  "timeDimensions": [
    {
      "dimension": "Order.createdAt",
      "dateRange": [
        "2023-07-11",
        "2023-07-18"
      ]
    }
  ]
}
  1. Get NULL as a result:
Screenshot 2024-09-25 at 18 28 41 4. Check that the generated SQL is as follows (with a suspicious wrapping around `base`):
SELECT
  sum(
    CASE
      WHEN null <= to_timestamp(?) THEN `order__rolling_count`
    END
  ) `order__rolling_count`
FROM
  (
    SELECT
      date_trunc('day', `order__created_at_day`) `order__created_at_day`,
      sum(`order__rolling_count`) `order__rolling_count`
    FROM
      prod_pre_aggregations.order_total_by_day AS `order__total_by_day`
    WHERE
      (`order__created_at_day` <= to_timestamp(?))
    GROUP BY
      1
  ) `base`

Expected behavior
4. Comment out the pre-aggregation in the data model
6. Run the same query
7. Get 4 as the result
Screenshot 2024-09-25 at 18 30 38
8. See that different SQL is generated (no wrappings around base):

SELECT
  "order__rolling_count" "order__rolling_count"
FROM
  (
    SELECT
      count("order_rolling_count_cumulative__order".id) "order__rolling_count"
    FROM
      (
        SELECT
          1 AS id,
          '2023-07-01T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          2 AS id,
          '2023-07-05T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          3 AS id,
          '2023-07-11T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          4 AS id,
          '2023-07-15T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          5 AS id,
          '2023-07-21T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          6 AS id,
          '2023-07-25T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          7 AS id,
          '2023-07-29T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          8 AS id,
          '2023-08-02T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          9 AS id,
          '2023-08-06T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          10 AS id,
          '2023-08-12T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          11 AS id,
          '2023-08-16T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          12 AS id,
          '2023-08-22T00:00:00.000Z' :: TIMESTAMP AS created_at
      ) AS "order_rolling_count_cumulative__order"
    WHERE
      (
        (
          "order_rolling_count_cumulative__order".created_at :: timestamptz AT TIME ZONE 'UTC'
        ) <= ($ 1 :: timestamptz :: timestamptz AT TIME ZONE 'UTC')
      )
  ) as q_0
LIMIT
  5000

Screenshots
If applicable, add screenshots to help explain your problem.

Minimally reproducible Cube Schema

cube(`Order`, {
  sql: `
    SELECT 1 AS id, '2023-07-01T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 2 AS id, '2023-07-05T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 3 AS id, '2023-07-11T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 4 AS id, '2023-07-15T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 5 AS id, '2023-07-21T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 6 AS id, '2023-07-25T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 7 AS id, '2023-07-29T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 8 AS id, '2023-08-02T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 9 AS id, '2023-08-06T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 10 AS id, '2023-08-12T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 11 AS id, '2023-08-16T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 12 AS id, '2023-08-22T00:00:00.000Z'::TIMESTAMP AS created_at
  `,
  
  measures: {
    rollingCount: {
      sql: `id`,
      type: `count`,
      rollingWindow: {
        trailing: `unbounded`,
      },
    },
  },

  dimensions: {
      createdAt: {
        sql: `created_at`,
        type: `time`
      },
  },

  preAggregations: {
    totalByDay: {
      measures: [
        Order.rollingCount
      ],
      timeDimension: Order.createdAt,
      granularity: `day`,
      refreshKey: {
        every: `1 minute`,
      },
    },
  }
})

Version:
v0.36.2

Additional context
Found while investigating #6951

@igorlukanin igorlukanin added bug Something isn't working backend:server Issues relating to Cube Core's Server cube store Issues relating to Cube Store labels Sep 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:server Issues relating to Cube Core's Server bug Something isn't working cube store Issues relating to Cube Store
Projects
None yet
Development

No branches or pull requests

2 participants