add subdaily granularity #5882

Merged
merged 23 commits into from
Aug 15, 2024
Commits
ef816e5
add bits
mirnawong1 Jul 31, 2024
6626a1c
add timespine
mirnawong1 Jul 31, 2024
b49a75b
add
mirnawong1 Aug 2, 2024
c5a594d
new page and rn
mirnawong1 Aug 2, 2024
23b373b
Merge branch 'current' into sub-granularity
mirnawong1 Aug 2, 2024
0c8362e
Merge branch 'current' into sub-granularity
mirnawong1 Aug 2, 2024
f510116
Update website/docs/docs/dbt-versions/release-notes.md
mirnawong1 Aug 2, 2024
97d5125
Merge branch 'current' into sub-granularity
mirnawong1 Aug 2, 2024
7700ae1
Merge branch 'current' into sub-granularity
mirnawong1 Aug 12, 2024
e517b12
Merge branch 'current' into sub-granularity
mirnawong1 Aug 12, 2024
9028011
update time spine and dimensions docs
Jstein77 Aug 9, 2024
31a5fc1
updates for sub daily granualrity
Jstein77 Aug 13, 2024
0cf05df
Merge branch 'current' into sub-granularity
Jstein77 Aug 13, 2024
ffb1b2e
spelling + grammar updates
Jstein77 Aug 13, 2024
23ac774
Merge branch 'current' into sub-granularity
runleonarun Aug 13, 2024
ce7dd8a
Apply suggestions from code review
matthewshaver Aug 14, 2024
695a1f6
address comments
Jstein77 Aug 14, 2024
84e0c1e
Update semantic-models.md
Jstein77 Aug 14, 2024
eb7a262
Update metrics-overview.md
Jstein77 Aug 14, 2024
8b71b5b
Apply suggestions from code review
matthewshaver Aug 15, 2024
8289f21
Merge branch 'current' into sub-granularity
matthewshaver Aug 15, 2024
2a35c8d
Apply suggestions from code review
matthewshaver Aug 15, 2024
44d405f
Update website/docs/docs/build/metricflow-time-spine.md
matthewshaver Aug 15, 2024
12 changes: 5 additions & 7 deletions website/docs/docs/build/dimensions.md
Expand Up @@ -105,14 +105,13 @@ dimensions:
## Time

:::tip use datetime data type if using BigQuery
To use BigQuery as your data platform, time dimensions columns need to be in the datetime data type. If they are stored in another type, you can cast them to datetime using the `expr` property. Time dimensions are used to group metrics by different levels of time, such as day, week, month, quarter, and year. MetricFlow supports these granularities, which can be specified using the `time_granularity` parameter.
To use BigQuery as your data platform, time dimension columns need to be in the datetime data type. If they are stored in another type, you can cast them to datetime using the `expr` property. Time dimensions are used to group metrics by different levels of time, such as hour (sub-daily), day, week, month, quarter, and year. MetricFlow supports these granularities, which can be specified using the `time_granularity` parameter.
:::

Time has additional parameters specified under the `type_params` section. When you query one or more metrics in MetricFlow using the CLI, the default time dimension for a single metric is the aggregation time dimension, which you can refer to as `metric_time` or by the dimension's name.

You can use multiple time groups in separate metrics. For example, the `users_created` metric uses `created_at`, and the `users_deleted` metric uses `deleted_at`:


```bash
# dbt Cloud users
dbt sl query --metrics users_created,users_deleted --group-by metric_time__year --order-by metric_time__year
Expand All @@ -121,8 +120,7 @@ dbt sl query --metrics users_created,users_deleted --group-by metric_time__year
mf query --metrics users_created,users_deleted --group-by metric_time__year --order-by metric_time__year
```


You can set `is_partition` for time or categorical dimensions to define specific time spans. Additionally, use the `type_params` section to set `time_granularity` to adjust aggregation detail (like daily, weekly, and so on):
You can set `is_partition` for time or categorical dimensions to define specific time spans. Additionally, use the `type_params` section to set `time_granularity` to adjust aggregation detail (such as hourly, daily, weekly, and so on). For more sub-daily configuration details, refer to [sub-daily granularity](/docs/build/granularity).

<Tabs>

Expand Down Expand Up @@ -173,7 +171,7 @@ measures:

<TabItem value="time_gran" label="time_granularity">

`time_granularity` specifies the smallest level of detail that a measure or metric should be reported at, such as daily, weekly, monthly, quarterly, or yearly. Different granularity options are available, and each metric must have a specified granularity. For example, a metric specified with weekly granularity couldn't be aggregated to a daily grain.
`time_granularity` specifies the smallest level of detail that a measure or metric should be reported at, such as [sub-daily](/docs/build/granularity), daily, weekly, monthly, quarterly, or yearly. Different granularity options are available, and each metric must have a specified granularity. For example, a metric specified with weekly granularity couldn't be aggregated to a daily grain.

The current options for time granularity are day, week, month, quarter, and year.

Expand All @@ -187,14 +185,14 @@ dimensions:
expr: date_trunc('day', ts_created) # ts_created is the underlying column name from the table
is_partition: True
type_params:
time_granularity: day
time_granularity: hour # or second, or millisecond etc
- name: deleted_at
type: time
label: "Date of deletion"
expr: date_trunc('day', ts_deleted) # ts_deleted is the underlying column name from the table
is_partition: True
type_params:
time_granularity: day
time_granularity: hour # or second, or millisecond etc

measures:
- name: users_deleted
43 changes: 40 additions & 3 deletions website/docs/docs/build/metricflow-time-spine.md
Expand Up @@ -6,11 +6,11 @@ sidebar_label: "MetricFlow time spine"
tags: [Metrics, Semantic Layer]
---

MetricFlow uses a timespine table to construct cumulative metrics. By default, MetricFlow expects the timespine table to be named `metricflow_time_spine` and doesn't support using a different name.
@mirnawong1 For this entire file - can we use this deleted text for versions <=1.8, instead of the new text? The new text should only be for 1.9+.

MetricFlow uses a timespine table to construct cumulative metrics. By default, MetricFlow expects the timespine table to be named `metricflow_time_spine` and doesn't support using a different name.

To create this table, you need to create a model in your dbt project called `metricflow_time_spine` and add the following code:
To create this table, you need to create a model in your dbt project called `metricflow_time_spine` and add the following code. This example uses a `day` granularity to generate a table with one row per day. This is useful for metrics that need a daily aggregation.

<File name='metricflow_time_spine.sql'>
<File name='metricflow_time_spine_day.sql'>

<VersionBlock lastVersion="1.6">

Expand Down Expand Up @@ -129,3 +129,40 @@ from final
</VersionBlock>

You only need to include the `date_day` column in the table. MetricFlow can handle broader levels of detail, but it doesn't currently support finer grains.
@mirnawong1 for old versions, we can update this to say:
"...but finer grains are only supported in versions 1.9+."


## Hourly time spine

This example uses `dbt.date_spine` with an `hour` granularity to generate a table with one row per hour. This is needed for hourly data aggregation and other sub-daily analyses.

WHAT ARE OTHER OPTIONS?? TO ADD BOTH, DO USERS NEED TWO FILES (HOUR AND DAY) OR CAN THEY BE COMBINED?

<File name='metricflow_time_spine_hour.sql'>

```sql
-- filename: metricflow_time_spine_hour.sql
{{
    config(
        materialized = 'table',
    )
}}

with hours as (

    {{
        dbt.date_spine(
            'hour',
            "to_date('01/01/2000','mm/dd/yyyy')",
            "to_date('01/01/2030','mm/dd/yyyy')"
        )
    }}

),

final as (
    select cast(date_hour as timestamp) as date_hour
    from hours
)

select * from final
```
</File>
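
The open question above (whether the day and hour spines need separate files) is not settled in this diff. As a hedged sketch only: in dbt 1.9+, a time spine model is usually registered through a YAML properties file, so a separate hourly model could sit alongside the daily one. The model name `metricflow_time_spine_hour` and the `date_hour` column come from the SQL above. The `time_spine`, `standard_granularity_column`, and `granularity` keys are assumptions that are not part of this change, so check them against your dbt version before relying on them.

```yaml
# Assumption: a dbt 1.9+ properties file that registers the hourly spine.
# None of these keys appear in this PR; verify against your dbt version.
models:
  - name: metricflow_time_spine_hour
    time_spine:
      standard_granularity_column: date_hour  # column the spine is joined on
    columns:
      - name: date_hour
        granularity: hour                     # grain of one row in the spine
```

Keeping one model per grain keeps each table's row count predictable, since an hourly spine has 24 times as many rows as a daily spine over the same date range.
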
58 changes: 48 additions & 10 deletions website/docs/docs/build/metrics-overview.md
Expand Up @@ -9,7 +9,7 @@ pagination_next: "docs/build/cumulative"

Once you've created your semantic models, it's time to start adding metrics. Metrics can be defined in the same YAML files as your semantic models, or split into separate YAML files in any other subdirectories (provided that these subdirectories are also within the same dbt project repo).

The keys for metrics definitions are:
This page explains the different supported metric types you can add to your dbt project. The keys for metrics definitions are:
matthewshaver marked this conversation as resolved.

<!-- for v1.8 and higher -->

Expand All @@ -27,6 +27,8 @@ The keys for metrics definitions are:

Here's a complete example of the metrics spec configuration:

<File name="models/metrics/file_name.yml" >

```yaml
metrics:
- name: metric name ## Required
Expand All @@ -42,6 +44,8 @@ metrics:
{{ Dimension('entity__name') }} > 0 and {{ Dimension('entity__another_name') }} is not
null and {{ Metric('metric_name', group_by=['entity_name']) }} > 5
```

</File>
</VersionBlock>

<!-- for v1.7 and lower -->
Expand All @@ -61,6 +65,8 @@ metrics:

Here's a complete example of the metrics spec configuration:

<File name="models/metrics/file_name.yml" >

```yaml
metrics:
- name: metric name ## Required
Expand All @@ -76,19 +82,26 @@ metrics:
{{ Dimension('entity__name') }} > 0 and {{ Dimension('entity__another_name') }} is not
null and {{ Metric('metric_name', group_by=['entity_name']) }} > 5
```
</File>

</VersionBlock>

This page explains the different supported metric types you can add to your dbt project.

import SLCourses from '/snippets/_sl-course.md';

<SLCourses/>

### Conversion metrics
## Sub-daily granularity

Sub-daily granularity enables you to query metrics at time grains finer than a day, such as by the hour, minute, or even second. It supports any grain that `date_trunc` supports. This is useful for more detailed analysis and for datasets that need high-resolution time data, such as minute-by-minute event tracking.

For more configuration details, refer to [sub-daily granularity](/docs/build/granularity).

## Conversion metrics

[Conversion metrics](/docs/build/conversion) help you track when a base event and a subsequent conversion event occur for an entity within a set time period.

<File name="models/metrics/file_name.yml" >

```yaml
metrics:
- name: The metric name
Expand All @@ -112,11 +125,14 @@ metrics:
- base_property: DIMENSION or ENTITY
conversion_property: DIMENSION or ENTITY
```
</File>

### Cumulative metrics
## Cumulative metrics

[Cumulative metrics](/docs/build/cumulative) aggregate a measure over a given window. If no window is specified, the window will accumulate the measure over all of the recorded time period. Note that you will need to create the [time spine model](/docs/build/metricflow-time-spine) before you add cumulative metrics.

<File name="models/metrics/file_name.yml" >

```yaml
# Cumulative metrics aggregate a measure over a given window. The window is considered infinite if no window parameter is passed (accumulate the measure over all of time)
metrics:
Expand All @@ -130,11 +146,14 @@ metrics:
join_to_timespine: true
window: 7 days
```
</File>

### Derived metrics
## Derived metrics

[Derived metrics](/docs/build/derived) are defined as an expression of other metrics. Derived metrics allow you to do calculations on top of metrics.

<File name="models/metrics/file_name.yml" >

```yaml
metrics:
- name: order_gross_profit
Expand All @@ -149,6 +168,8 @@ metrics:
- name: order_cost
alias: cost
```
</File>

<!-- not supported
### Expression metrics
Use [expression metrics](/docs/build/expr) when you're building a metric that involves a SQL expression of multiple measures.
Expand All @@ -167,10 +188,12 @@ metrics:
```
-->

### Ratio metrics
## Ratio metrics

[Ratio metrics](/docs/build/ratio) involve a numerator metric and a denominator metric. A `filter` string can be applied to both the numerator and denominator or separately to the numerator or denominator.

<File name="models/metrics/file_name.yml" >

```yaml
metrics:
- name: cancellation_rate
Expand All @@ -191,15 +214,18 @@ metrics:
filter: |
{{ Dimension('customer__country') }} = 'MX'
```
</File>

### Simple metrics
## Simple metrics

[Simple metrics](/docs/build/simple) point directly to a measure. You may think of it as a function that takes only one measure as the input.

- `name` &mdash; Use this parameter to define the reference name of the metric. The name must be unique amongst metrics and can include lowercase letters, numbers, and underscores. You can use this name to call the metric from the dbt Semantic Layer API.

**Note:** If you've already defined the measure using the `create_metric: True` parameter, you don't need to create simple metrics. However, if you would like to include a constraint on top of the measure, you will need to create a simple type metric.

<File name="models/metrics/file_name.yml" >

```yaml
metrics:
- name: cancellations
Expand All @@ -214,13 +240,16 @@ metrics:
{{ Dimension('order__value')}} > 100 and {{Dimension('user__acquisition')}} is not null
join_to_timespine: true
```
</File>

## Filters

A filter is configured using Jinja templating. Use the following syntax to reference entities, dimensions, time dimensions, or metrics in filters.

Refer to [Metrics as dimensions](/docs/build/ref-metrics-in-filters) for details on how to use metrics as dimensions with metric filters:

<File name="models/metrics/file_name.yml" >

```yaml
filter: |
{{ Entity('entity_name') }}
Expand All @@ -232,10 +261,19 @@ filter: |
{{ TimeDimension('time_dimension', 'granularity') }}

filter: |
@mirnawong1 Can we version block this metric filter example to versions 1.8+?

{{ Metric('metric_name', group_by=['entity_name']) }} # Available in v1.8 or go versionless with [Keep on latest version](/docs/dbt-versions/upgrade-dbt-version-in-cloud#keep-on-latest-version)
{{ Metric('metric_name', group_by=['entity_name']) }} {# Available in v1.8 or go [versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#keep-on-latest-version). #}
```

</File>

For example, if I wanted to filter for the order date dimension, grouped by month, I'd use the following syntax:
matthewshaver marked this conversation as resolved.

```yaml
filter: |
{{ TimeDimension('order_date', 'month') }}
```
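
The same syntax should extend to sub-daily grains. This is a minimal sketch, assuming `hour` is accepted as the granularity argument (as this PR describes for versions that support sub-daily grains). The timestamp literal is a hypothetical value chosen for illustration:

```yaml
filter: |
  {# Assumption: 'hour' is a valid granularity here; the timestamp is illustrative only. #}
  {{ TimeDimension('metric_time', 'hour') }} >= '2024-08-01 09:00:00'
```

Comparing against a full timestamp rather than a date makes the intended hour boundary explicit.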

### Further configuration
## Further configuration

You can set more metadata for your metrics, which can be used by other tools later on. The way this metadata is used will vary based on the specific integration partner.

2 changes: 1 addition & 1 deletion website/docs/docs/build/semantic-models.md
Expand Up @@ -84,7 +84,7 @@ semantic_models:
- name: transaction_date
type: time
type_params:
time_granularity: day
time_granularity: day # Other options include sub-daily grains such as hour, as well as week, month, quarter, and year.

- name: transaction_location
type: categorical
2 changes: 1 addition & 1 deletion website/docs/docs/build/simple.md
Expand Up @@ -11,7 +11,7 @@ Simple metrics are metrics that directly reference a single measure, without any

The parameters, description, and type for simple metrics are:

:::tip
:::tip
Note that we use the double colon (::) to indicate whether a parameter is nested within another parameter. So for example, `query_params::metrics` means the `metrics` parameter is nested under `query_params`.
:::

64 changes: 64 additions & 0 deletions website/docs/docs/build/sub-daily-granularity.md
@@ -0,0 +1,64 @@
---
title: Sub-daily granularity
id: "granularity"
description: "Sub-daily granularity enables you to query metrics at time grains finer than a day, such as by the hour, minute, or even second."
sidebar_label: "Sub-daily granularity"
tags: [Metrics, Semantic Layer]
pagination_next: "docs/build/conversion"
---


Sub-daily granularity enables you to query metrics at finer time grains, such as by the hour, minute, or even second. It supports any grain that `date_trunc` supports. You can use sub-daily granularity for cumulative metrics, time spine models at sub-daily grains, or default grain settings for metrics.

MetricFlow defaults to the `day` grain, and you can set a different default granularity as a metric-level property.

This is particularly useful for more detailed analysis and for datasets where high-resolution time data is required, such as minute-by-minute event tracking.

### Usage
There are two ways to specify sub-daily granularity: `default_grain` and `time_granularity`. This section explains how to use both methods, how they interact, and which one takes precedence.

- #### `default_grain`
Use the `default_grain` parameter in the metric-level config to specify the default granularity for querying the metric when the query doesn't specify one. It lets you set the most common or sensible default, like day, hour, and so on.

This parameter is optional and defaults to `day`.

<File name="models/metrics/file_name.yml" >

```yaml
metrics:
- name: my_metrics
...
default_grain: day # Optional: defaults to day
```
</File>

- #### `time_granularity`
Use the `time_granularity` parameter at the dimension level, in the `type_params` of a [time dimension](/docs/build/dimensions#time), to specify the granularity of the underlying data, like hour, minute, second, and so on. It affects how the data is truncated or aggregated in queries.

<File name="models/metrics/file_name.yml" >

```yaml
dimensions:
- name: ordered_at
type: time
type_params:
time_granularity: hour

```
</File>
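
To show where the dimension above sits in a full definition, here is a hedged sketch of a semantic model with an hourly time dimension. Only `ordered_at` and `time_granularity: hour` come from the example above. The semantic model name, the `ref('orders')` model, the `order_id` entity, the `ts_ordered` column, and the `order_count` measure are hypothetical names chosen for illustration:

```yaml
semantic_models:
  - name: orders                       # hypothetical semantic model name
    model: ref('orders')               # hypothetical dbt model
    defaults:
      agg_time_dimension: ordered_at   # metric_time resolves to this dimension
    entities:
      - name: order_id                 # hypothetical primary entity
        type: primary
    dimensions:
      - name: ordered_at
        type: time
        # Truncate to the same grain declared below so the two stay consistent.
        expr: date_trunc('hour', ts_ordered)   # ts_ordered is a hypothetical column
        type_params:
          time_granularity: hour
    measures:
      - name: order_count              # hypothetical measure: counts rows
        agg: sum
        expr: 1
```

Keeping the `expr` truncation and `time_granularity` at the same grain avoids declaring an hourly dimension on data that has already been truncated to the day.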

### Precedence
When querying metrics by `metric_time`, MetricFlow currently defaults to the grain of the `agg_time_dimension`. If you want to query metrics at a different grain, you can use the `time_granularity` type parameter in time dimensions.

The following table explains how `default_grain` and `time_granularity` interact and the resulting query granularity:

| Context | `default_grain` | `time_granularity` | Result |
| --- | --- | --- | --- |
| Only `default_grain` specified | `hour` | - | Query at `hour` granularity |
| Only `time_granularity` specified | - | `hour` | Query at `hour` granularity |
| Both specified, same value | `hour` | `hour` | Query at `hour` granularity |
| Both specified, different value | `day` | `minute` | Query at `minute` granularity |
| Both not specified | - | - | Defaults to `day` |
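
The "Both specified, different value" row can be sketched as the following pair of fragments. This is an illustration of the table as written, not a verified behavior. The metric name, label, and measure are hypothetical, and the two fragments would live in their respective metric and semantic model definitions rather than side by side in one file:

```yaml
# Fragment 1: metric definition (hypothetical names).
metrics:
  - name: order_total
    label: Order total
    type: simple
    type_params:
      measure: order_total
    default_grain: day          # metric-level default

# Fragment 2: time dimension inside the semantic model that owns the measure.
dimensions:
  - name: ordered_at
    type: time
    type_params:
      time_granularity: minute  # per the table, the query resolves to minute
```

Per the table, the dimension-level `time_granularity` wins when the two values differ, so this combination would be queried at `minute` granularity.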

Implementation using the `time_granularity` type parameter in time dimensions.
Examples of using DATE_TRUNC with sub-daily granularities in SQL.
3 changes: 3 additions & 0 deletions website/docs/docs/dbt-versions/release-notes.md
Expand Up @@ -18,6 +18,9 @@ Release notes are grouped by month for both multi-tenant and virtual private clo

[^*] The official release date for this new format of release notes is May 15th, 2024. Historical release notes for prior dates may not reflect all available features released earlier this year or their tenancy availability.

## August 2024
- **New:** You can now configure metrics at finer time grains, such as by the hour, minute, or even second. This is particularly useful for more detailed analysis and for datasets where high-resolution time data is required, such as minute-by-minute event tracking. Refer to [sub-daily granularity](/docs/build/granularity) for more info.
mirnawong1 marked this conversation as resolved.

## July 2024
- **New:** [Connections](/docs/cloud/connect-data-platform/about-connections#connection-management) are now available under **Account settings** as a global setting. Previously, they were found under **Project settings**. This is being rolled out in phases over the coming weeks.
- **New:** Admins can now assign [environment-level permissions](/docs/cloud/manage-access/environment-permissions) to groups for specific roles.
1 change: 1 addition & 0 deletions website/sidebars.js
Expand Up @@ -381,6 +381,7 @@ const sidebarSettings = {
link: { type: "doc", id: "docs/build/metrics-overview" },
items: [
"docs/build/metrics-overview",
"docs/build/granularity",
matthewshaver marked this conversation as resolved.
"docs/build/conversion",
"docs/build/cumulative",
"docs/build/derived",