
add subdaily granularity #5882

Merged
merged 23 commits on Aug 15, 2024
Conversation

@mirnawong1 mirnawong1 commented Aug 2, 2024

resolves #5857
resolves #5908

this pr adds draft content to explain subdaily granularities in MF.

[ ] Needs PM review
[ ] Needs docs review

Outstanding questions

  • Can the user use both the `default_grain` and `time_granularity`? How does each connect to the `time_spine`, and when should a user use which one, or is it up to them?
  • What should we communicate wrt cumulative metrics?
  • @Jstein77 do you think we should use the new 'sub-daily' page to explain granularities in general?

@mirnawong1 mirnawong1 requested a review from a team as a code owner August 2, 2024 15:12

<!--dimensions are non-aggregatable expressions that define the level of aggregation for a metric used to define how data is sliced or grouped in a metric. Since groups can't be aggregated, they're considered to be a property of the primary or unique entity of the table.

Groups are defined within semantic models, alongside entities and measures, and correspond to non-aggregatable columns in your dbt model that provides categorical or time-based context. In SQL, dimensions is typically included in the GROUP BY clause.-->

All dimensions require a `name`, `type` and in some cases, an `expr` parameter. The `name` for your dimension must be unique to the semantic model and can not be the same as an existing `entity` or `measure` within that same model.
All dimensions require a `name` and `type` and, in some cases, can optionally include an `expr` parameter. The `name` for your Dimension must be unique within the same semantic model.

It seems redundant to say both "in some cases" and "optionally" - maybe pick one or the other?


Yup good call. I will update.

```
  - name: is_bulk_transaction
    type: categorical
    expr: case when quantity > 10 then true else false end
```

MetricFlow requires that all dimensions have a primary entity. This is to guarantee unique dimension names. If your data source doesn't have a primary entity, you need to assign the entity a name using the `primary_entity: entity_name` key. It doesn't necessarily have to map to a column in that table and assigning the name doesn't affect query generation.
Dimensions are bound to the primary entity of the semantic model in which they are defined. For example, if a dimension called `is_bulk_transaction` is defined in a model with `transaction` as a primary entity, then `is_bulk_transaction` is scoped to the `transaction` entity. To reference this dimension you would use the fully qualified dimension name `transaction__is_bulk_transaction`.

Might be nice to use an example dimension name that makes it somewhat clear why we bind it to the entity name. E.g., something like transaction__country or just changing the name to something like transaction__is_bulk would make this feel less redundant.


Fixed.


MetricFlow requires that all semantic models have a primary entity. This is to guarantee unique dimension names. If your data source doesn't have a primary entity, you need to assign the entity a name using the `primary_entity` key. It doesn't necessarily have to map to a column in that table and assigning the name doesn't affect query generation. An example of defining a primary entity for a data source that doesn't have a primary entity column is below:
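Purely as an illustration (the model and entity names here are invented, not taken from this PR), such a configuration could look roughly like this:

```
semantic_models:
  - name: bookings_source
    model: ref('fct_bookings')
    # fct_bookings has no single column that uniquely identifies a row,
    # so assign a name for the virtual primary entity here
    primary_entity: booking
```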

Can we add that for a virtual primary entity like this, you should try to make the name unique? I don't think we enforce that (we should) but it's definitely helpful


👍


The current options for time granularity are day, week, month, quarter, and year.
Any granularity supported by your engine's `date_trunc` function will work, with the most common granularities being hour, day, week, month, quarter, and year.

This isn't quite accurate (e.g., look at the options available for snowflake). Might be better to just list the options we support.
For sub-daily options, we support these for all engines (unless otherwise noted):

  • nanosecond (snowflake only)
  • microsecond (all engines except trino)
  • millisecond
  • second
  • minute
  • hour


Updated


Aggregation between metrics with different granularities is possible, with the Semantic Layer returning results at the highest granularity by default. For example, when querying two metrics with daily and monthly granularity, the resulting aggregation will be at the monthly level.
Aggregation between metrics with different granularities is possible, with the Semantic Layer returning results at the coarser granularity by default. For example, when querying two metrics with daily and monthly granularity, the resulting aggregation will be at the monthly level.

I think coarsest would be grammatically correct here


<File name='metricflow_time_spine.sql'>
If you already have a date dimension or time spine table in your dbt project you can simply point MetricFlow at this table. To do this, update the `model` configuration to use this table in the semantic layer. For example, given the following directory structure, you can create two time spine configurations, `time_spine_hourly` and `time_spine_daily`.
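As a rough sketch of what those two configurations might look like (assuming the 1.9-style `time_spine` model YAML and invented column names `date_hour` and `date_day`):

```
models:
  - name: time_spine_hourly
    time_spine:
      standard_granularity_column: date_hour  # hourly column in your existing table
    columns:
      - name: date_hour
        granularity: hour
  - name: time_spine_daily
    time_spine:
      standard_granularity_column: date_day  # daily column in your existing table
    columns:
      - name: date_day
        granularity: day
```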

Do you think people migrating from the old time spine will think they need to rename the model? Not sure if we want to add a note about that (that you can keep the old name) to avoid confusion!


Added a note about this.



If you need to create a time spine table from scratch, add the following code to your dbt project.
The example creates a time spine at a daily grain and an hourly grain. We recommend creating both an hourly and daily time spine, MetricFlow will use the appropriate time spine based on the granularity of the metric selected to minimize data scans.

Can we add some more detail here? Some things I think it would be helpful to know:

  • MetricFlow will use the time spine with the largest compatible granularity for a given query to ensure the most efficient query possible
  • You can add a time spine for each granularity you intend to use if minor query efficiency is more important to you than setup time / space constraints
  • We recommend having a time spine at the finest grain used in any of your dimensions to avoid unexpected errors


Added more context

### Conversion metrics
## Default granularity for metircs

It's possible to define a default time granularity for metrics that differs from the granularity of the default aggregation time dimensions (`metric_time`). This is useful if your time dimension has a very fine grain, like second or hour, but you typically query metrics rolled up at a coarser grain. The granularity can be set using the `time_granularity` parameter on the metric and defaults to `day`.
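A minimal sketch of how that parameter might be set on a metric (the metric and measure names are invented for illustration):

```
metrics:
  - name: order_total
    label: Order total
    type: simple
    type_params:
      measure: order_total
    # report this metric at a monthly grain by default, even if the
    # underlying time dimension is defined at an hourly grain
    time_granularity: month
```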

Would note that while it defaults to day, if day is not available because the dimension is defined at a coarser granularity, it will default to the defined granularity for the dimension!

@@ -84,7 +84,7 @@ semantic_models:
- name: transaction_date
type: time
type_params:
time_granularity: day
time_granularity: day # Additional options include hour, week, month, quarter, year, and so on.

seems weird to exclude other options like second and below if we're going to list so many. do we need this list at all?

- MetricFlow requires all dimensions to be tied to a primary entity.
Dimensions have the following characteristics:

- There are two types of dimensions: categorical and time. Categorical dimensions are for things you can't measure in numbers, while time dimensions represent dates.

"...while time dimensions represent dates and timestamps"

website/sidebars.js (outdated; resolved)

@courtneyholcomb courtneyholcomb left a comment


@mirnawong1 leaving comments here for what should be version blocked!

@@ -173,28 +161,34 @@ measures:

<TabItem value="time_gran" label="time_granularity">

`time_granularity` specifies the smallest level of detail that a measure or metric should be reported at, such as daily, weekly, monthly, quarterly, or yearly. Different granularity options are available, and each metric must have a specified granularity. For example, a metric specified with weekly granularity couldn't be aggregated to a daily grain.
`time_granularity` specifies the grain of a time dimension. MetricFlow will transform the underlying column to the specified granularity. For example, if you add hourly granularity to a time dimension column, MetricFlow will run a `date_trunc` function to convert the timestamp to hourly. You can easily change the time grain at query time and aggregate it to a coarser grain, for example, from hourly to monthly. However, you can't go from a coarser grain to a finer grain (monthly to hourly).
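For illustration (an invented dimension, not one from this PR), an hourly time dimension could be declared like this, and MetricFlow would truncate the underlying timestamp to the hour:

```
dimensions:
  - name: created_at
    type: time
    expr: created_at_ts  # raw timestamp column
    type_params:
      time_granularity: hour  # applied roughly as date_trunc('hour', created_at_ts)
```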

@mirnawong1 This section mentions hourly granularity, which isn't available for <=1.8. We should keep this section for 1.9+, but can we swap the word "hourly" with "daily" for <=1.8?


The current options for time granularity are day, week, month, quarter, and year.
Our supported granularities are:
* nanosecond (Snowflake only)

@mirnawong1 These sub-daily granularity options are showing up for all versions. Can we keep them all for 1.9+, but remove anything smaller than day for <=1.8?

is_partition: True
type_params:
time_granularity: day
time_granularity: hour

@mirnawong1 Can we swap in day instead of hour for <=1.8?

@@ -6,11 +6,45 @@ sidebar_label: "MetricFlow time spine"
tags: [Metrics, Semantic Layer]
---

MetricFlow uses a timespine table to construct cumulative metrics. By default, MetricFlow expects the timespine table to be named `metricflow_time_spine` and doesn't support using a different name.

@mirnawong1 For this entire file - can we use this deleted text for versions <=1.8, instead of the new text? The new text should only be for 1.9+.

```

</VersionBlock>

You only need to include the `date_day` column in the table. MetricFlow can handle broader levels of detail, but it doesn't currently support finer grains.

@mirnawong1 for old versions, we can update this to say:
"...but finer grains are only supported in versions 1.9+."

import SLCourses from '/snippets/_sl-course.md';

<SLCourses/>

### Conversion metrics
## Default granularity for metircs

@mirnawong1 This whole section called "Default granularity for metrics" should be version blocked to 1.9+.
Also noting that "metrics" is misspelled in the title (though maybe that's already fixed in production!)

@@ -232,10 +283,20 @@ filter: |
{{ TimeDimension('time_dimension', 'granularity') }}

filter: |
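Purely to illustrate the shape such a filter takes (the entity, dimension, grain, and date below are invented), a complete example might look like:

```
filter: |
  {{ TimeDimension('order__ordered_at', 'month') }} >= '2024-01-01'
```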

@mirnawong1 Can we version block this metric filter example to versions 1.8+?
