Commit
Showing 3 changed files with 105 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
--- | ||
title: Dagster Cloud Insights | ||
description: "Visibility into historical usage and cost metrics." | ||
|
||
platform_type: "cloud" | ||
--- | ||
|
||
# Dagster Cloud Insights | ||
|
||
<Note> | ||
This feature is considered <strong>experimental</strong>. To get access to | ||
Insights, request it in the [#dagster-insights](https://dagster.slack.com/archives/C05V7GETFSQ) channel in Slack. | ||
</Note> | ||
|
||
Insights is a Dagster Cloud feature that provides visibility into historical usage and cost metrics, such as run duration, credit usage, and failures. It is available as a top-level tab in the Dagster Cloud UI: | ||
|
||
<Image | ||
alt="Viewing the Insights tab in the Dagster UI" | ||
src="/images/dagster-cloud/insights/insights-tab.png" | ||
width={771} | ||
height={536} | ||
/> | ||
|
||
The Insights page lists the available metrics in the left panel. For each metric, the daily, weekly, or monthly aggregated values are shown in a graph in the main panel. As of October 2023, the metrics are updated once a day. | ||
|
||
## External metrics | ||
|
||
External metrics, such as Snowflake credits spent, can be integrated into the Dagster Insights UI. The [`dagster-cloud`](https://pypi.org/project/dagster-cloud/) package contains utilities for capturing external metrics about data operations and submitting them to Dagster Cloud via an API. | ||
|
||
### How to enable Snowflake and dbt with Insights | ||
|
||
If you use dbt to materialize tables in Snowflake, you can use these instructions to integrate Snowflake metrics into the Insights UI. | ||
|
||
#### Step 1 - Instrument your dbt asset definition | ||
|
||
You need `dagster-cloud` version 1.5.1 or newer. Instrument the Dagster `@dbt_assets` function by wrapping its dbt CLI invocation with `dbt_with_snowflake_insights`. | ||
|
||
This passes through all the underlying events and, in addition, emits an `AssetObservation` for each materialization. The observation contains the dbt invocation id and unique id, which are recorded in the Dagster event log. | ||
|
||
```python | ||
from dagster import AssetExecutionContext | ||
from dagster_dbt import DbtCliResource, dbt_assets | ||
from dagster_cloud.dagster_insights import dbt_with_snowflake_insights | ||
@dbt_assets(...) | ||
# Assumes a `DbtCliResource` is registered under the key `dbt_resource`. | ||
def my_asset(context: AssetExecutionContext, dbt_resource: DbtCliResource): | ||
    # Typically you have a `yield from dbt_resource.cli(...)`. | ||
    # Wrap the original call with `dbt_with_snowflake_insights` as below. | ||
    dbt_cli_invocation = dbt_resource.cli(["build"], context=context) | ||
    yield from dbt_with_snowflake_insights(context, dbt_cli_invocation) | ||
``` | ||
|
||
#### Step 2 - Update your dbt_project.yml | ||
|
||
Add the following to your `dbt_project.yml`: | ||
|
||
```yaml | ||
query-comment: | ||
  comment: "snowflake_dagster_dbt_v1_opaque_id[[[{{ node.unique_id }}:{{ invocation_id }}]]]" | ||
  append: true | ||
``` | ||
This adds a comment to each query recorded in the `query_history` table in Snowflake. The comment contains the dbt unique id and invocation id. Here, `append: true` is important because Snowflake strips leading comments. | ||
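
To make the comment format concrete, here is a minimal sketch of how the two ids could be recovered from a query's text. It is an illustration only, not the code `dagster-cloud` uses: the function name `extract_opaque_id`, the regular expression, and the sample query are assumptions, and the sketch assumes the invocation id contains no colon.

```python
import re
from typing import Optional, Tuple

# Matches the comment produced by the `query-comment` setting above:
#   snowflake_dagster_dbt_v1_opaque_id[[[<unique_id>:<invocation_id>]]]
# Assumption: the invocation id (a UUID in practice) contains no ":" or "]".
OPAQUE_ID_PATTERN = re.compile(
    r"snowflake_dagster_dbt_v1_opaque_id"
    r"\[\[\[(?P<unique_id>.+?):(?P<invocation_id>[^:\]]+)\]\]\]"
)


def extract_opaque_id(query_text: str) -> Optional[Tuple[str, str]]:
    """Return (unique_id, invocation_id) if the query carries the comment."""
    match = OPAQUE_ID_PATTERN.search(query_text)
    if match is None:
        return None
    return match.group("unique_id"), match.group("invocation_id")


# Hypothetical query text, with the comment appended after the SQL.
sample_query = (
    "create or replace table orders as select 1\n"
    "/* snowflake_dagster_dbt_v1_opaque_id"
    "[[[model.jaffle_shop.orders:8a1b2c3d-0000-4000-8000-123456789abc]]] */"
)
parsed = extract_opaque_id(sample_query)
```

Because the comment is appended rather than prepended, it sits at the end of the query text, which is why the sketch uses `search` instead of anchoring the pattern at the start.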
|
||
#### Step 3 - Create a metrics ingestion pipeline | ||
|
||
Create a Dagster pipeline that joins asset observation events with the Snowflake query history and calls the Dagster Cloud ingestion API. This needs a Snowflake resource that can query `query_history`. You can use a pre-defined pipeline as below: | ||
|
||
```python | ||
from datetime import date | ||
from dagster import Definitions, EnvVar | ||
from dagster_snowflake import SnowflakeResource | ||
from dagster_cloud.dagster_insights import ( | ||
    create_snowflake_insights_asset_and_schedule, | ||
) | ||
snowflake_insights_definitions = create_snowflake_insights_asset_and_schedule( | ||
    date(2023, 10, 5), | ||
    allow_partial_partitions=True, | ||
    dry_run=False, | ||
    snowflake_resource_key="snowflake_insights", | ||
) | ||
defs = Definitions( | ||
    assets=[..., *snowflake_insights_definitions.assets], | ||
    schedules=[..., snowflake_insights_definitions.schedule], | ||
    resources={ | ||
        # ... your other resources | ||
        "snowflake_insights": SnowflakeResource( | ||
            account=EnvVar("SNOWFLAKE_PURINA_ACCOUNT"), | ||
            user=EnvVar("SNOWFLAKE_PURINA_USER"), | ||
            password=EnvVar("SNOWFLAKE_PURINA_PASSWORD"), | ||
        ), | ||
    }, | ||
) | ||
``` | ||
|
||
The `snowflake_resource_key` argument is the key of a `SnowflakeResource` that has access to the `query_history` table. Once the pipeline runs, Snowflake credits should be visible in the Insights tab: | ||
|
||
<Image | ||
alt="Snowflake credits in the Dagster UI" | ||
src="/images/dagster-cloud/insights/insights-snowflake.png" | ||
width={383} | ||
height={349} | ||
/> | ||
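
As a rough illustration of the join described in Step 3, the sketch below matches asset observations to query history rows on the (unique id, invocation id) pair and sums a per-query cost for each asset. It is a conceptual sketch under stated assumptions, not the logic of the pre-defined pipeline: the function `attribute_credits`, the tuple layouts, and all sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record shapes, chosen only for this illustration:
#   observation: (asset_key, dbt_unique_id, dbt_invocation_id)
#   query row:   (dbt_unique_id, dbt_invocation_id, credits)
Observation = Tuple[str, str, str]
QueryRow = Tuple[str, str, float]


def attribute_credits(
    observations: Iterable[Observation],
    query_rows: Iterable[QueryRow],
) -> Dict[str, float]:
    """Sum credits per asset by joining on (unique_id, invocation_id)."""
    asset_by_opaque_id = {
        (unique_id, invocation_id): asset_key
        for asset_key, unique_id, invocation_id in observations
    }
    credits_by_asset: Dict[str, float] = defaultdict(float)
    for unique_id, invocation_id, credits in query_rows:
        asset_key = asset_by_opaque_id.get((unique_id, invocation_id))
        if asset_key is not None:  # ignore queries not issued by a tracked asset
            credits_by_asset[asset_key] += credits
    return dict(credits_by_asset)


# Made-up sample data: two queries belong to `orders`, one is unrelated.
observations = [("orders", "model.shop.orders", "run-1")]
query_rows = [
    ("model.shop.orders", "run-1", 0.5),
    ("model.shop.orders", "run-1", 0.25),
    ("model.shop.customers", "run-9", 1.0),
]
totals = attribute_credits(observations, query_rows)
```

Keying the lookup on both ids means that a model rebuilt in a later dbt invocation is matched only to the queries from that same invocation.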
|
||
--- |