software-defined asset -> asset definition in concept page
sryza committed Apr 30, 2024
1 parent e61a8f2 commit aff1f69
Showing 75 changed files with 257 additions and 260 deletions.
9 changes: 3 additions & 6 deletions docs/content/_apidocs.mdx
Original file line number Diff line number Diff line change
@@ -35,14 +35,11 @@ APIs from the core `dagster` package, divided roughly by topic:
<tbody>
<tr>
<td>
<a href="/_apidocs/assets">Software-defined Assets</a>
<a href="/_apidocs/assets">Asset definitions</a>
</td>
<td>
APIs to define data asset's using Dagster's{" "}
<a href="/concepts/assets/software-defined-assets">
Software-defined Assets
</a>
.
APIs to define data
<a href="/concepts/assets/software-defined-assets">assets</a>.
</td>
</tr>
<tr>
10 changes: 5 additions & 5 deletions docs/content/concepts.mdx
@@ -8,17 +8,17 @@ Learn about Dagster's core concepts and how to use them in your data platform.

---

## Software-defined Assets
## Asset definition

An asset is an object in persistent storage, such as a table, file, or persisted machine learning model. A Software-defined Asset is a Dagster object that couples an asset to the function and upstream assets used to produce its contents.
An asset is an object in persistent storage, such as a table, file, or persisted machine learning model. An asset definition is a Dagster object that couples an asset to the function and upstream assets used to produce its contents.

<ArticleList>
<ArticleListItem
title="Software-defined Assets"
title="Asset definitions"
href="/concepts/assets/software-defined-assets"
></ArticleListItem>
<ArticleListItem
title="Graph-backed Assets"
title="Graph-backed asset definitions"
href="/concepts/assets/graph-backed-assets"
></ArticleListItem>
<ArticleListItem
@@ -80,7 +80,7 @@ Dagster offers several ways to run data pipelines without manual intervention, i

## Partitions and backfills

A software-defined asset or job can represent a collection of _partitions_ that can be tracked and executed independently.
An asset definition or job can represent a collection of _partitions_ that can be tracked and executed independently.

<ArticleList>
<ArticleListItem
2 changes: 1 addition & 1 deletion docs/content/concepts/assets/asset-checks.mdx
@@ -5,7 +5,7 @@ description: Asset checks are a way to define expectations about the quality of

# Asset checks

Dagster allows you to define and execute data quality checks on your [Software-defined Assets](/concepts/assets/software-defined-assets). Each asset check verifies some property of a data asset, e.g. that there are no null values in a particular column.
Dagster allows you to define and execute data quality checks on your [data assets](/concepts/assets/software-defined-assets). Each asset check verifies some property of a data asset, e.g. that there are no null values in a particular column.

When viewing an asset in Dagster’s UI, you can see all of its checks, and whether they’ve passed, failed, or haven’t run.

@@ -376,7 +376,7 @@ height={1944}
href="/concepts/assets/asset-checks"
></ArticleListItem>
<ArticleListItem
title="Software-defined Assets"
title="Asset definitions"
href="/concepts/assets/software-defined-assets"
></ArticleListItem>
<ArticleListItem
@@ -5,7 +5,7 @@ description: Asset checks are a way to define expectations about the quality of

# Defining and executing asset checks

After creating some [Software-defined Assets](/concepts/assets/software-defined-assets), you may want to automate checks on the assets that test for data quality.
After creating some [asset definitions](/concepts/assets/software-defined-assets), you may want to automate checks on the assets that test for data quality.

In this guide, we'll show you a few approaches to defining asset checks, how to use check results to include helpful information, and how to execute checks.

@@ -373,7 +373,7 @@ Refer to the [Asset checks section](/concepts/testing#testing-asset-checks) of t
href="/concepts/assets/asset-checks"
></ArticleListItem>
<ArticleListItem
title="Software-defined Assets"
title="Asset definitions"
href="/concepts/assets/software-defined-assets"
></ArticleListItem>
<ArticleListItem
6 changes: 3 additions & 3 deletions docs/content/concepts/assets/asset-jobs.mdx
@@ -1,18 +1,18 @@
---
title: Asset jobs | Dagster
description: Asset jobs are the main unit for materializing and monitoring Software-defined assets in Dagster.
description: Asset jobs are the main unit for materializing and monitoring asset definitions in Dagster.
---

# Asset jobs

<Note>
Looking to execute a <a href="/concepts/ops-jobs-graphs/graphs">graph</a> of{" "}
<a href="/concepts/ops-jobs-graphs/ops">ops</a>, which aren't tied to
Software-defined Assets? Check out the{" "}
asset definitions? Check out the{" "}
<a href="/concepts/ops-jobs-graphs/op-jobs">Op jobs</a> documentation.
</Note>

[Jobs](/concepts/ops-jobs-graphs/jobs) are the main unit for executing and monitoring Software-defined assets in Dagster. An asset job is a type of \[job]\(/concepts/ops-jobs-graphs/jobs] that targets a selection of [Software-defined Assets](/concepts/assets/software-defined-assets) and can be launched:
[Jobs](/concepts/ops-jobs-graphs/jobs) are the main unit for executing and monitoring [asset definitions](/concepts/assets/software-defined-assets) in Dagster. An asset job is a type of [job](/concepts/ops-jobs-graphs/jobs) that targets a selection of assets and can be launched:

- Manually from the Dagster UI
- At fixed intervals, by [schedules](/concepts/partitions-schedules-sensors/schedules)
2 changes: 1 addition & 1 deletion docs/content/concepts/assets/asset-selection-syntax.mdx
@@ -547,7 +547,7 @@ height={1508}

<ArticleList>
<ArticleListItem
title="Software-defined Assets"
title="Asset definitions"
href="/concepts/assets/software-defined-assets"
></ArticleListItem>
<ArticleListItem
2 changes: 1 addition & 1 deletion docs/content/concepts/assets/external-assets.mdx
@@ -363,7 +363,7 @@ defs = Definitions(

<ArticleList>
<ArticleListItem
title="Software-defined Assets"
title="Asset definitions"
href="/concepts/assets/software-defined-assets"
></ArticleListItem>
<ArticleListItem
6 changes: 3 additions & 3 deletions docs/content/concepts/assets/graph-backed-assets.mdx
@@ -1,11 +1,11 @@
---
title: Graph-Backed Assets | Dagster
description: Defining a software-defined asset with multiple discrete computations combined in an op graph.
description: An asset definition with multiple discrete computations combined in an op graph.
---

# Graph-Backed Assets

[Basic software-defined assets](/concepts/assets/software-defined-assets#a-basic-software-defined-asset) are computed using a single op. If generating an asset involves multiple discrete computations, you can use graph-backed assets by separating each computation into an op and assembling them into an op graph to combine your computations. This allows you to launch re-executions of runs at the op boundaries, but doesn't require you to link each intermediate value to an asset in persistent storage.
[Basic assets](/concepts/assets/software-defined-assets#a-basic-software-defined-asset) are computed using a single op. If generating an asset involves multiple discrete computations, you can use graph-backed assets by separating each computation into an op and assembling them into an op graph to combine your computations. This allows you to launch re-executions of runs at the op boundaries, but doesn't require you to link each intermediate value to an asset in persistent storage.

---

@@ -59,7 +59,7 @@ def slack_files_table():

### Defining managed-loading dependencies for graph-backed assets

Similar to software-defined assets, Dagster infers the upstream assets from the names of the arguments to the decorated function. Dagster will then delegate loading the data to an [I/O manager](/concepts/io-management/io-managers).
Similar to single-op asset definitions, Dagster infers the upstream assets from the names of the arguments to the decorated function. Dagster will then delegate loading the data to an [I/O manager](/concepts/io-management/io-managers).

The example below includes an asset named `middle_asset`. `middle_asset` depends on `upstream_asset`, and `downstream_asset` depends on `middle_asset`:

6 changes: 3 additions & 3 deletions docs/content/concepts/assets/multi-assets.mdx
@@ -5,7 +5,7 @@ description: A multi-asset represents a set of assets that are all updated by th

# Multi-Assets

A multi-asset represents a set of software-defined assets that are all updated by the same [op](/concepts/ops-jobs-graphs/ops) or [graph](/concepts/ops-jobs-graphs/graphs).
A multi-asset represents a set of asset definitions that are all updated by the same [op](/concepts/ops-jobs-graphs/ops) or [graph](/concepts/ops-jobs-graphs/graphs).

## Relevant APIs

@@ -15,7 +15,7 @@ A multi-asset represents a set of software-defined assets that are all updated b

## Overview

When working with [software-defined assets](/concepts/assets/software-defined-assets), it's sometimes inconvenient or impossible to map each persisted asset to a unique [op](/concepts/ops-jobs-graphs/ops) or [graph](/concepts/ops-jobs-graphs/graphs). A multi-asset is a way to define a single op or graph that will produce the contents of multiple data assets at once.
When working with [asset definitions](/concepts/assets/software-defined-assets), it's sometimes inconvenient or impossible to map each persisted asset to a unique [op](/concepts/ops-jobs-graphs/ops) or [graph](/concepts/ops-jobs-graphs/graphs). A multi-asset is a way to define a single op or graph that will produce the contents of multiple data assets at once.

Multi-assets may be useful in the following scenarios:

@@ -24,7 +24,7 @@ Multi-assets may be useful in the following scenarios:

## Defining multi-assets

The function responsible for computing the contents of any software-defined asset is an [op](/concepts/ops-jobs-graphs/ops). Multi-assets are responsible for updating multiple assets, so the underlying op will have multiple outputs -- one for each associated asset.
The function responsible for computing the contents of any asset is an [op](/concepts/ops-jobs-graphs/ops). Multi-assets are responsible for updating multiple assets, so the underlying op will have multiple outputs -- one for each associated asset.

### A basic multi-asset

32 changes: 16 additions & 16 deletions docs/content/concepts/assets/software-defined-assets.mdx
@@ -1,9 +1,9 @@
---
title: Software-defined assets | Dagster
description: A software-defined asset is a description of how to compute the contents of a particular data asset.
title: Asset definitions | Dagster
description: An asset definition is a description of how to compute the contents of a particular data asset.
---

# Software-defined assets
# Asset definitions

<Note>
Prefer videos? Check out our{" "}
@@ -14,20 +14,20 @@ description: A software-defined asset is a description of how to compute the con
<a href="https://www.youtube.com/watch?v=lRwpcyd6w8k" target="new">
demo
</a>{" "}
videos to get a quick look at Software-defined assets.
videos to get a quick look at asset definitions.
</Note>

An **asset** is an object in persistent storage, such as a table, file, or persisted machine learning model. A **software-defined asset** is a description, in code, of an asset that should exist and how to produce and update that asset.
An **asset** is an object in persistent storage, such as a table, file, or persisted machine learning model. An **asset definition** is a description, in code, of an asset that should exist and how to produce and update that asset.

Software-defined assets enable a declarative approach to data management, in which code is the source of truth on what data assets should exist and how those assets are computed.
Asset definitions enable a declarative approach to data management, in which code is the source of truth on what data assets should exist and how those assets are computed.

A software-defined asset includes the following:
An asset definition includes the following:

- An <PyObject object="AssetKey" />, which is a handle for referring to the asset.
- A set of upstream asset keys, which refer to assets that the contents of the software-defined asset are derived from.
- A set of upstream asset keys, which refer to assets that the contents of the asset definition are derived from.
- A Python function, which is responsible for computing the contents of the asset from its upstream dependencies and storing the results.

**Note**: Behind-the-scenes, the Python function is an [op](/concepts/ops-jobs-graphs/ops). Ops are an advanced topic that isn't required to get started with Dagster. A crucial distinction between Software-defined Assets and ops is that Software-defined Assets know about their dependencies, while ops do not. Ops aren't connected to dependencies until they're placed inside a [graph](/concepts/ops-jobs-graphs/graphs).
**Note**: Behind-the-scenes, the Python function is an [op](/concepts/ops-jobs-graphs/ops). Ops are an advanced topic that isn't required to get started with Dagster. A crucial distinction between asset definitions and ops is that asset definitions know about their dependencies, while ops do not. Ops aren't connected to dependencies until they're placed inside a [graph](/concepts/ops-jobs-graphs/graphs).

**Materializing** an asset is the act of running its function and saving the results to persistent storage. You can initiate materializations from [the Dagster UI](/concepts/webserver/ui) or by invoking Python APIs.

@@ -44,15 +44,15 @@ A software-defined asset includes the following:

## Defining assets

- [Basic software-defined assets](#a-basic-software-defined-asset)
- [Basic asset definitions](#a-basic-asset-definition)
- [Assets with dependencies](#assets-with-dependencies)
- [Graph-backed assets and multi-assets](#graph-backed-assets-and-multi-assets)
- [Accessing asset context](#asset-context)
- [Configuring assets](#asset-configuration)

### A basic software-defined asset
### A basic asset definition

The easiest way to create a software-defined asset is with the <PyObject object="asset" decorator /> decorator.
The easiest way to create an asset definition is with the <PyObject object="asset" decorator /> decorator.

```python file=/concepts/assets/basic_asset_definition.py
import json
@@ -72,7 +72,7 @@ By default, the name of the decorated function, `my_asset`, is used as the asset

### Assets with dependencies

Software-defined assets can depend on other software-defined assets. In this section, we'll show you how to define:
Asset definitions can depend on other asset definitions. In this section, we'll show you how to define:

- [Basic asset dependencies](#defining-basic-dependencies)
- [Asset dependencies across code locations](#defining-asset-dependencies-across-code-locations)
@@ -142,9 +142,9 @@ defs = Definitions(assets=[code_location_2_asset])

### Graph-backed assets and multi-assets

If you'd like to define more complex assets, Dagster offers augmented software-defined asset abstractions:
If you'd like to define more complex assets, Dagster offers augmented asset definition abstractions:

- [Multi-assets](/concepts/assets/multi-assets): A set of software-defined assets that are all updated by the same [op](/concepts/ops-jobs-graphs/ops) or [graph](/concepts/ops-jobs-graphs/graphs).
- [Multi-assets](/concepts/assets/multi-assets): A set of asset definitions that are all updated by the same [op](/concepts/ops-jobs-graphs/ops) or [graph](/concepts/ops-jobs-graphs/graphs).
- [Graph-backed assets](/concepts/assets/graph-backed-assets): An asset whose computations are separated into multiple [ops](/concepts/ops-jobs-graphs/ops) that are combined to build a [graph](/concepts/ops-jobs-graphs/graphs). If the graph outputs multiple assets, the graph-backed asset is a [multi-asset](/concepts/assets/multi-assets).

### Asset configuration
@@ -510,7 +510,7 @@ def downstream_asset(upstream_asset):

## See it in action

For more examples of software-defined assets, check out these examples:
For more examples of asset definitions, check out these examples:

- In the [Fully Featured Project example](https://github.com/dagster-io/dagster/tree/master/examples/project_fully_featured):

2 changes: 1 addition & 1 deletion docs/content/concepts/automation.mdx
@@ -17,7 +17,7 @@ In this guide, we'll cover the available automation methods Dagster provides and

Before continuing, you should be familiar with:

- [Software-defined Assets][assets]
- [Asset definitions][assets]
- [Jobs][jobs] (_optional_)
- [Ops][ops] (_optional; advanced_)

@@ -5,7 +5,7 @@ description: "Learn how to automate asset materialization using schedules and jo

# Automating assets using schedules and jobs

After creating some [Software-defined Assets](/concepts/assets/software-defined-assets), you may want to automate their materialization.
After creating some [asset definitions](/concepts/assets/software-defined-assets), you may want to automate their materialization.

In this guide, we'll show you one method of accomplishing this by using schedules and jobs. To do this for ops, refer to the [Automating ops using schedules guide](/concepts/automation/schedules/automating-ops-schedules-jobs).

@@ -24,7 +24,7 @@ To follow this guide, you'll need:

- **To install Dagster and the Dagster UI.** Refer to the [Installation guide](/getting-started/install) for more info and instructions.
- **Familiarity with**:
- [Software-defined Assets](/concepts/assets/software-defined-assets)
- [Asset definitions](/concepts/assets/software-defined-assets)
- [Jobs](/concepts/ops-jobs-graphs/jobs)
- [Code locations](/concepts/code-locations) (<PyObject object="Definitions" />)

@@ -5,7 +5,7 @@ description: "Learn how to automate op execution using schedules and jobs."

# Automating ops using schedules and jobs

In this guide, we'll walk you through running ops on a schedule. To do this for Software-defined Assets, refer to the [Automating assets using schedules guide](/concepts/automation/schedules/automating-assets-schedules-jobs).
In this guide, we'll walk you through running ops on a schedule. To do this for asset definitions, refer to the [Automating assets using schedules guide](/concepts/automation/schedules/automating-assets-schedules-jobs).

By the end of this guide, you'll be able to:

@@ -18,7 +18,7 @@ description: Job run configuration allows providing parameters to jobs at the ti

Run configuration allows providing parameters to jobs at the time they're executed.

It's often useful to provide user-chosen values to Dagster jobs or software-defined assets at runtime. For example, you might want to choose what dataset an op runs against, or provide a connection URL for a database resource. Dagster exposes this functionality through a configuration API.
It's often useful to provide user-chosen values to Dagster jobs or asset definitions at runtime. For example, you might want to choose what dataset an op runs against, or provide a connection URL for a database resource. Dagster exposes this functionality through a configuration API.

Various Dagster entities (ops, assets, resources) can be individually configured. When launching a job that executes (ops), materializes (assets), or instantiates (resources) a configurable entity, you can provide _run configuration_ for each entity. Within the function that defines the entity, you can access the passed-in configuration off of the `context`. Typically, the provided run configuration values correspond to a _configuration schema_ attached to the op/asset/resource definition. Dagster validates the run configuration against the schema and proceeds only if validation is successful.
