Concepts
Dagster provides a variety of abstractions for building and orchestrating data pipelines. These concepts enable a modular, declarative approach to data engineering, making it easier to manage dependencies, monitor execution, and ensure data quality.
Asset
An asset represents a logical unit of data such as a table, dataset, or machine learning model. Assets can have dependencies on other assets, forming the data lineage for your pipelines. As the core abstraction in Dagster, assets can interact with many other Dagster entities to facilitate certain tasks. When you define an asset, either with the @dg.asset decorator or via a component, the definition is automatically added to a top-level Definitions object.
To dive deeper into how assets shape the way you design your data platform, check out our deep-dive on assets.
| Concept | Relationship |
|---|---|
| asset check | asset may use an asset check |
| asset spec | asset is described by an asset spec |
| component | asset may be programmatically built by a component |
| config | asset may use a config |
| definitions | asset is added to a top-level Definitions object to be deployed |
| io manager | asset may use an io manager |
| partition | asset may use a partition |
| resource | asset may use a resource |
| job | asset may be used in a job |
| schedule | asset may be used in a schedule |
| sensor | asset may be used in a sensor |
Asset check
An asset check, defined with the @dg.asset_check decorator, is associated with an asset to ensure it meets certain expectations around data quality, freshness, or completeness. Asset checks run when the asset is executed and store metadata about the related run, including whether all the conditions of the check were met.
| Concept | Relationship |
|---|---|
| asset | asset check may be used by an asset |
| definitions | asset check is added to a top-level Definitions object to be deployed |
Asset spec
Specs are standalone objects that describe the identity and metadata of Dagster entities without defining their behavior. For example, an AssetSpec contains essential information like the asset's key (its unique identifier) and tags (labels for organizing and annotating the asset), but it doesn't include the logic for materializing that asset.
| Concept | Relationship |
|---|---|
| asset | asset spec may describe the identity and metadata of an asset |
Code location
A code location is a collection of Dagster entity definitions deployed in a specific environment. A code location determines the Python environment, including the version of Dagster being used and any other Python dependencies. A Dagster project can have multiple code locations, which helps isolate dependencies between them.
| Concept | Relationship |
|---|---|
| definitions | code location must contain at least one top-level Definitions object |