- Support for "generic" Airflow operators: you can now use regular Python operators as part of your config files (a hedged sketch is shown after this list).
- Support for the `dbt docs` command to generate documentation for all dbt tasks: users can now add `docs generate` as a target in their DOP configuration and additionally specify a GCS bucket with the `--bucket` and `--bucket-path` options where the generated documents are copied to (see the config sketch after this list).
- Serve dbt docs: documents generated by dbt can be served as a web page by deploying the provided app on GAE. Note that deploying is an additional step that needs to be carried out after docs have been generated. See `infrastructure/dbt-docs/README.md` for details.
- dbt task artifacts saved to BigQuery: the `run_results` JSON file created by dbt tasks contains information on completed dbt invocations and is now saved to the BigQuery table `run_results` for analysis and debugging.
- Add support for Airflow `v1.10.14` and `v1.10.15` local environments: users can specify which version they want to use by setting the `AIRFLOW_VERSION` environment variable.
- Pre-commit linters: added pre-commit hooks to keep Python and YAML files, and to some extent plain-text files, consistent in formatting and style throughout the DOP codebase (an illustrative config is sketched after this list).
- Ensure DAGs using the same dbt project do not run concurrently: a safety feature that allows selective execution of workflows by calling specific commands or tags (e.g. `dbt run --m`) within a single dbt project. Since such workflows share the same target location (within the dbt container), blocking concurrent runs stops them from overriding each other's artifacts without having to make the workflows inter-dependent.
- Test time-partitioning: time-partitioning of the datetime type is now properly validated as part of schema validation (a hypothetical config fragment is sketched after this list).
- Use Python 3.7 and dbt 0.19.1 in the Composer K8s Operator.
- Add Dataflow example task: with the introduction of "regular" Airflow operators in the YAML config, it is now possible to run compute-intensive Dataflow jobs. Check `example_dataflow_template` for an example of how to implement a Dataflow pipeline (a hedged sketch also appears after this list).
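
As a rough illustration of the "generic" operator support, the fragment below shows the general idea of declaring a regular Airflow operator as a task in a DOP YAML config. The key names (`identifier`, `kind`, `operator`, `arguments`) and all values are illustrative assumptions, not the exact DOP schema.

```yaml
# Hypothetical DOP config fragment -- key names are illustrative assumptions,
# not the actual DOP schema. Only the idea of embedding a regular Airflow
# operator in the YAML config comes from the release note above.
tasks:
  - identifier: say_hello
    kind: airflow_operator                                  # assumed marker for a "generic" operator task
    operator: airflow.operators.bash_operator.BashOperator  # Airflow 1.10.x import path
    arguments:
      bash_command: "echo 'hello from a regular Airflow operator'"
```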
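
A sketch of adding `docs generate` as a target together with the GCS options mentioned above could look like the following; only `docs generate`, `--bucket` and `--bucket-path` come from the release note, while the surrounding keys and values are assumptions.

```yaml
# Hypothetical DOP config fragment -- structure and values are assumptions.
tasks:
  - identifier: dbt_docs
    kind: dbt
    target: docs generate
    options:
      bucket: my-dbt-docs-bucket    # GCS bucket the generated docs are copied to (--bucket)
      bucket_path: dop/dbt-docs     # path inside the bucket (--bucket-path)
```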
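
For the pre-commit hooks, a minimal `.pre-commit-config.yaml` along these lines shows the kind of Python, YAML and plain-text checks described; the hooks actually configured in DOP may differ.

```yaml
# Illustrative .pre-commit-config.yaml -- DOP's actual hook selection may differ.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.0.1
    hooks:
      - id: check-yaml            # YAML files must parse
      - id: end-of-file-fixer     # plain-text consistency
      - id: trailing-whitespace   # plain-text consistency
  - repo: https://github.com/psf/black
    rev: 22.3.0
    hooks:
      - id: black                 # Python formatting
```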
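
The time-partitioning validation applies to configuration like the hypothetical fragment below; every key name here is an assumption used purely to illustrate where a datetime partitioning field would be declared.

```yaml
# Hypothetical table config fragment -- key names are assumptions for illustration only.
tasks:
  - identifier: events_partitioned
    kind: materialization
    partitioning:
      field: event_timestamp
      data_type: datetime        # datetime partitioning is now checked during schema validation
```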
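
Finally, a sketch of how a Dataflow job could be wired in as a regular Airflow operator task; the `DataflowTemplateOperator` import path is real for Airflow 1.10.x, while the surrounding keys, template and parameter values are illustrative assumptions (see `example_dataflow_template` in the repo for the actual example).

```yaml
# Hypothetical DOP config fragment -- the operator class exists in Airflow 1.10.x,
# but keys, template and parameter values are illustrative assumptions.
tasks:
  - identifier: example_dataflow_template
    kind: airflow_operator
    operator: airflow.contrib.operators.dataflow_operator.DataflowTemplateOperator
    arguments:
      template: gs://dataflow-templates/latest/GCS_Text_to_BigQuery  # a Google-provided template
      parameters:
        inputFilePattern: gs://my-bucket/input/*.json
        outputTable: my-project:my_dataset.my_table
        # other template-specific parameters omitted
      dataflow_default_options:
        project: my-project
        region: europe-west1
        tempLocation: gs://my-bucket/tmp
```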