Integrating BigQuery + dbt with Dagster Cloud Insights (Experimental)
This feature is considered experimental.
BigQuery costs can be integrated into the Dagster Insights UI. The dagster-cloud package contains utilities for capturing and submitting BigQuery cost metrics about data operations to Dagster Cloud.
To use this integration, you need BigQuery credentials that have access to the INFORMATION_SCHEMA.JOBS table (e.g. the BigQuery Resource Viewer role). These must be the credentials used by your dbt profile. For more information on granting access to this table, see the BigQuery documentation.
First, instrument the Dagster @dbt_assets function with dbt_with_bigquery_insights:
from dagster import AssetExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets
from dagster_cloud.dagster_insights import dbt_with_bigquery_insights


@dbt_assets(...)
def my_asset(context: AssetExecutionContext, dbt: DbtCliResource):
    # Typically you have a `yield from dbt.cli(...)`.
    # Wrap the original call with `dbt_with_bigquery_insights` as below.
    dbt_cli_invocation = dbt.cli(["build"], context=context)
    yield from dbt_with_bigquery_insights(context, dbt_cli_invocation)
This passes through all underlying events and emits additional AssetObservations with BigQuery cost metrics. These metrics are obtained by querying the underlying INFORMATION_SCHEMA.JOBS table, using the BigQuery client from the dbt adapter.
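To make the idea of deriving cost metrics from INFORMATION_SCHEMA.JOBS more concrete, here is a minimal sketch. It is not the dagster-cloud implementation: the function names, the `region-us` default, the lookback window, and the `@marker` query parameter are all assumptions made for illustration. Only the view name and the `job_id`, `query`, `job_type`, `creation_time`, `total_bytes_billed`, and `total_slot_ms` columns come from BigQuery itself.

```python
from dataclasses import dataclass
from typing import Iterable


def build_jobs_cost_query(region: str = "region-us", lookback_hours: int = 24) -> str:
    """Build SQL that lists cost columns for recent query jobs.

    Hypothetical helper: `@marker` is a named query parameter that the
    caller binds to a substring expected in the SQL text of the jobs.
    """
    return (
        "SELECT job_id, query, total_bytes_billed, total_slot_ms\n"
        f"FROM `{region}`.INFORMATION_SCHEMA.JOBS\n"
        "WHERE creation_time >= TIMESTAMP_SUB("
        f"CURRENT_TIMESTAMP(), INTERVAL {int(lookback_hours)} HOUR)\n"
        "  AND job_type = 'QUERY'\n"
        "  AND query LIKE CONCAT('%', @marker, '%')"
    )


@dataclass
class JobCost:
    bytes_billed: int = 0
    slot_ms: int = 0


def total_cost(rows: Iterable[dict]) -> JobCost:
    """Sum cost columns over result rows, treating missing values as zero."""
    total = JobCost()
    for row in rows:
        # Handled defensively: a row may carry None for either column.
        total.bytes_billed += row.get("total_bytes_billed") or 0
        total.slot_ms += row.get("total_slot_ms") or 0
    return total
```

The SQL string can be run with any BigQuery client that supports named parameters; `total_cost` only needs rows as dictionaries, so it can be tried without a BigQuery connection.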
Alternatively, if you run dbt from an op, instrument the op function with dbt_with_bigquery_insights:
from dagster import OpExecutionContext, job, op
from dagster_dbt import DbtCliResource
from dagster_cloud.dagster_insights import dbt_with_bigquery_insights


@op(out={})
def my_dbt_op(context: OpExecutionContext, dbt: DbtCliResource):
    # Typically you have a `yield from dbt.cli(...)`.
    # Wrap the original call with `dbt_with_bigquery_insights` as below.
    # `dbt_manifest_path` is the path to your dbt manifest, defined elsewhere.
    dbt_cli_invocation = dbt.cli(
        ["build"], context=context, manifest=dbt_manifest_path
    )
    yield from dbt_with_bigquery_insights(context, dbt_cli_invocation)


@job
def my_dbt_job():
    ...
    my_dbt_op()
    ...
As in the asset case, this passes through all underlying events and emits additional AssetObservations with BigQuery cost metrics, obtained by querying the INFORMATION_SCHEMA.JOBS table with the BigQuery client from the dbt adapter.
Finally, configure a query comment in your dbt project. This adds a comment, containing the dbt invocation ID and unique ID, to every query recorded in BigQuery's INFORMATION_SCHEMA.JOBS table. Using this data, Insights attributes cost metrics in BigQuery to the corresponding Dagster jobs and assets.
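The attribution step can be pictured as parsing the comment out of each recorded query and grouping costs by the IDs it carries. The sketch below assumes a made-up marker format, `example_marker[[[<unique_id>:<invocation_id>]]]`, and a hypothetical `attribute_bytes_billed` helper; the format Insights actually expects is defined by the Dagster documentation, not by this example.

```python
import re
from collections import defaultdict

# Hypothetical marker format, chosen for this sketch only.
MARKER = re.compile(
    r"example_marker\[\[\[(?P<unique_id>[^:\]]+):(?P<invocation_id>[^\]]+)\]\]\]"
)


def attribute_bytes_billed(jobs):
    """Group billed bytes by (invocation_id, unique_id) found in query comments.

    `jobs` is an iterable of dicts with `query` and `total_bytes_billed` keys,
    shaped like rows read from INFORMATION_SCHEMA.JOBS.
    """
    totals = defaultdict(int)
    for job in jobs:
        match = MARKER.search(job["query"])
        if match is None:
            # Queries without the marker cannot be attributed; skip them.
            continue
        key = (match["invocation_id"], match["unique_id"])
        totals[key] += job["total_bytes_billed"] or 0
    return dict(totals)
```

Queries that lack the comment are skipped rather than guessed at, which is why the comment has to be attached to every query dbt issues.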