Hello,
I am currently using the spark-bigquery-connector on a Dataproc cluster for my data processing tasks. I would like to configure the cluster to publish BigQuery metrics to Google Cloud Platform (GCP) for better monitoring and analysis.
I understand that Dataproc supports custom metrics, as described in the custom metric collection guide. Specifically, I am interested in capturing the number of scanned bytes, since this figure is crucial for accountability and for tracking the cost of the service.
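For context, here is a minimal sketch of how I currently enable Dataproc's built-in Spark metric source with the Python client (the project, region, and cluster names are placeholders, and I am assuming the v1 API's `DataprocMetricConfig`); as far as I can tell, this path does not cover BigQuery scanned bytes:

```python
from google.cloud import dataproc_v1

# Assumed placeholders: project, region, and cluster name are illustrative.
metric_config = dataproc_v1.DataprocMetricConfig(
    metrics=[
        dataproc_v1.DataprocMetricConfig.Metric(
            # Collect the built-in Spark metric source; there is no
            # BigQuery-connector source here, which is why I am asking.
            metric_source=dataproc_v1.DataprocMetricConfig.MetricSource.SPARK,
        )
    ]
)

cluster = dataproc_v1.Cluster(
    project_id="my-project",
    cluster_name="my-cluster",
    config=dataproc_v1.ClusterConfig(dataproc_metric_config=metric_config),
)

client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": "us-central1-dataproc.googleapis.com:443"}
)
operation = client.create_cluster(
    project_id="my-project", region="us-central1", cluster=cluster
)
operation.result()  # wait for the cluster to come up
```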
Could you please provide guidance on how to set up a Dataproc cluster so that it publishes BigQuery metrics, particularly scanned bytes? Specifically, I am looking for details on:
- Any required configurations or properties that need to be set on the Dataproc cluster.
- How to ensure that the metrics are properly published to GCP.
- Any additional steps or best practices for setting up metric collection and monitoring for BigQuery jobs executed via Dataproc.
Thank you for your assistance!
Best regards,
Giuseppe.
Publishing BigQuery metrics is currently on our roadmap.
However, we do emit metrics corresponding to read streams in the form of logs, which can be accessed. Please use the latest connector version, 0.39.0.
You can grep the logs for `ReadStream Metrics` to extract them.
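If it helps, a sketch like the following can pull those lines with the Cloud Logging Python client; the project ID is a placeholder, and the resource filter is an assumption that may need adjusting to where your Dataproc job logs actually land:

```python
from google.cloud import logging

client = logging.Client(project="my-project")  # placeholder project ID

# Assumed filter: Dataproc driver/executor logs may land under a different
# resource type in your setup; adjust before relying on it.
filter_str = (
    'resource.type="cloud_dataproc_cluster" '
    'AND textPayload:"ReadStream Metrics"'
)

for entry in client.list_entries(filter_=filter_str):
    print(entry.timestamp, entry.payload)
```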
The read session metrics are also available in the Spark UI.
Currently, you would need to set up your own log processing if you would like to build dashboards on top of these metrics.
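As one possible starting point for that processing, the hedged sketch below turns the same grep filter into a Cloud Logging log-based counter metric that Cloud Monitoring dashboards can chart. The metric name and filter are placeholders; note that extracting the actual byte values from the log lines would instead require a distribution metric with a value extractor:

```python
from google.cloud import logging

client = logging.Client(project="my-project")  # placeholder project ID

# Hypothetical log-based metric counting read-stream metric log lines.
metric = client.metric(
    "spark_bq_readstream_metrics",
    filter_='textPayload:"ReadStream Metrics"',
    description="Read stream metric lines emitted by the spark-bigquery-connector",
)

if not metric.exists():
    metric.create()
```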