Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define event_monitoring_live_v1 views in view.sql files #4576

Merged
merged 18 commits into from
Dec 8, 2023
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
Show all changes
18 commits
Select commit Hold shift + click to select a range
344c091
Define `event_monitoring_live_v1` views in `view.sql` files.
sean-rose Nov 16, 2023
ffb4abe
Overwrite existing `event_monitoring_live_v1` views on deployment.
sean-rose Nov 16, 2023
a472eb3
Support materialized views in view naming validation.
sean-rose Nov 16, 2023
08da23b
Merge branch 'main' into event_monitoring_live_v1-views
sean-rose Nov 16, 2023
eb7e0c3
Update `EventMonitoringLive.no_init` logic.
sean-rose Nov 16, 2023
de673f3
Apply formatting required by Black.
sean-rose Nov 16, 2023
4372bd9
Don't try to overwrite existing `event_monitoring_live_v1` views on d…
sean-rose Nov 17, 2023
efe99b0
Handle `IF NOT EXISTS` in view naming validation.
sean-rose Nov 17, 2023
12808ce
Use regular expression to extract view ID in view naming validation.
sean-rose Nov 28, 2023
7a6651b
Update other view regular expressions to allow for materialized views.
sean-rose Nov 28, 2023
43c3944
Merge branch 'main' into event_monitoring_live_v1-views
sean-rose Nov 28, 2023
deb8007
Merge branch 'main' into event_monitoring_live_v1-views
sean-rose Nov 28, 2023
e6ee23f
Merge branch 'main' into event_monitoring_live_v1-views
sean-rose Nov 28, 2023
cd79f3b
Merge branch 'main' into event_monitoring_live_v1-views
sean-rose Nov 30, 2023
b83abad
Merge branch 'main' into event_monitoring_live_v1-views
sean-rose Nov 30, 2023
7e0774d
Merge branch 'main' into event_monitoring_live_v1-views
sean-rose Nov 30, 2023
601207f
Merge branch 'main' into event_monitoring_live_v1-views
sean-rose Nov 30, 2023
b81b482
Merge branch 'main' into event_monitoring_live_v1-views
sean-rose Dec 8, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 19 additions & 7 deletions bigquery_etl/view/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -190,16 +190,28 @@ def _valid_view_naming(self):
for t in parsed.tokens
if not (t.is_whitespace or isinstance(t, sqlparse.sql.Comment))
]
is_view_statement = (
" ".join(tokens[0].normalized.split()) == "CREATE OR REPLACE"
and tokens[1].normalized == "VIEW"
is_view_statement = " ".join(tokens[0].normalized.split()) in (
"CREATE",
"CREATE OR REPLACE",
) and (
tokens[1].normalized == "VIEW"
or " ".join(t.normalized for t in tokens[1:3]).upper()
== "MATERIALIZED VIEW"
)
if is_view_statement:
target_view = str(tokens[2]).strip().split()[0]
try:
[project_id, dataset_id, view_id] = target_view.replace("`", "").split(
"."
view_kw_index = 1 if tokens[1].normalized == "VIEW" else 2
if (
" ".join(
t.normalized for t in tokens[view_kw_index + 1 : view_kw_index + 4]
)
== "IF NOT EXISTS"
):
view_id_token = tokens[view_kw_index + 4]
else:
view_id_token = tokens[view_kw_index + 1]
target_view = str(view_id_token).replace("`", "").strip().split()[0]
try:
[project_id, dataset_id, view_id] = target_view.split(".")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This took me a minute to parse. Not sure if some comments might make sense here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was annoyed how verbose this logic ended up being. I was hoping we could just grab the first identifier token, but unfortunately sqlparse currently misclassifies the MATERIALIZED keyword as an identifier.

Actually, since this isn't a pressing issue, I think I'll wait for them to fix the bug I submitted, then this can be simplified to just look for the first identifier token.

Or an alternate approach would be using sqlparse just to remove comments from the SQL, then using regular expressions to extract the view ID.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or an alternate approach would be using sqlparse just to remove comments from the SQL, then using regular expressions to extract the view ID.

I've implemented this alternate approach.

if not (
self.name == view_id
and self.dataset == dataset_id
Expand Down
2 changes: 1 addition & 1 deletion bqetl_project.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -188,7 +188,7 @@ dry_run:
- sql/moz-fx-data-shared-prod/org_mozilla_firefox_beta_derived/experiment_events_live_v1/init.sql
- sql/moz-fx-data-shared-prod/telemetry_derived/experiment_enrollment_cumulative_population_estimate_v1/view.sql
- sql/moz-fx-data-shared-prod/telemetry/experiment_enrollment_cumulative_population_estimate/view.sql
- sql/moz-fx-data-shared-prod/**/event_monitoring_live_v1/init.sql
- sql/moz-fx-data-shared-prod/**/event_monitoring_live_v1/view.sql
- sql/moz-fx-data-shared-prod/monitoring/event_monitoring_live/view.sql
# Already exists (and lacks an "OR REPLACE" clause)
- sql/moz-fx-data-shared-prod/org_mozilla_firefox_derived/clients_first_seen_v1/init.sql
Expand Down
29 changes: 14 additions & 15 deletions sql_generators/glean_usage/event_monitoring_live.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ class EventMonitoringLive(GleanTable):

def __init__(self) -> None:
"""Initialize materialized view generation."""
self.no_init = False
self.no_init = True
self.per_app_id_enabled = True
self.per_app_enabled = False
self.across_apps_enabled = True
Expand All @@ -38,9 +38,10 @@ def __init__(self) -> None:
def generate_per_app_id(
self, project_id, baseline_table, output_dir=None, use_cloud_function=True
):
"""Generate per-app_id views."""
tables = table_names_from_baseline(baseline_table, include_project_id=False)

init_filename = f"{self.target_table_id}.init.sql"
view_filename = f"{self.target_table_id}.view.sql"
metadata_filename = f"{self.target_table_id}.metadata.yaml"

table = tables[f"{self.prefix}"]
Expand Down Expand Up @@ -68,23 +69,21 @@ def generate_per_app_id(
Artifact = namedtuple("Artifact", "table_id basename sql")
artifacts = []

if not self.no_init:
init_sql = render(
init_filename, template_folder=PATH / "templates", **render_kwargs
)
metadata = render(
metadata_filename,
template_folder=PATH / "templates",
format=False,
**render_kwargs,
)
artifacts.append(Artifact(table, "metadata.yaml", metadata))
view_sql = render(
view_filename, template_folder=PATH / "templates", **render_kwargs
)
metadata = render(
metadata_filename,
template_folder=PATH / "templates",
format=False,
**render_kwargs,
)
artifacts.append(Artifact(table, "metadata.yaml", metadata))

skip_existing_artifact = self.skip_existing(output_dir, project_id)

if output_dir:
if not self.no_init:
artifacts.append(Artifact(table, "init.sql", init_sql))
artifacts.append(Artifact(table, "view.sql", view_sql))

for artifact in artifacts:
destination = (
Expand Down