Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically run ANALYZE after babelfish extension create/upgrade #1922

Conversation

sumitj824
Copy link
Contributor

@sumitj824 sumitj824 commented Oct 16, 2023

Description

Earlier, when having less than 50 schemas on the fresh instance causes the poor performance for problematic query when having large number of objects in database. In case of less than 50 schemas, sys.babelfish_namespace_ext catalog which stores the name of schema will have less than 50 rows which means autovacuum ANALYZE has never been run by PostgreSQL autovacuum daemon as the number of inserted rows doesn't exceeds the threshold criteria. Since the sys.babelfish_namespace_ext catalog has never been analyzed it leads bad query planning for large query. This commit add changes to automatically run ANALYZE after babelfish extension create/upgrade for all babelfish catalogs to avoid such issues in future.

Signed-off-by: Sumit Jaiswal [email protected]

Issues Resolved

Task: BABEL-4487

Test Scenarios Covered

Performance tests

Execution time for Problematic query:

select count(*) from sys.tables AS tbl 
inner join sys.all_columns as clmns on clmns.object_id = tbl.object_id 
inner join sys.types as usrt on usrt.user_type_id = clmns.user_type_id;

Before ANALYZE: 29 min query plan
After ANALYZE: 2 sec query plan

USE CASE BASED

Manually tested with creating table for which analyze will fail

jdbc_testdb=# call sys.analyze_babelfish_catalogs();
WARNING:  ANALYZE for babelfish catalog sys.test_table failed with error: data out of range for datetimes
CALL

Logs:

WARNING:  ANALYZE for babelfish catalog sys.test_table failed with error: data out of range for datetimes
CONTEXT:  PL/pgSQL function sys.analyze_babelfish_catalogs() line 17 at RAISE

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@sumitj824 sumitj824 changed the title Run analyze after creating the extension Automatically run ANALYZE after babelfish extension create/upgrade Oct 16, 2023
Sumit Jaiswal added 5 commits October 16, 2023 08:07
Deepesh125
Deepesh125 previously approved these changes Oct 23, 2023
forestkeeper
forestkeeper previously approved these changes Oct 23, 2023
EXECUTE format('ANALYZE %I.%I', schema_name, babelfish_catalog.name);
EXCEPTION WHEN OTHERS THEN
GET STACKED DIAGNOSTICS error_msg = MESSAGE_TEXT;
RAISE WARNING 'ANALYZE for babelfish catalog %.% failed with error: %s', schema_name, babelfish_catalog.name, error_msg;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we have this warning in the upgrade, what's our action item ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning will help us to debug any performance issue after upgrade due the failure. I will add the same in Jira also.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how will it lead to action on customer or our side? A warning will go to upgrade scripts and might be lost if no one takes a look.

Copy link
Contributor Author

@sumitj824 sumitj824 Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, warning will not lead to any action but it will be helpful for any performance issue observed just after the upgrade and logs are present.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us track this item via a Jira. Talk to Sharu about this. We should consider how we track successful MVUs and corresponding logs in long term to resolve issues down the line.

KushaalShroff
KushaalShroff previously approved these changes Oct 24, 2023
EXECUTE format('ANALYZE %I.%I', schema_name, babelfish_catalog.name);
EXCEPTION WHEN OTHERS THEN
GET STACKED DIAGNOSTICS error_msg = MESSAGE_TEXT;
RAISE WARNING 'ANALYZE for babelfish catalog %.% failed with error: %s', schema_name, babelfish_catalog.name, error_msg;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how will it lead to action on customer or our side? A warning will go to upgrade scripts and might be lost if no one takes a look.

Sumit Jaiswal added 3 commits October 24, 2023 11:55
… during source

latest to target latest

Signed-off-by: Sumit Jaiswal <[email protected]>
Signed-off-by: Sumit Jaiswal <[email protected]>
Signed-off-by: Sumit Jaiswal <[email protected]>
Deepesh125
Deepesh125 previously approved these changes Oct 25, 2023
Signed-off-by: Sumit Jaiswal <[email protected]>
Signed-off-by: Sumit Jaiswal <[email protected]>
@KushaalShroff KushaalShroff merged commit 354feb0 into babelfish-for-postgresql:BABEL_3_X_DEV Oct 25, 2023
27 checks passed
@KushaalShroff KushaalShroff deleted the sumiji-dev-analyze branch October 25, 2023 16:38
sumitj824 added a commit to amazon-aurora/babelfish_extensions that referenced this pull request Oct 26, 2023
…abelfish-for-postgresql#1922)

Earlier, when having less than 50 schemas on the fresh instance causes the poor performance for problematic query when having large number of objects in database. In case of less than 50 schemas, sys.babelfish_namespace_ext catalog which stores the name of schema will have less than 50 rows which means autovacuum ANALYZE has never been run by PostgreSQL autovacuum daemon as the number of inserted rows doesn't exceeds the threshold criteria. Since the sys.babelfish_namespace_ext catalog has never been analyzed it leads bad query planning for large query. This commit add changes to automatically run ANALYZE after babelfish extension create/upgrade for all babelfish catalogs to avoid such issues in future.

Issues Resolved
Task: BABEL-4487

Signed-off-by: Sumit Jaiswal [email protected]
Deepesh125 pushed a commit that referenced this pull request Oct 26, 2023
…1922) (#1958)

* Automatically run ANALYZE after babelfish extension create/upgrade (#1922)

Earlier, when having less than 50 schemas on the fresh instance causes the poor performance for problematic query when having large number of objects in database. In case of less than 50 schemas, sys.babelfish_namespace_ext catalog which stores the name of schema will have less than 50 rows which means autovacuum ANALYZE has never been run by PostgreSQL autovacuum daemon as the number of inserted rows doesn't exceeds the threshold criteria. Since the sys.babelfish_namespace_ext catalog has never been analyzed it leads bad query planning for large query. This commit add changes to automatically run ANALYZE after babelfish extension create/upgrade for all babelfish catalogs to avoid such issues in future.

Issues Resolved
Task: BABEL-4487

Signed-off-by: Sumit Jaiswal <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants