Mark table_constraints_internal() function as parallel safe and set current_db_id for parallel workers #3322

Merged

sumitj824
Contributor

@sumitj824 sumitj824 commented Dec 26, 2024

Description

The INFORMATION_SCHEMA_TSQL.TABLE_CONSTRAINTS_INTERNAL() function was recently introduced
but not marked as PARALLEL SAFE. This prevented the query optimizer from utilizing parallel
plans in queries involving this function, leading to suboptimal performance for certain operations.

By marking the function as PARALLEL SAFE, we enable the use of parallel query execution plans,
which can significantly improve performance for large datasets.
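
One way to confirm the marking on a given installation is to query the PostgreSQL catalog. This is a minimal sketch, assuming a connection through the PostgreSQL endpoint and that the function lives in the information_schema_tsql schema under the lowercase name table_constraints_internal; neither detail is spelled out in this description.

```sql
-- proparallel is 's' (safe), 'r' (restricted) or 'u' (unsafe).
SELECT n.nspname, p.proname, p.proparallel
FROM pg_catalog.pg_proc AS p
JOIN pg_catalog.pg_namespace AS n ON n.oid = p.pronamespace
WHERE n.nspname = 'information_schema_tsql'      -- assumed schema name
  AND p.proname = 'table_constraints_internal';  -- assumed function name
```

A value of 's' would indicate that the planner may consider parallel plans for queries that call the function.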

This commit also addresses the issue of empty result sets in queries joining with sys.db_id()
or sys.db_name() in Enforced Parallel Query mode by setting the current_db_id in addition to
current_db_name for parallel workers.

Signed-off-by: Sumit Jaiswal [email protected]

Issues Resolved

Task: BABEL-5427, BABEL-5504

Root cause for parallel worker issue:

We already pass the logical database name to parallel workers and use it to set current_db_name (Ref: #2262), but we did not set current_db_id. This omission affected functions such as sys.db_id() and sys.db_name(), so queries involving them returned empty result sets when executed in parallel workers.
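
The affected query shape can be sketched as follows. This is a hypothetical illustration, not one of the PR's test cases: it assumes the sys.databases catalog view with its name and database_id columns, and it only exercises a parallel worker when parallel query is enforced for the session.

```sql
-- Hypothetical example: look up the current database by filtering a catalog
-- view on sys.db_id(). Per the root cause described above, a parallel worker
-- without current_db_id set would match nothing and return an empty result.
SELECT d.name, d.database_id
FROM sys.databases AS d
WHERE d.database_id = sys.db_id();
```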

Performance Testing

Query:

SELECT 
    tc.TABLE_SCHEMA, 
    tc.TABLE_NAME, 
    kcu.COLUMN_NAME, 
    c.DATA_TYPE 
FROM 
    INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc 
JOIN 
    INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu 
    ON tc.CONSTRAINT_SCHEMA = kcu.CONSTRAINT_SCHEMA 
    AND tc.CONSTRAINT_NAME = kcu.CONSTRAINT_NAME 
JOIN 
    INFORMATION_SCHEMA.COLUMNS c 
    ON kcu.TABLE_SCHEMA = c.TABLE_SCHEMA 
    AND kcu.TABLE_NAME = c.TABLE_NAME 
    AND kcu.COLUMN_NAME = c.COLUMN_NAME 
WHERE 
    tc.CONSTRAINT_TYPE = 'PRIMARY KEY';

Before: 50187 ms
After: 47285 ms
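
For scale, the two timings above differ by 2902 ms, which is a reduction of about 5.8% relative to the original run time. The same arithmetic as a query:

```sql
-- Relative reduction in run time, in percent, from the timings reported above.
SELECT CAST(100.0 * (50187 - 47285) / 50187 AS DECIMAL(5, 2)) AS pct_reduction;
```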

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@coveralls
Collaborator

coveralls commented Dec 26, 2024

Pull Request Test Coverage Report for Build 12512252950

Details

  • 3 of 3 (100.0%) changed or added relevant lines in 1 file are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.001%) to 73.801%

Files with Coverage Reduction | New Missed Lines | %
contrib/babelfishpg_tsql/src/session.c | 1 | 97.4%

Totals (Coverage Status)
Change from base Build 12502166363: 0.001%
Covered Lines: 43169
Relevant Lines: 58494

💛 - Coveralls

@sumitj824 sumitj824 changed the title from "Mark INFORMATION_SCHEMA_TSQL.TABLE_CONSTRAINTS_INTERNAL() function as PARALLEL SAFE" to "Mark table_constraints_internal() function as parallel safe and set current_db_id for parallel workers" on Dec 27, 2024
@shardgupta shardgupta merged commit 709938a into babelfish-for-postgresql:BABEL_3_X_DEV Dec 30, 2024
43 checks passed