Cache sysadmin oid #1942

Merged

Conversation

tanscorpio7

@tanscorpio7 tanscorpio7 commented Oct 21, 2023

Description

Continuation of the original PR #1899

We cache the sysadmin role OID and expose it through a hook so that the engine code changes can use it.

Engine PR: babelfish-for-postgresql/postgresql_modified_for_babelfish#234

Extension PR: #1899

Extension PR: (cache sysadmin oid) #1942

Signed-off-by: Tanzeel Khan [email protected]

Issues Resolved

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Tanzeel Khan <[email protected]>
Oid get_sysadmin_oid(void)
{
    if (!OidIsValid(sysadmin_oid))
        sysadmin_oid = get_role_oid("sysadmin", true);

One question: will sysadmin change during runtime?

Once created, the OID will not change.
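
For readers following the exchange above, here is a minimal, self-contained sketch of the lazy-caching pattern under discussion. It is not the extension's code: the `Oid` typedef and `OidIsValid` macro are local stand-ins for the PostgreSQL definitions, `mock_get_role_oid` is a hypothetical replacement for the real catalog lookup, and the OID value 16384 is made up. It relies on the assumption stated in the reply, namely that the OID does not change once the role exists.

```c
#include <assert.h>
#include <string.h>

typedef unsigned int Oid;                 /* stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)
#define OidIsValid(oid) ((oid) != InvalidOid)

static int lookup_calls = 0;              /* counts simulated catalog lookups */

/* Hypothetical stand-in for get_role_oid(): pretends to consult the catalog. */
static Oid
mock_get_role_oid(const char *rolname)
{
    assert(rolname != NULL);
    lookup_calls++;
    return strcmp(rolname, "sysadmin") == 0 ? (Oid) 16384 : InvalidOid;
}

static Oid cached_sysadmin_oid = InvalidOid;

/* Look the OID up on first use, then serve it from the static variable. */
static Oid
get_sysadmin_oid_cached(void)
{
    if (!OidIsValid(cached_sysadmin_oid))
        cached_sysadmin_oid = mock_get_role_oid("sysadmin");
    return cached_sysadmin_oid;
}
```

Calling `get_sysadmin_oid_cached()` repeatedly performs the lookup only once in this sketch. If the role did not exist at the first call, the cache stays invalid and the lookup is retried on the next call.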

Deepesh125 pushed a commit to babelfish-for-postgresql/postgresql_modified_for_babelfish that referenced this pull request Oct 24, 2023
In multi-DB mode, the time to create a new database grows with the number of databases that already exist.

This is because we create three internal roles for each new DB, and the DB subcommands that run internally make
multiple calls to roles_is_member_of("sysadmin"). The list this returns contains all three roles of every DB created
so far. This is the major reason for the performance degradation of the CREATE DB command.
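
As a rough illustration of why this adds up, the sketch below does the arithmetic under stated assumptions: exactly three internal roles per database plus sysadmin itself in the expanded list, and a fixed number of expansions per CREATE. Both function names and the `calls_per_create` parameter are hypothetical and not taken from the code base.

```c
#include <assert.h>

/* Entries in sysadmin's expanded membership list when n_dbs databases exist,
 * assuming three internal roles per database plus sysadmin itself. */
static unsigned long
membership_list_len(unsigned long n_dbs)
{
    return 1UL + 3UL * n_dbs;
}

/* Total list entries visited while creating databases 1..n_dbs, assuming
 * calls_per_create expansions per CREATE (a hypothetical parameter). */
static unsigned long
total_entries_visited(unsigned long n_dbs, unsigned long calls_per_create)
{
    unsigned long total = 0;

    assert(calls_per_create > 0);
    for (unsigned long existing = 0; existing < n_dbs; existing++)
        total += calls_per_create * membership_list_len(existing);
    return total;
}
```

Under these assumptions the per-CREATE cost grows linearly with the number of existing databases, so the cumulative cost of creating many databases grows roughly quadratically.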

We fix this in three different places.

getAvailDbid - this function calls nextval, which by default checks the current user's permissions and therefore
calls roles_is_member_of. Instead we can call nextval_internal, which is the same function with an additional
check-permissions flag that we set to false. As a safeguard, we can ensure that the current user is "sysadmin"
when getAvailDbid is called. (Currently we only call it when the user is sysadmin.)

Set temporary user when creating schema - when we create the dbo and guest schemas for the new database, the
create schema function fetches all the roles that the current role is a member of (recursively) to check whether the
current role can actually become the target schema owner role. To bypass this we can assume the newdb_dbo role when
creating these schemas. In that case the roles that newdb_dbo is a member of are fetched instead, but this list is
much smaller than sysadmin's.

Select best grantor - select best grantor first fetches the roles_list that sysadmin is a member of and then starts
checking for permissions. But sysadmin is always the first role checked; that is, sysadmin is always at the top of the
roles_list. We can add a quick check: first test whether the current role is sysadmin and whether it can grant all the
permissions needed. If yes, simply return. Note: this does not change any behaviour, since the same check would be done
anyway in the first loop iteration after fetching roles_list. We are instead running the first iteration before
fetching the whole list.
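
The third fix can be sketched as follows. This is a simplified model, not the engine's select_best_grantor: `role_has_privs`, `fetch_roles_list`, `select_grantor`, the privilege bitmasks, and all OID values are hypothetical. It only shows that testing the current role before expanding the list yields the same answer as the first loop iteration would, while skipping the expensive fetch.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;                 /* stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)
#define SYSADMIN_OID ((Oid) 16384)        /* made-up OID for illustration */

static int list_fetches = 0;              /* counts expensive list expansions */

/* Hypothetical check: does this role hold every requested privilege bit? */
static int
role_has_privs(Oid role, unsigned int needed)
{
    unsigned int held = (role == SYSADMIN_OID) ? 0xFu : 0x1u;

    return (held & needed) == needed;
}

/* Hypothetical stand-in for the membership expansion; the current role comes
 * first, followed by filler roles up to the buffer capacity. */
static size_t
fetch_roles_list(Oid current, Oid *out, size_t cap)
{
    size_t n = 0;

    assert(out != NULL);
    list_fetches++;
    if (n < cap)
        out[n++] = current;
    for (Oid r = 20000; n < cap; r++)
        out[n++] = r;
    return n;
}

static Oid
select_grantor(Oid current, unsigned int needed)
{
    Oid     roles[64];
    size_t  n;

    /* Fast path: run the first loop iteration before fetching the list. */
    if (current == SYSADMIN_OID && role_has_privs(current, needed))
        return current;

    n = fetch_roles_list(current, roles, 64);
    for (size_t i = 0; i < n; i++)
        if (role_has_privs(roles[i], needed))
            return roles[i];
    return InvalidOid;
}
```

In this model, a call made as sysadmin returns without touching the list at all, while any other role still goes through the full expansion exactly as before.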

Engine PR: #234
Extension PR: babelfish-for-postgresql/babelfish_extensions#1899
Extension PR: (cache sysadmin oid) babelfish-for-postgresql/babelfish_extensions#1942

Task: BABEL-4438
Signed-off-by: Tanzeel Khan <[email protected]>
@Deepesh125 Deepesh125 merged commit 8739a1e into babelfish-for-postgresql:BABEL_3_X_DEV Oct 24, 2023
27 checks passed
Sairakan pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Nov 16, 2023
priyansx pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Nov 22, 2023
@tanscorpio7 tanscorpio7 deleted the BABEL_4438 branch December 12, 2023 08:29