Create DB Performance improvements #1899
Merged
forestkeeper
merged 5 commits into
babelfish-for-postgresql:BABEL_3_X_DEV
from
tanscorpio7:BABEL_4438
Oct 12, 2023
Signed-off-by: Tanzeel Khan <[email protected]>
tanscorpio7
requested review from
Deepesh125,
HarshLunagariya and
shalinilohia50
October 9, 2023 14:41
Could you please mention how much improvement we have observed after these changes?
tanscorpio7
changed the title
BABEL_4438 Create DB Performance improvements
Create DB Performance improvements
Oct 9, 2023
Signed-off-by: Tanzeel Khan <[email protected]>
Deepesh125
reviewed
Oct 10, 2023
Signed-off-by: Tanzeel Khan <[email protected]>
Signed-off-by: Tanzeel Khan <[email protected]>
HarshLunagariya
suggested changes
Oct 11, 2023
shalinilohia50
suggested changes
Oct 11, 2023
Signed-off-by: Tanzeel Khan <[email protected]>
shalinilohia50
approved these changes
Oct 11, 2023
forestkeeper
approved these changes
Oct 12, 2023
ParikshitSarode
pushed a commit
to amazon-aurora/babelfish_extensions
that referenced
this pull request
Oct 17, 2023
In Multi DB mode, as the number of databases increases, so does the time to create the next new DB.
This is because we create three internal roles for each new DB, and when we run the DB subcommands internally, multiple calls to roles_is_member_of("sysadmin") are made. The output of that call now contains all three roles of every DB created. This is the major reason for the performance degradation of the CREATE DB command.
We fix this in four different places.
1. getAvailDbid - this function calls the nextval function, which by default checks the current user's permissions and makes a call to roles_is_member_of. Instead we can call nextval_internal, which is the same function with an additional check-permissions flag that we set to false. To double check, we can just ensure that the current user is "sysadmin" when getAvailDbid is called. (Currently we only call this when the user is sysadmin.)
2. Set temporary user when creating schema - when we create the dbo and guest schemas for the new database, the create schema function fetches all the roles that the current role is a member of (recursively) to check whether the current role can actually become the target schema owner role. To bypass this we can assume the newdb_dbo role when creating these schemas. In this case all the roles that newdb_dbo is a member of will be fetched, but this list is much smaller than the one for sysadmin.
3. Select best grantor - select best grantor first fetches the roles_list that sysadmin is a member of and then starts checking for permissions. But sysadmin is always the first to be checked; that is, sysadmin is always at the top of the roles_list. We can add a quick check for this: first check whether the current role is sysadmin and whether it can give us all the permissions needed. If yes, simply return. Note: this does not change any behaviour, since this would be done anyway in the first loop iteration after fetching roles_list. We are instead running the first iteration before fetching the whole list.
4. Set the newdb_dbo user when creating the sysdatabases view in the new DB.
Task: BABEL-3869
Signed-off-by: Tanzeel Khan <[email protected]>
Deepesh125
pushed a commit
to babelfish-for-postgresql/postgresql_modified_for_babelfish
that referenced
this pull request
Oct 24, 2023
In Multi DB mode, as the number of databases increases, so does the time to create the next new DB.
This is because we create three internal roles for each new DB, and when we run the DB subcommands internally, multiple calls to roles_is_member_of("sysadmin") are made. The output of that call now contains all three roles of every DB created. This is the major reason for the performance degradation of the CREATE DB command.
We fix this in three different places.
getAvailDbid - this function calls the nextval function, which by default checks the current user's permissions and makes a call to roles_is_member_of. Instead we can call nextval_internal, which is the same function with an additional check-permissions flag that we set to false. To double check, we can just ensure that the current user is "sysadmin" when getAvailDbid is called. (Currently we only call this when the user is sysadmin.)
Set temporary user when creating schema - when we create the dbo and guest schemas for the new database, the create schema function fetches all the roles that the current role is a member of (recursively) to check whether the current role can actually become the target schema owner role. To bypass this we can assume the newdb_dbo role when creating these schemas. In this case all the roles that newdb_dbo is a member of will be fetched, but this list is much smaller than the one for sysadmin.
Select best grantor - select best grantor first fetches the roles_list that sysadmin is a member of and then starts checking for permissions. But sysadmin is always the first to be checked; that is, sysadmin is always at the top of the roles_list. We can add a quick check for this: first check whether the current role is sysadmin and whether it can give us all the permissions needed. If yes, simply return. Note: this does not change any behaviour, since this would be done anyway in the first loop iteration after fetching roles_list. We are instead running the first iteration before fetching the whole list.
Engine PR: #234
Extension PR: babelfish-for-postgresql/babelfish_extensions#1899
Extension PR (cache sysadmin oid): babelfish-for-postgresql/babelfish_extensions#1942
Task: BABEL-4438
Signed-off-by: Tanzeel Khan <[email protected]>
Deepesh125
pushed a commit
that referenced
this pull request
Oct 24, 2023
This is a continuation of the original PR #1899. We cache the sysadmin role oid and expose it through a hook for the engine code changes.
Engine PR: babelfish-for-postgresql/postgresql_modified_for_babelfish#234
Extension PR: #1899
Extension PR (cache sysadmin oid): #1942
Signed-off-by: Tanzeel Khan <[email protected]>
ahmed-shameem
pushed a commit
to amazon-aurora/babelfish_extensions
that referenced
this pull request
Oct 25, 2023
Sairakan
pushed a commit
to amazon-aurora/postgresql_modified_for_babelfish
that referenced
this pull request
Nov 16, 2023
Engine PR: babelfish-for-postgresql#234
Extension PR: babelfish-for-postgresql/babelfish_extensions#1899
Extension PR (cache sysadmin oid): babelfish-for-postgresql/babelfish_extensions#1942
Task: BABEL-4438
Signed-off-by: Tanzeel Khan <[email protected]>
priyansx
pushed a commit
to amazon-aurora/postgresql_modified_for_babelfish
that referenced
this pull request
Nov 22, 2023
Description
In Multi DB mode, as the number of databases increases, so does the time to create the next new DB.
This is because we create three internal roles for each new DB, and when we run the DB subcommands internally, multiple calls to roles_is_member_of("sysadmin") are made. The output of that call now contains all three roles of every DB created. This is the major reason for the performance degradation of the CREATE DB command.
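The growth described above can be illustrated with a small Python model. This is only a sketch, not the engine code: the role suffixes (`db_owner`, `dbo`, `guest`), the `member_of` dictionary, and the `create_db` helper are assumptions made for illustration, and `roles_is_member_of` here is a simplified stand-in for the PostgreSQL function of the same name.

```python
from collections import deque


def roles_is_member_of(role, member_of):
    """Return `role` followed by every role reachable through membership edges."""
    seen = [role]
    queue = deque([role])
    while queue:
        current = queue.popleft()
        for parent in member_of.get(current, ()):
            if parent not in seen:
                seen.append(parent)
                queue.append(parent)
    return seen


def create_db(name, member_of):
    # Hypothetical model: each new DB adds three internal roles, and
    # sysadmin becomes a member of each of them.
    for suffix in ("db_owner", "dbo", "guest"):
        member_of.setdefault("sysadmin", []).append(f"{name}_{suffix}")


member_of = {}
sizes = []
for i in range(1, 4):
    create_db(f"db{i}", member_of)
    # Size of the list that every permission check has to build.
    sizes.append(len(roles_is_member_of("sysadmin", member_of)))
```

In this model the list grows by three entries per database, so every call made during CREATE DB gets slower as more databases exist.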
We fix this in four different places.
getAvailDbid - this function calls the nextval function, which by default checks the current user's permissions and makes a call to roles_is_member_of. Instead we can call nextval_internal, which is the same function with an additional check-permissions flag that we set to false. To double check, we can just ensure that the current user is "sysadmin" when getAvailDbid is called. (Currently we only call this when the user is sysadmin.)
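A rough Python model of this first fix, assuming a toy `Sequence` class and a `get_avail_dbid` helper that are invented here for illustration (the real change is in C and calls nextval_internal with the permission check disabled):

```python
class Sequence:
    """Toy sequence; `permission_checks` counts simulated membership walks."""

    def __init__(self):
        self.value = 0
        self.permission_checks = 0

    def nextval_internal(self, check_permissions):
        if check_permissions:
            # Stands in for the expensive roles_is_member_of walk.
            self.permission_checks += 1
        self.value += 1
        return self.value

    def nextval(self):
        # The default entry point always checks permissions.
        return self.nextval_internal(check_permissions=True)


def get_avail_dbid(seq, current_user):
    # Cheap guard replacing the full permission check: only sysadmin may call this.
    if current_user != "sysadmin":
        raise PermissionError("get_avail_dbid requires sysadmin")
    return seq.nextval_internal(check_permissions=False)


seq = Sequence()
first_id = get_avail_dbid(seq, "sysadmin")
checks_after_fast_path = seq.permission_checks
second_id = seq.nextval()
checks_after_default_path = seq.permission_checks
```

The explicit sysadmin comparison keeps the safety property while avoiding the membership walk that the default path triggers.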
Set temporary user when creating schema - when we create the dbo and guest schemas for the new database, the create schema function fetches all the roles that the current role is a member of (recursively) to check whether the current role can actually become the target schema owner role. To bypass this we can assume the newdb_dbo role when creating these schemas. In this case all the roles that newdb_dbo is a member of will be fetched, but this list is much smaller than the one for sysadmin.
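The idea of temporarily assuming the newdb_dbo role can be sketched in Python with a context manager. Everything here is a hypothetical model: `Session`, `assume_role`, `create_schema`, and the assumption that newdb_dbo is itself a member of no other role are illustrative choices, not the extension's actual code.

```python
from contextlib import contextmanager


class Session:
    def __init__(self, user):
        self.current_user = user


@contextmanager
def assume_role(session, role):
    """Switch the session user and always restore the previous one."""
    previous = session.current_user
    session.current_user = role
    try:
        yield session
    finally:
        session.current_user = previous


def roles_reachable(role, member_of):
    seen, stack = [role], [role]
    while stack:
        for parent in member_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen


def create_schema(session, owner, member_of):
    """Return how many roles were walked to validate the schema owner."""
    reachable = roles_reachable(session.current_user, member_of)
    if owner not in reachable:
        raise PermissionError(f"{session.current_user} cannot become {owner}")
    return len(reachable)


# 50 existing databases plus the new one, three roles each (assumed names).
names = [f"db{i}" for i in range(50)] + ["newdb"]
member_of = {"sysadmin": [f"{n}_{s}" for n in names for s in ("db_owner", "dbo", "guest")]}

session = Session("sysadmin")
walked_as_sysadmin = create_schema(session, "newdb_dbo", member_of)
with assume_role(session, "newdb_dbo"):
    walked_as_dbo = create_schema(session, "newdb_dbo", member_of)
```

The `finally` clause matters: the original user has to be restored even if schema creation fails partway through.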
Select best grantor - select best grantor first fetches the roles_list that sysadmin is a member of and then starts checking for permissions. But sysadmin is always the first to be checked; that is, sysadmin is always at the top of the roles_list.
We can add a quick check for this: first check whether the current role is sysadmin and whether it can give us all the permissions needed. If yes, simply return. Note: this does not change any behaviour, since this would be done anyway in the first loop iteration after fetching roles_list. We are instead running the first iteration before fetching the whole list.
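The fast path can be sketched as follows. This is a simplified Python model under stated assumptions: the `privileges` dictionary, the `fetch_roles_list` callback, and returning `None` when no role qualifies are all invented for illustration and do not mirror the full logic of the real select best grantor routine.

```python
def select_best_grantor(role, needed, privileges, fetch_roles_list):
    """Return the first role holding every privilege in `needed`, or None."""
    # Fast path: sysadmin heads the roles list, so test it before
    # paying for the full membership fetch.
    if role == "sysadmin" and needed <= privileges.get(role, set()):
        return role
    # Slow path: unchanged behaviour, walk the whole list in order.
    for candidate in fetch_roles_list(role):
        if needed <= privileges.get(candidate, set()):
            return candidate
    return None


fetch_calls = []


def fetch_roles_list(role):
    # Records each expensive fetch so the sketch can show when it is skipped.
    fetch_calls.append(role)
    return [role, "db1_db_owner", "db1_dbo", "db1_guest"]


privileges = {"sysadmin": {"CREATE", "USAGE"}, "db1_dbo": {"SELECT"}}

fast = select_best_grantor("sysadmin", {"CREATE"}, privileges, fetch_roles_list)
calls_after_fast = len(fetch_calls)
slow = select_best_grantor("sysadmin", {"SELECT"}, privileges, fetch_roles_list)
```

Because the slow path would test the same role first anyway, the result is identical in both cases; only the cost of building the list is avoided when sysadmin already qualifies.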
Set the newdb_dbo user when creating the sysdatabases view in the new DB.
Engine PR: babelfish-for-postgresql/postgresql_modified_for_babelfish#234
Extension PR: #1899
Issues Resolved
BABEL-3869
Signed-off-by: Tanzeel Khan [email protected]
Check List
By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.