Fix the nvarchar-varbinary casting (#3072) #3358

Open
wants to merge 13 commits into
base: BABEL_5_X_DEV

pranavJ23 (Contributor) commented Jan 6, 2025

Description

Cherry-pick from: #3072 (reference)
PROBLEM: When casting nvarchar to varbinary, Babelfish treated the input as UTF-8 encoded, whereas T-SQL uses UTF-16 encoding for nvarchar irrespective of the input encoding.

RCA: varchar and nvarchar were treated as the same type, whereas varchar should use the input encoding and nvarchar should use UTF-16 encoding.

FIX: Detect when the input is nvarchar and, in that case, encode it as UTF-16.
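
To make the mismatch concrete, here is a minimal Python sketch of the intended behaviour. It is not the Babelfish C code: the helper name `cast_to_varbinary` and its parameters are hypothetical, and it assumes UTF-16 means little-endian code units without a byte order mark, which is how SQL Server lays out nvarchar data.

```python
def cast_to_varbinary(text: str, is_nvarchar: bool,
                      input_encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: choose the byte encoding from the source type."""
    if is_nvarchar:
        # nvarchar: always UTF-16 little-endian, no BOM (assumption stated above).
        return text.encode("utf-16-le")
    # varchar: keep the bytes of the input encoding.
    return text.encode(input_encoding)


print(cast_to_varbinary("ab", is_nvarchar=True).hex())   # 61006200
print(cast_to_varbinary("ab", is_nvarchar=False).hex())  # 6162
```

Before the fix, both calls would have taken the second branch, so an nvarchar input produced the varchar bytes.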

For a round trip such as nvarchar -> varbinary -> nvarchar, the function nvarcharvarbinary now encodes the input string as UTF-16, so the reverse cast varbinary -> nvarchar uses the function varbinarynvarchar, which converts the UTF-16 bytes back to UTF-8 with null padding.
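
The round trip can be sketched in Python as follows. The two function names mirror the ones in this PR, but the bodies are only an illustration under the assumption that UTF-16 means little-endian without a BOM. The null padding step is deliberately left out, because the description does not spell out its exact rules.

```python
def nvarcharvarbinary(text: str) -> bytes:
    # Illustration only: encode the nvarchar value as UTF-16LE bytes.
    return text.encode("utf-16-le")


def varbinarynvarchar(data: bytes) -> bytes:
    # Illustration only: decode UTF-16LE and re-encode as UTF-8,
    # the encoding in which the server holds the string.
    return data.decode("utf-16-le").encode("utf-8")


original = "héllo"
blob = nvarcharvarbinary(original)
restored = varbinarynvarchar(blob)

print(blob.hex())      # 6800e9006c006c006f00
print(restored.hex())  # 68c3a96c6c6f
assert restored.decode("utf-8") == original
```

If the reverse cast instead read the UTF-16 bytes as UTF-8, every ASCII character would be followed by a stray NUL byte, which is why the two directions have to change together.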

We therefore added the functions nvarcharvarbinary and varbinarynvarchar to handle casts between nvarchar and varbinary in both directions. For these casts specifically, the data type is not converted to its base type before the casting function is chosen.
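
The base-type condition can be pictured with a toy lookup in Python. Everything here is hypothetical: the `BASE_TYPE` and `CAST_FUNCS` tables, the `find_cast_func` helper, and the assumption that nvarchar resolves to varchar as its base type. The real selection happens in the Babelfish cast machinery and is not reproduced here.

```python
# Hypothetical catalog: assume nvarchar resolves to varchar as its base type.
BASE_TYPE = {"nvarchar": "varchar"}

# Hypothetical cast table; the varchar entries are placeholders for illustration.
CAST_FUNCS = {
    ("nvarchar", "varbinary"): "nvarcharvarbinary",
    ("varbinary", "nvarchar"): "varbinarynvarchar",
    ("varchar", "varbinary"): "varcharvarbinary",
    ("varbinary", "varchar"): "varbinaryvarchar",
}

# Pairs for which the type must NOT be reduced to its base type first.
KEEP_DECLARED_TYPE = {("nvarchar", "varbinary"), ("varbinary", "nvarchar")}


def find_cast_func(source: str, target: str, apply_fix: bool = True) -> str:
    if not (apply_fix and (source, target) in KEEP_DECLARED_TYPE):
        # Old behaviour: collapse to the base type, losing the nvarchar marker.
        source = BASE_TYPE.get(source, source)
        target = BASE_TYPE.get(target, target)
    return CAST_FUNCS[(source, target)]


print(find_cast_func("nvarchar", "varbinary", apply_fix=False))  # varcharvarbinary
print(find_cast_func("nvarchar", "varbinary"))                   # nvarcharvarbinary
```

Without the exception, the nvarchar cast would resolve to the varchar function and the UTF-16 path would never be reached.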

Also added the upgrade test configuration.

Issues Resolved

Task: [BABEL-4891]

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

pranavJ23 and others added 3 commits January 6, 2025 10:06
Task: BABEL-4891
Signed-off-by: Pranav Jain <[email protected]>
Signed-off-by: pranav jain <[email protected]>
coveralls (Collaborator) commented Jan 6, 2025

Pull Request Test Coverage Report for Build 12670298132

Details

  • 148 of 183 (80.87%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.03%) to 74.925%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| contrib/babelfishpg_common/src/varbinary.c | 83 | 91 | 91.21% |
| contrib/babelfishpg_common/src/varchar.c | 40 | 67 | 59.7% |

Totals
Change from base Build 12667158541: 0.03%
Covered Lines: 46822
Relevant Lines: 62492

💛 - Coveralls

rishabhtanwar29 previously approved these changes Jan 7, 2025
Signed-off-by: pranav jain <[email protected]>
Comment on lines 10 to 29
CREATE OR REPLACE FUNCTION sys.nvarcharvarbinary(sys.NVARCHAR, integer, boolean)
RETURNS sys.BBF_VARBINARY
AS 'babelfishpg_common', 'nvarcharvarbinary'
LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;

CREATE OR REPLACE FUNCTION sys.varbinarysysnvarchar(sys.BBF_VARBINARY, integer, boolean)
RETURNS sys.NVARCHAR
AS 'babelfishpg_common', 'varbinarynvarchar'
LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;

CREATE OR REPLACE FUNCTION sys.binarysysnvarchar(sys.BBF_BINARY, integer, boolean)
RETURNS sys.NVARCHAR
AS 'babelfishpg_common', 'varbinarynvarchar'
LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;

CREATE OR REPLACE FUNCTION sys.nvarcharbinary(sys.NVARCHAR, integer, boolean)
RETURNS sys.BBF_BINARY
AS 'babelfishpg_common', 'nvarcharbinary'
LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;

Contributor

Why do we need this change in the 4.4.0--5.0.0 upgrade script?

Contributor Author

It seems the cherry-pick automatically applied the 4.4.0--4.5.0 change to the 4.4.0--5.0.0 upgrade script without reporting any conflicts.
Will change this.
Thanks!

Signed-off-by: pranav jain <[email protected]>
4 participants