Skip to content

Commit

Permalink
Retain result type after UNION as base type in T-SQL
Browse files Browse the repository at this point in the history
Description
-----------
When a base type such as `money` is UNIONed with another integer type,
the result type will be `fixeddecimal` instead of `money`, which is
inconsistent with SQL Server's behaviour. And could also cause some
characters not displaying well.

Analysis
--------
The root cause of the issue is Postgres will consider the base type of
domain type as the result type. In Babelfish, `money` is implemented
as a domain type of `fixeddecimal`. Actually, `money` is a base type
in T-SQL, thus Babelfish should retain `money` type after UNION.

Fix
---
After a final type has determined by the optimizer, check if it's a
base type in T-SQL. If it is, try to use the domain type as result
type if the domain type has higher precision than other types.

Task: BABEL-5242
Signed-off-by: Bo Li <[email protected]>
  • Loading branch information
hopebo committed Dec 7, 2024
1 parent db4e138 commit 47c6595
Show file tree
Hide file tree
Showing 2 changed files with 74 additions and 20 deletions.
89 changes: 69 additions & 20 deletions src/backend/parser/parse_coerce.c
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@

find_coercion_pathway_hook_type find_coercion_pathway_hook = NULL;
determine_datatype_precedence_hook_type determine_datatype_precedence_hook = NULL;
is_tsql_base_datatype_hook_type is_tsql_base_datatype_hook = NULL;
coerce_string_literal_hook_type coerce_string_literal_hook = NULL;
validate_implicit_conversion_from_string_literal_hook_type validate_implicit_conversion_from_string_literal_hook = NULL;
select_common_type_hook_type select_common_type_hook = NULL;
Expand Down Expand Up @@ -1507,32 +1508,80 @@ select_common_type(ParseState *pstate, List *exprs, const char *context,
}
}

/*
* If all the inputs were UNKNOWN type --- ie, unknown-type literals ---
* then resolve as type TEXT. This situation comes up with constructs
* like SELECT (CASE WHEN foo THEN 'bar' ELSE 'baz' END); SELECT 'foo'
* UNION SELECT 'bar'; It might seem desirable to leave the construct's
* output type as UNKNOWN, but that really doesn't work, because we'd
* probably end up needing a runtime coercion from UNKNOWN to something
* else, and we usually won't have it. We need to coerce the unknown
* literals while they are still literals, so a decision has to be made
* now.
*/
if (ptype == UNKNOWNOID && sql_dialect == SQL_DIALECT_TSQL)

if (sql_dialect == SQL_DIALECT_TSQL)
{
bool all_nullconst = true;
foreach(lc, exprs)
if (ptype == UNKNOWNOID)
{
Node* expr = (Node *) lfirst(lc);
if (exprType(expr) != UNKNOWNOID || !IsA(expr, Const) || !((Const *) expr)->constisnull)
/*
* If all the inputs were UNKNOWN type --- ie, unknown-type
* literals --- then resolve as type TEXT. This situation comes
* up with constructs like SELECT (CASE WHEN foo THEN 'bar' ELSE
* 'baz' END); SELECT 'foo' UNION SELECT 'bar'; It might seem
* desirable to leave the construct's output type as UNKNOWN, but
* that really doesn't work, because we'd probably end up needing
* a runtime coercion from UNKNOWN to something else, and we
* usually won't have it. We need to coerce the unknown literals
* while they are still literals, so a decision has to be made now.
*/
bool all_nullconst = true;
foreach(lc, exprs)
{
all_nullconst = false;
break;
Node* expr = (Node *) lfirst(lc);
if (exprType(expr) != UNKNOWNOID || !IsA(expr, Const) ||
!((Const *) expr)->constisnull)
{
all_nullconst = false;
break;
}
}

if (all_nullconst)
ptype = INT4OID; /* in T-SQL, it is should be decided to INT4 */
}
else if (ptype != exprType(pexpr) && is_tsql_base_datatype_hook &&
(*is_tsql_base_datatype_hook)(exprType(pexpr)))
{
/*
* If the preferred type is not the result type of corresponding
* expression, it means the result type is a domain type in
* Postgres. Some base data types in T-SQL are implemented as
* domain types in Babelfish. From SQL Server's perspective, we
* should try to retain those base types as result types.
*
* But in above process, we only make sure other types can be
* coerced to the base type, which usually has larger range than
* the corresponding domain type. So here we need to double-check
* the domain type also has higher precedence than other
* types.
* For example, tinyint UNION smallint, the base type of
* tinyint is smallint, so they can be coerced mutually. But we
* shouldn't use tinyint as the result type, it could loss
* precision.
*/
Oid targetType = exprType(pexpr);

foreach(lc, exprs)
{
Node *expr = (Node *) lfirst(lc);
Oid type = exprType(expr);

if (type == UNKNOWNOID || type == targetType)
continue;

if (all_nullconst)
ptype = INT4OID; /* in T-SQL, it is should be decided to INT4 */
if (!determine_datatype_precedence_hook ||
!determine_datatype_precedence_hook(targetType,
getBaseType(type)))
break;
}

/*
* Use the base type in T-SQL(implemented as domain type in
* Postgres) as the result type.
*/
if (lc == NULL)
ptype = exprType(pexpr);
}
}

if (ptype == UNKNOWNOID)
Expand Down
5 changes: 5 additions & 0 deletions src/include/parser/parse_coerce.h
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,11 @@ typedef CoercionPathType (*find_coercion_pathway_hook_type) (Oid sourceTypeId,
*/
typedef bool (*determine_datatype_precedence_hook_type) (Oid typeId1, Oid typeId2);

/*
* Hook interface to determine if a data type is a base type in T-SQL
*/
typedef bool (*is_tsql_base_datatype_hook_type) (Oid typeId);

/*
* T-SQL has different rules for string literal datatype coercions
*/
Expand Down

0 comments on commit 47c6595

Please sign in to comment.