-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make supported types in dialect yaml files as a map #81
Conversation
@richtia @westonpace can you please review this change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This introduces the concept of a "native typename string" which I feel doesn't really exist in all systems (e.g. those that don't speak SQL like velox? / cuda / pandas / etc.). However, I see how it can be useful.
I kind of follow what is happening here but what are you using these values for? Are you trying to create SQL statements? Are you creating native plans from strings?
dialects/cudf.yaml
Outdated
ts: datetime64[s] | ||
date: datetime64[s] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is datetime64[s]
to cuda?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the mapping here, I thought this would be useful.
what should be the mapping for cuda case? Should we leave the mapping empty for now?
dialects/datafusion.yaml
Outdated
date: timestamp | ||
time: timestamp | ||
ts: timestamp | ||
tstz: timestamp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can these all map to the same type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with these all mapping to timestamp for now since that's what the runner does. I think the other date/time types weren't working properly with the date/time functions, so I added some handling based on timestamp length since timestamp could represent the others. We can open up an issue to fix this and do the proper mapping in a subsequent PR.
Once I have a registry of functions, I want to be able to do two things,
|
bft/dialects/types.py
Outdated
@@ -83,7 +83,7 @@ class DialectFile(NamedTuple): | |||
scalar_functions: List[DialectFunction] | |||
aggregate_functions: List[DialectFunction] | |||
uri_to_func_prefix: Dict[str, str] | |||
supported_types: List[str] | |||
supported_types: Dict[str, str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the key is made up of a key and a value itself is its type more like a tuple or dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed the value to dict
dialects/duckdb.yaml
Outdated
- interval | ||
- dec | ||
i8: | ||
sql_type_name: TINYINT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the trailing comma needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed it
fp64: | ||
sql_type_name: float64 | ||
bool: | ||
sql_type_name: bool_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the underscore in SQL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied this from runner.py
str: | ||
sql_type_name: text | ||
date: | ||
sql_type_name: date |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these type names lowercase while most of the other dialects are capitalized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed everything to lower case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@scgkiran can you look into the 2 duckdb test failures?
The duckdb failures aren't relevant to this PR. We can make a fix in a separate one. |
This change is needed to build a FunctionRegistry in substrait-go PR