This repository has been archived by the owner on Feb 8, 2019. It is now read-only.
[DO NOT MERGE] Refactor type system to provide better extensibility of types and functions #315
This is a preliminary PR that is not ready to be merged but provides an overall view of the type system refactoring work. Many constructs are at their initial designs and may be further improved.
The PR aims at reviewing the refactoring designs at the "architecture" level. Detailed code style and unit test issues may be addressed later in subsequent concrete PRs.
The overall purpose of the refactoring is to improve the extensibility of the existing type/function system (i.e. support more kinds of types/functions and make it easier to add new types and functions), while retaining the performance of the current system.
Major Changes
Part I. Type System
1. Categorize all types into four memory layouts.
The four memory layouts are:
Memory layout decides how the corresponding type's values are stored and represented.
Briefly speaking,
2. Use TypeIDTrait to make much of the per-type information available at compile time.
With this per-type trait information, we can avoid a lot of boilerplate code in each subclass of Type by using template techniques and specializing on the memory layout. See TypeSynthesizer and TypeFactory.
TypeIDTrait is also extensively used in many other places as it provides all the required compile-time information about a type.
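As a rough illustration of the idea (not the PR's actual definitions), a per-type trait can be written as a template specialized on a type ID, so that generic code can branch on the memory layout at compile time. The enum values, the two layout names, and the helpers `IsFixedSize` and `FixedByteLength` below are hypothetical placeholders and do not correspond to the four layouts of this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical type IDs and layouts, for illustration only.
enum class TypeID { kInt, kDouble, kText };
enum class MemoryLayout { kFixedSize, kOutOfLine };

// Primary template is intentionally undefined: every TypeID must provide a trait.
template <TypeID id>
struct TypeIDTrait;

template <>
struct TypeIDTrait<TypeID::kInt> {
  using cpptype = std::int32_t;
  static constexpr MemoryLayout kLayout = MemoryLayout::kFixedSize;
  static constexpr const char *Name() { return "Int"; }
};

template <>
struct TypeIDTrait<TypeID::kDouble> {
  using cpptype = double;
  static constexpr MemoryLayout kLayout = MemoryLayout::kFixedSize;
  static constexpr const char *Name() { return "Double"; }
};

template <>
struct TypeIDTrait<TypeID::kText> {
  using cpptype = std::string;
  static constexpr MemoryLayout kLayout = MemoryLayout::kOutOfLine;
  static constexpr const char *Name() { return "Text"; }
};

// Generic code can query the layout without any per-type boilerplate.
template <TypeID id>
constexpr bool IsFixedSize() {
  return TypeIDTrait<id>::kLayout == MemoryLayout::kFixedSize;
}

// Only compiles for fixed-size types; the check happens at compile time.
template <TypeID id>
constexpr std::size_t FixedByteLength() {
  static_assert(TypeIDTrait<id>::kLayout == MemoryLayout::kFixedSize,
                "FixedByteLength requires a fixed-size type");
  return sizeof(typename TypeIDTrait<id>::cpptype);
}
```

Adding a type in this sketch means adding one enum value and one trait specialization; the generic helpers pick it up without further changes.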
3. Support more types.
Details will be written later about how to add a new type into the Quickstep system.
The current PR has some example types added:
4. Improve the type casting mechanism.
Type casting (coercion) is an important feature that is needed in practice from time to time.
This PR's design defines an overall template
which is then specialized by different source/target types.
The coercibility between two types is then inferred according to whether the corresponding specialization exists. Thus it suffices to just specialize CastFunctor when adding a new casting operation, and all the dependent places (e.g. Type::isCoercibleFrom()) will mostly be auto-generated by the system (unless the target type is a parameterized type and you want to do some further checks).
Note that safe-coercibility is a separate issue and needs to be taken care of mostly manually, by overriding Type::isSafelyCoercibleFrom().
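The "coercibility follows from the existence of a specialization" idea can be sketched as follows. Only the name CastFunctor comes from this PR; its template parameters here, the `IsComplete` helper, and `IsCoercible` are assumptions made for illustration, and plain C++ types stand in for Quickstep types.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Primary template is declared but never defined: by default no cast exists.
// (Assumed signature; the PR's actual template may differ.)
template <typename Source, typename Target>
struct CastFunctor;

// Adding a cast means adding one specialization.
template <>
struct CastFunctor<std::int32_t, double> {
  double operator()(std::int32_t value) const {
    return static_cast<double>(value);
  }
};

template <>
struct CastFunctor<float, double> {
  double operator()(float value) const { return static_cast<double>(value); }
};

// Hypothetical helper: detects whether T is a complete (defined) type.
template <typename T, typename = void>
struct IsComplete : std::false_type {};

template <typename T>
struct IsComplete<T, decltype(void(sizeof(T)))> : std::true_type {};

// Coercibility is derived, not hand-written: it holds exactly when a
// CastFunctor specialization has been defined for the pair.
template <typename Source, typename Target>
constexpr bool IsCoercible() {
  return IsComplete<CastFunctor<Source, Target>>::value;
}

static_assert(IsCoercible<std::int32_t, double>(), "Int -> Double exists");
static_assert(!IsCoercible<double, std::int32_t>(), "no Double -> Int cast");
```

One caveat of this detection idiom is that the specializations must be visible before the first coercibility query for a given pair, since the result of the completeness check is fixed at its first instantiation.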
Explicit casting is supported with a PostgreSQL-like syntax. E.g.
(1)
(2)
(3)
NOTE: The work is not yet fully completed, so there may be LOG(FATAL) aborts for some combinations of queries.
Implicit coercion is supported when resolving scalar functions. For example, we have support for the sqrt function where the parameter can be a Float or Double value. Consider a query that applies sqrt to x, where x has Int type: an implicit coercion from Int to Float will then be added.
5. Add GenericValue to represent typed values of all four memory layouts.
The original TypedValue is not sufficient to represent CxxGeneric values, as we need to embed the overall Type information in order to handle value allocation/copy/destruction. However, for performance reasons, we cannot simply replace TypedValue with a more generic but slower implementation. Thus, a separate GenericValue is added, and we still use TypedValue when handling storage-related operations.
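To make the motivation concrete, here is a minimal sketch of a value that carries a pointer to its type so that copying and destruction can be delegated to that type. The `Type` interface, `CxxType`, and the member names are hypothetical and much simpler than the actual classes in this PR.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical minimal Type interface: knows how to copy and destroy values.
class Type {
 public:
  virtual ~Type() = default;
  virtual void *cloneValue(const void *value) const = 0;
  virtual void destroyValue(void *value) const = 0;
};

// Hypothetical type wrapping an arbitrary C++ class T.
template <typename T>
class CxxType : public Type {
 public:
  static const CxxType &Instance() {
    static const CxxType instance{};
    return instance;
  }
  void *cloneValue(const void *value) const override {
    return new T(*static_cast<const T *>(value));
  }
  void destroyValue(void *value) const override {
    delete static_cast<T *>(value);
  }
};

// The value embeds its Type so that copy/destruction work for any T.
class GenericValue {
 public:
  template <typename T>
  static GenericValue Create(const T &value) {
    return GenericValue(CxxType<T>::Instance(), new T(value));
  }
  GenericValue(const GenericValue &other)
      : type_(other.type_), value_(type_->cloneValue(other.value_)) {}
  GenericValue(GenericValue &&other) noexcept
      : type_(other.type_), value_(other.value_) {
    other.value_ = nullptr;
  }
  GenericValue &operator=(GenericValue other) {
    std::swap(type_, other.type_);
    std::swap(value_, other.value_);
    return *this;
  }
  ~GenericValue() {
    if (value_ != nullptr) type_->destroyValue(value_);
  }
  // Unchecked access: the caller must know the stored C++ type.
  template <typename T>
  const T &get() const {
    return *static_cast<const T *>(value_);
  }

 private:
  GenericValue(const Type &type, void *value) : type_(&type), value_(value) {}
  const Type *type_;
  void *value_;
};
```

The virtual calls on every copy and destruction are the kind of overhead that makes it reasonable to keep the leaner TypedValue on storage-related paths.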
6. Move type resolving from parser to resolver.
This avoids the need to modify SqlParser.ypp when adding a new type.
See ParseDataType and Resolver::resolveDataType().
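The split can be pictured like this: the parser only records the type name it saw, and the resolver looks that name up in a registry, so registering a new type needs no grammar change. The struct layout, `TypeRegistry`, and its methods are hypothetical simplifications; only the names ParseDataType and resolveDataType are taken from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>

enum class TypeID { kInt, kDouble };  // Hypothetical IDs.

// Hypothetical parse node: the parser stores the raw name, nothing more.
struct ParseDataType {
  std::string name;
};

inline std::string ToLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Hypothetical resolver-side registry keyed by case-insensitive type name.
class TypeRegistry {
 public:
  void registerType(const std::string &name, TypeID id) {
    by_name_[ToLower(name)] = id;
  }
  // Returns true and writes *id if the name denotes a registered type.
  bool resolveDataType(const ParseDataType &node, TypeID *id) const {
    const auto it = by_name_.find(ToLower(node.name));
    if (it == by_name_.end()) return false;
    *id = it->second;
    return true;
  }

 private:
  std::map<std::string, TypeID> by_name_;
};
```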
~
Part II. Scalar Function
1. Implement UnaryOperationSynthesizer/UncheckedUnaryOperatorSynthesizer to make it easier to add unary functions.
Example unary functions:
2. Implement BinaryOperationSynthesizer/UncheckedBinaryOperatorSynthesizer to make it easier to add binary functions.
Example binary functions:
3. Use OperationSignature and OperationFactory to support general operation resolution.
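A small sketch of signature-based resolution, including the implicit Int to Float coercion described in Part I for sqrt, might look like the following. The field names, `CoercionCandidates`, and `resolveUnary` are assumptions for illustration; the real OperationSignature and OperationFactory in this PR are richer.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <tuple>
#include <vector>

enum class TypeID { kInt, kFloat, kDouble };  // Hypothetical IDs.

// Hypothetical signature: operation name plus argument type IDs.
struct OperationSignature {
  std::string name;
  std::vector<TypeID> argument_types;
  bool operator<(const OperationSignature &other) const {
    return std::tie(name, argument_types) <
           std::tie(other.name, other.argument_types);
  }
};

// Assumed coercion order: which wider types an argument may be converted to.
inline std::vector<TypeID> CoercionCandidates(TypeID from) {
  switch (from) {
    case TypeID::kInt:
      return {TypeID::kFloat, TypeID::kDouble};
    case TypeID::kFloat:
      return {TypeID::kDouble};
    default:
      return {};
  }
}

class OperationFactory {
 public:
  void registerOperation(const OperationSignature &signature) {
    signatures_.insert(signature);
  }
  // Tries an exact match first, then each implicit coercion in order.
  bool resolveUnary(const std::string &name, TypeID argument,
                    OperationSignature *resolved) const {
    std::vector<TypeID> candidates = {argument};
    for (TypeID wider : CoercionCandidates(argument)) {
      candidates.push_back(wider);
    }
    for (TypeID candidate : candidates) {
      const OperationSignature signature{name, {candidate}};
      if (signatures_.count(signature) != 0) {
        *resolved = signature;
        return true;
      }
    }
    return false;
  }

 private:
  std::set<OperationSignature> signatures_;
};
```

With sqrt registered for Float and Double, resolving sqrt on an Int argument in this sketch selects the Float signature, which corresponds to inserting an Int to Float coercion.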
~
Part III. TODOs