This repository has been archived by the owner on Sep 27, 2019. It is now read-only.

Syntax-Based Query Rewriter, Code Drop #1496

Open
wants to merge 16 commits into
base: master
from

Conversation

newtoncx

Contains style fixes and other changes in response to #1495 comments. Our full design doc and review guide can be found here.

17zhangw and others added 16 commits April 1, 2019 18:53
- pattern
- rule
- ruleset
- group
- groupexpression
- binding
- memo
- optimize_context
- optimizer_task (TopDownRewrite/BottomUpRewrite)

Templates generally followed:
template <class Node, class OperatorType, class OperatorExpr>

The template instantiation associated with:
Node = Operator, OperatorType = OpType, OperatorExpr = OperatorExpression
is used primarily by the core Optimizer. All references to the templated
files/classes from core optimizer files were instantiated to that.
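
The template convention described above might look roughly like the following. This is a minimal sketch, not the actual Peloton code: the `Operator`, `OpType`, and `OperatorExpression` definitions here are simplified stand-ins, and the `Pattern` body, `Matches`, and the `OperatorPattern` alias are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Simplified stand-ins for the optimizer types (assumptions, not the real classes).
enum class OpType { Get, Filter, InnerJoin };

struct Operator {
  OpType type;
  OpType GetType() const { return type; }
};

class OperatorExpression {
 public:
  explicit OperatorExpression(Operator op) : op_(op) {}
  void PushChild(std::shared_ptr<OperatorExpression> child) {
    children_.push_back(std::move(child));
  }
  const Operator &Op() const { return op_; }
  const std::vector<std::shared_ptr<OperatorExpression>> &Children() const {
    return children_;
  }

 private:
  Operator op_;
  std::vector<std::shared_ptr<OperatorExpression>> children_;
};

// A component templated on <Node, OperatorType, OperatorExpr>, as in the commit message.
template <class Node, class OperatorType, class OperatorExpr>
class Pattern {
 public:
  explicit Pattern(OperatorType type) : type_(type) {}
  void AddChild(std::shared_ptr<Pattern> child) {
    children_.push_back(std::move(child));
  }

  // Structural match: the root type and every child must match, position by position.
  bool Matches(const OperatorExpr &expr) const {
    const Node &node = expr.Op();
    if (node.GetType() != type_) return false;
    if (children_.size() != expr.Children().size()) return false;
    for (std::size_t i = 0; i < children_.size(); ++i) {
      if (!children_[i]->Matches(*expr.Children()[i])) return false;
    }
    return true;
  }

 private:
  OperatorType type_;
  std::vector<std::shared_ptr<Pattern>> children_;
};

// The instantiation used by the core optimizer in this sketch.
using OperatorPattern = Pattern<Operator, OpType, OperatorExpression>;
```

An expression rewriter would supply its own three types in place of `Operator`, `OpType`, and `OperatorExpression`, provided they expose the same small interface (`Op()`, `GetType()`, `Children()` in this sketch).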

Worth noting:
The Operator class defines a public interface wrapper around BaseOperatorNode;
it represents a single logical/physical operator.

OpType defines the various logical/physical operations.
The OperatorExpression class is essentially a tree of Operators.

Possibly annoying problems w.r.t. Peloton/terrier:
(1) Use of unique_ptr/raw pointer as opposed to shared_ptr in AbstractExpression
(2) AbstractExpression equality comparison method

Additional components needed:
- Dynamic/template/strategy rule evaluation (particularly comparison)
- Repeated/multi-level application of rules
- Layer to convert from memo -> AbstractExpression
- Some refactoring w.r.t templated code
- Better AbsExpr_Container/Expression indirection layer
  (intended to present a similar interface exposed by
   Operator/OperatorExpression relied upon by core logic)
- Proper memory management strategy (tightly coupled to problem #1)

What still doesn't work / isn't handled yet / isn't done:
- proper memory management (terrier uses shared_ptr anyways)

- other 1-level rewrites, multi-layer rewrites, other expr rewrites

- how can we define a grammar to programmatically create these rewrites?
  (the one we have is way too static...)

- in relation to logical equivalence:
  (1) how do we generate logically equivalent expressions:
      - multi-pass using generating rules (similar to ApplyRule) OR
      - from Pattern A, generate logically equivalent set of patterns P OR
      - transform input expression to match certain specification OR
      - ???
  (2) what operators do we support translating?
      - probably (a AND b) ====> (b AND a)
      - probably (a OR b) ====> (b OR a)
      - probably (a = b) ====> (b = a)
      - maybe more???
  (3) do we want multi level translations?
      - i.e. ((a AND b) AND c) ====> (a AND (b AND c))
      - what order do we do these in?
  May have to modify these operations:
  - Some assertions in TopDownRewrite/BottomUpRewrite w.r.t. the iterator
  - Possibly binding.cpp / optimizer_metadata.h / optimizer_task.cpp
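
The candidate translations listed above can be sketched as small rewrite functions over an immutable expression tree. This is only an illustration under assumptions: `Expr`, `ExprKind`, `Commute`, and `RotateRight` are hypothetical names and do not correspond to the AbstractExpression API or to the rewriter's rule classes.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical, minimal expression tree (not AbstractExpression).
enum class ExprKind { And, Or, Equal, Leaf };

struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

struct Expr {
  ExprKind kind;
  std::string name;  // only meaningful for leaves
  ExprPtr left;
  ExprPtr right;
};

ExprPtr Leaf(std::string name) {
  return std::make_shared<Expr>(Expr{ExprKind::Leaf, std::move(name), nullptr, nullptr});
}

ExprPtr Binary(ExprKind kind, ExprPtr l, ExprPtr r) {
  return std::make_shared<Expr>(Expr{kind, "", std::move(l), std::move(r)});
}

bool IsCommutative(ExprKind k) {
  return k == ExprKind::And || k == ExprKind::Or || k == ExprKind::Equal;
}

// (a op b) ====> (b op a). Returns nullptr when the rule does not apply.
ExprPtr Commute(const ExprPtr &e) {
  if (!e || !IsCommutative(e->kind)) return nullptr;
  return Binary(e->kind, e->right, e->left);
}

// ((a op b) op c) ====> (a op (b op c)) for AND/OR. Returns nullptr otherwise.
ExprPtr RotateRight(const ExprPtr &e) {
  if (!e || (e->kind != ExprKind::And && e->kind != ExprKind::Or)) return nullptr;
  if (!e->left || e->left->kind != e->kind) return nullptr;
  return Binary(e->kind, e->left->left, Binary(e->kind, e->left->right, e->right));
}

const char *KindName(ExprKind k) {
  switch (k) {
    case ExprKind::And: return "AND";
    case ExprKind::Or: return "OR";
    case ExprKind::Equal: return "=";
    case ExprKind::Leaf: return "";
  }
  return "";
}

// Renders a fully parenthesized form, used here only to inspect results.
std::string ToString(const ExprPtr &e) {
  if (e->kind == ExprKind::Leaf) return e->name;
  return "(" + ToString(e->left) + " " + KindName(e->kind) + " " + ToString(e->right) + ")";
}
```

Each function produces one equivalent form per call, which corresponds to the "multi-pass using generating rules" option; the ordering question in (3) is left open here, since applying `Commute` repeatedly would cycle unless already-seen forms are tracked.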

Issues still pending:
- Comparing Values (Matt email/discussion)
- r.rule->Check (terrier issue cmu-db#332)

AbstractNode will provide an interface for Operator and, eventually,
AbstractExpression as well.

Note there are a few roadblocks before the rest of the rewriter can be
changed to cleanly use abstract classes:
(1) OperatorExpression needs to be abstracted in the same way.
(2) We will have to find a good place to hide OpType, which is currently
an enum type (cannot be abstracted) and pervades the code base. This may
be solved by abstracting at the group level, but that still needs to be
investigated.
(3) Need to clean up and separate the interfaces of the AbstractNode,
OperatorNode, and Operator classes.

Abstract nodes were implemented in 209c46a. This commit is essentially just
refactoring and plugging abstract nodes in throughout the optimizer.
The abstract interface exposes OpType and ExpressionType for now,
which ideally will be fixed later. Work remains on abstracting
OperatorExpression.
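
An abstract interface that exposes both OpType and ExpressionType, as described above, could be shaped like the sketch below. The method names (`GetOpType`, `GetExpType`), the enum values, the `ExpressionNode` class, and the `SameKind` helper are assumptions made for illustration; the interface in 209c46a may differ.

```cpp
// Hypothetical, reduced enums; the real ones have many more values.
enum class OpType { Undefined, Get, Filter };
enum class ExpressionType { Invalid, CompareEqual, ConjunctionAnd };

// Common interface for operator nodes and (eventually) expression nodes.
class AbstractNode {
 public:
  virtual ~AbstractNode() = default;
  // Both tags are exposed for now; only one is meaningful for a given node.
  virtual OpType GetOpType() const = 0;
  virtual ExpressionType GetExpType() const = 0;
};

// Node backing a logical/physical operator: the expression tag stays Invalid.
class OperatorNode : public AbstractNode {
 public:
  explicit OperatorNode(OpType type) : type_(type) {}
  OpType GetOpType() const override { return type_; }
  ExpressionType GetExpType() const override { return ExpressionType::Invalid; }

 private:
  OpType type_;
};

// Node backing an expression: the operator tag stays Undefined.
class ExpressionNode : public AbstractNode {
 public:
  explicit ExpressionNode(ExpressionType type) : type_(type) {}
  OpType GetOpType() const override { return OpType::Undefined; }
  ExpressionType GetExpType() const override { return type_; }

 private:
  ExpressionType type_;
};

// Code written against AbstractNode can compare nodes without knowing which kind it holds.
bool SameKind(const AbstractNode &a, const AbstractNode &b) {
  return a.GetOpType() == b.GetOpType() && a.GetExpType() == b.GetExpType();
}
```

Exposing both enums keeps callers free of templates at the cost of a sentinel value on each side, which is the leak the commit message says should ideally be fixed later.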

Still need to fix Pattern so that it supports both OpType and ExpType
without templatizing. The code will also need to be cleaned up afterwards
before the build will work.

2 participants