Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] support casting between pgnodes #1048

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

skyzh
Copy link
Contributor

@skyzh skyzh commented Feb 16, 2023

No description provided.

@skyzh
Copy link
Contributor Author

skyzh commented Feb 16, 2023

Some things I'm not sure:

  • Should we use From from std library for the casting? It seems that casting is unsafe and we should mark fn cast_from and fn try_cast_from as unsafe function.
  • I have to manually block some nodes which don't have a corresponding node tag. Probably there's something wrong...

@eeeebbbbrrrr
Copy link
Contributor

Do we need both cast_from and try_cast_from? Seems like since this kind of thing requires some in-depth Postgres knowledge we'd only want try_cast_from and make it unsafe?

And should that return a Result instead of an Option?

Regarding the the blocklist you made, we identify those as a PgNode b/c they have pub type_: NodeTag. I suppose this is a little different. Is there a pattern you can use to programmatically determine if a type is compatible with this casting framework? This is one of those things we'd have to manually maintain. I suppose the compiler will tell us when we support a new PG version and a new type comes to life, but... I'd rather us not have to worry about it at all.

@skyzh skyzh changed the base branch from master to develop February 16, 2023 21:30
@skyzh
Copy link
Contributor Author

skyzh commented Feb 17, 2023

Thanks for the suggestion! I'm thinking of doing the code generation in another way: for PgCast, we should traverse on NodeTag enum and only implement it on nodes with a corresponding node tag.

For the current blocklist,

         "HeapTupleTableSlot",
         "VirtualTupleTableSlot",
         "PartitionPruneStep",
         "BufferHeapTupleTableSlot",
         "MinimalTupleTableSlot",

These are all subclass of TupleTable but doesn't have a corresponding node tag. Their casting is done by looking into uint16 tts_flags of TupleTableSlot class.

JoinPath and Expr are the parent class of join paths and expressions respectively, which also doesn't have a node tag.

MemoryContextData starts with a node tag but it doesn't seem to correspond to any object.

We can either use a blocklist or change implement PgCast only on those with a corresponding node tag. In this way we can do the casting code generation programmatically.


Also the current implementation might be counter-intuitive in some cases. For example, T_Plan is a node type, which has many subclasses like ProjectSet, ModifyTable, etc. With the current cast implementation, we will return an error if a user casts ProjectSet to Plan because the node tag is T_ProjectSet instead of T_Plan.

Therefore, besides the PgCast trait, we might also need to introduce trait PgPlanNode, trait PgExpr, trait PgJoinPath, etc. We can discover these structs by checking if the first element in the struct is Plan, Expr or JoinPath respectively, probably in future PRs.

@workingjubilee
Copy link
Member

workingjubilee commented Sep 21, 2023

MemoryContextData starts with a node tag but it doesn't seem to correspond to any object.

It corresponds with T_AllocSetContext | T_SlabContext | T_GenerationContext. A given NodeTagged type can have multiple valid tags for it. I believe using a constant as a marker, thus, is not a correct approach. I did use it for trait Enlist, but that's because I reified the Rust type: List can have multiple tags, but List<T> allows describing the behavior associated with each tag because it is monomorphic on T.

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably like us to have this but it needs some clarity of purpose.

// impl the PgNode trait for all nodes
pgnode_impls.extend(quote! {
impl pg_sys::seal::Sealed for #struct_name {}
impl pg_sys::PgNode for #struct_name {}
impl pg_sys::PgNode for #struct_name {
const NODE_TAG: Option<NodeTag> = #node_tag;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't describe polymorphic types with a monomorphic tag like this.

Comment on lines +289 to +291
fn cast_from(pg_node: &A) -> Self {
Self::try_cast_from(pg_node).unwrap()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Eric, we should just make it always-fallible and ditch this.

Comment on lines +296 to +308
unsafe {
let node = std::mem::transmute::<_, &Node>(pg_node);
match B::NODE_TAG {
Some(tag) => {
if node.type_ == tag {
Some(std::mem::transmute::<_, &B>(node))
} else {
None
}
}
None => None,
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrmmm. What "directionality" of casting is this intended to handle? Upcasting? Downcasting? Something else? I feel we should support checked casting, for sure, instead of just making people do pointer-casts, but I don't feel confident about what this is "for", basically.

// now we can finally iterate the Nodes and emit out Display impl
for node_struct in nodes {
let struct_name = &node_struct.struct_.ident;

let node_tag = if !block_list.contains(struct_name.to_string().as_str()) {
let node_tag = quote::format_ident!("NodeTag_T_{}", struct_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I changed NodeTag to a real enum, so this would have to be

Suggested change
let node_tag = quote::format_ident!("NodeTag_T_{}", struct_name);
let node_tag = quote::format_ident!("NodeTag_T::{}", struct_name);

// Though these structs looks like a node, they don't seem to have a corresponding [`NodeTag`].
// Also for `Node`, it doesn't have a node tag.
let block_list = [
"Node",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why block Node, exactly?

@workingjubilee
Copy link
Member

The way forward is probably having each Node type implement a trait that has a function that can be called to determine if the cast is valid.

@workingjubilee
Copy link
Member

MemoryContextData starts with a node tag but it doesn't seem to correspond to any object.

It corresponds with T_AllocSetContext | T_SlabContext | T_GenerationContext. A given NodeTagged type can have multiple valid tags for it. I believe using a constant as a marker, thus, is not a correct approach. I did use it for trait Enlist, but that's because I reified the Rust type: List can have multiple tags, but List<T> allows describing the behavior associated with each tag because it is monomorphic on T.

This is slightly off, my apologies. It's more like an abstract Node type, which means that other Nodes can have it as their superclass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants