Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PgQuery::Node: Add inner and inner= helpers to modify inner object #306

Merged
merged 2 commits into from
Jan 9, 2024

Conversation

lfittl
Copy link
Member

@lfittl lfittl commented Jan 3, 2024

Because of how oneof messages work in protobuf, its a bit tedious to make modifications to a Node object, in particular when changing the type of the inner object held within the Node in a generic code path (i.e. with a dynamic inner type), which required first knowing the name of the new inner object type, and then using public_send to set the new value.

To help, introduce the new "inner" and "inner=" methods for PgQuery::Node, that get/set the inner object directly, avoiding the use of public_send.

This will be of particular help for anyone utilizing the walk! API to make modifications to the query tree. To illustrate, an example is added to the treewalker spec showing how to utilize this.


Additionally, this makes utilizing the walk! treewalker easier for the common use case of working with the node directly (without using the parent or location information).

@lfittl lfittl marked this pull request as ready for review January 3, 2024 20:54
@lfittl lfittl requested a review from a team January 3, 2024 20:54
@lfittl lfittl force-pushed the add-more-node-helpers branch from 7fa8700 to cc7d11c Compare January 3, 2024 21:25
@@ -1,22 +1,27 @@
module PgQuery
# Patch the auto-generated generic node type with additional convenience functions
class Node
def self.inner_class_to_name(klass)
@inner_class_to_name ||= descriptor.lookup_oneof('node').map { |f| [f.subtype.msgclass, f.name.to_sym] }.to_h
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://bugs.ruby-lang.org/issues/15143
Can we use to_h with a block here? That was merged in 2018.

Copy link
Member Author

@lfittl lfittl Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - looks like that was added in Ruby 2.6 (see commit), so we could use it.

lfittl added 2 commits January 8, 2024 16:51
Because of how oneof messages work in protobuf, its a bit tedious to
make modifications to a Node object, in particular when changing the type
of the inner object held within the Node in a generic code path (i.e.
with a dynamic inner type), which required first knowing the name of the
new inner object type, and then using public_send to set the new value.

To help, introduce the new "inner" and "inner=" methods for PgQuery::Node,
that get/set the inner object directly, avoiding the use of public_send.

This will be of particular help for anyone utilizing the walk! API to
make modifications to the query tree. To illustrate, an example is added
to the treewalker spec showing how to utilize this.
This simplifies the common use case of just working with the walked node
directly, which is now sufficient more often thanks to the addition of
Node#inner= streamling modification of Node objects without going to the
parent to replace them.
@lfittl lfittl force-pushed the add-more-node-helpers branch from cc7d11c to 3d7dcc4 Compare January 9, 2024 02:02
@lfittl
Copy link
Member Author

lfittl commented Jan 9, 2024

Per request (in a different channel), I've run a quick benchmark showing that this doesn't impact the time the treewalker takes negatively, which it does not:

Before (6 longest SQL statements in fingerprint test file):
Benchmarking time for parse+walk...
                       user     system      total        real
4a6db94fbada8341  52.937340   0.678991  53.616331 ( 53.624522)
b7dbf54ce62af0ca   0.241984   0.002727   0.244711 (  0.244712)
18e71bc17baea13b   0.006921   0.000028   0.006949 (  0.006943)
3a5494404465d0f9   0.014623   0.000034   0.014657 (  0.014658)
1cca3f304295181c   0.009612   0.000013   0.009625 (  0.009623)
f936eab75b8c1b90   0.000537   0.000001   0.000538 (  0.000538)

Benchmarking memory for parse+walk...
Calculating -------------------------------------
    4a6db94fbada8341    13.806B memsize (     0.000  retained)
                        49.838M objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
    b7dbf54ce62af0ca    76.937M memsize (     0.000  retained)
                        88.861k objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
    18e71bc17baea13b     2.268M memsize (     0.000  retained)
                        10.540k objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
    3a5494404465d0f9     4.793M memsize (     0.000  retained)
                        17.635k objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
    1cca3f304295181c     3.159M memsize (     0.000  retained)
                        12.182k objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
    f936eab75b8c1b90   158.198k memsize (     0.000  retained)
                       926.000  objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
After (6 longest SQL statements in fingerprint test file):
Benchmarking time for parse+walk...
                       user     system      total        real
4a6db94fbada8341  50.811444   0.857045  51.668489 ( 51.743262)
b7dbf54ce62af0ca   0.241765   0.002221   0.243986 (  0.244304)
18e71bc17baea13b   0.006827   0.000019   0.006846 (  0.006843)
3a5494404465d0f9   0.014364   0.000009   0.014373 (  0.014371)
1cca3f304295181c   0.009489   0.000013   0.009502 (  0.009501)
f936eab75b8c1b90   0.000513   0.000004   0.000517 (  0.000515)

Benchmarking memory for parse+walk...
    4a6db94fbada8341    13.806B memsize (     0.000  retained)
                        49.838M objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
    b7dbf54ce62af0ca    76.937M memsize (     0.000  retained)
                        88.862k objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
    18e71bc17baea13b     2.268M memsize (     0.000  retained)
                        10.541k objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
    3a5494404465d0f9     4.793M memsize (     0.000  retained)
                        17.636k objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
    1cca3f304295181c     3.159M memsize (     0.000  retained)
                        12.183k objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
    f936eab75b8c1b90   158.278k memsize (     0.000  retained)
                       927.000  objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
Benchmark script:
# spec/benchmark.rb
#
# Needs "s.add_development_dependency 'benchmark-memory', '~> 0'" in the gemspec

require 'pg_query'
require 'json'
require 'benchmark'
require 'benchmark/memory'

def fingerprint_defs
  # Sort input SQL by size (largest first), since those would be most interesting usually
  @fingerprint_defs ||= JSON.parse(File.read(File.join(__dir__, 'files/fingerprint.json'))).sort_by { |d| -d['input'].size }
end

puts 'Benchmarking time for parse+walk...'

Benchmark.bmbm do |x|
  fingerprint_defs.each do |testdef|
    x.report(testdef['expectedHash']) { PgQuery.parse(testdef['input']).walk! { |_, _, _, _| } }
  end
end

puts 'Benchmarking memory for parse+walk...'

Benchmark.memory do |x|
  fingerprint_defs.each do |testdef|
    x.report(testdef['expectedHash']) { PgQuery.parse(testdef['input']).walk! { |_, _, _, _| } }
  end
end

@lfittl lfittl merged commit 8152f95 into main Jan 9, 2024
5 checks passed
@lfittl lfittl deleted the add-more-node-helpers branch January 9, 2024 02:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants