-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Encode query params in binary instead of text #295
base: master
Are you sure you want to change the base?
Conversation
test_insert_and_read "float", 12.34 | ||
test_insert_and_read "int2", 123i16 | ||
test_insert_and_read "int4", 123i32 | ||
test_insert_and_read "int8", 123i64 | ||
|
||
test_insert_and_read "float4", 12.34f32 | ||
test_insert_and_read "float8", 12.34f64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I don't specify the sizes of numeric types, I get errors that I don't understand yet. I figured I'd look into it later.
Spec errors
1) PG::Driver inserts 123 as int8
insufficient data left in message (PQ::PQError)
from src/pq/connection.cr:214:7 in 'handle_error'
from src/pq/connection.cr:197:7 in 'handle_async_frames'
from src/pq/connection.cr:173:7 in 'read'
from src/pq/connection.cr:168:7 in 'read'
from src/pq/connection.cr:446:31 in 'expect_frame'
from src/pq/connection.cr:445:5 in 'expect_frame'
from src/pg/statement.cr:19:5 in 'perform_query'
from src/pg/statement.cr:35:14 in 'perform_exec'
from lib/db/src/db/statement.cr:93:9 in 'perform_exec_and_release'
from lib/db/src/db/statement.cr:78:7 in 'exec:args'
from lib/db/src/db/pool_statement.cr:19:30 in 'exec:args'
from lib/db/src/db/query_methods.cr:275:7 in 'exec:args'
from spec/pg/encoder_spec.cr:12:5 in '->'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/example.cr:50:13 in 'internal_run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/example.cr:38:16 in 'run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/context.cr:20:23 in 'internal_run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/context.cr:346:14 in 'run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/context.cr:20:23 in 'internal_run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/context.cr:155:7 in 'run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/dsl.cr:237:7 in 'execute_examples'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/dsl.cr:220:13 in '->'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/crystal/at_exit_handlers.cr:14:19 in 'run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/crystal/main.cr:53:14 in 'exit'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/crystal/main.cr:48:5 in 'main'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/crystal/main.cr:130:3 in 'main'
2) PG::Driver inserts 123 as int2
incorrect binary data format in bind parameter 1 (PQ::PQError)
from src/pq/connection.cr:214:7 in 'handle_error'
from src/pq/connection.cr:197:7 in 'handle_async_frames'
from src/pq/connection.cr:173:7 in 'read'
from src/pq/connection.cr:168:7 in 'read'
from src/pq/connection.cr:446:31 in 'expect_frame'
from src/pq/connection.cr:445:5 in 'expect_frame'
from src/pg/statement.cr:19:5 in 'perform_query'
from src/pg/statement.cr:35:14 in 'perform_exec'
from lib/db/src/db/statement.cr:93:9 in 'perform_exec_and_release'
from lib/db/src/db/statement.cr:78:7 in 'exec:args'
from lib/db/src/db/pool_statement.cr:19:30 in 'exec:args'
from lib/db/src/db/query_methods.cr:275:7 in 'exec:args'
from spec/pg/encoder_spec.cr:12:5 in '->'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/example.cr:50:13 in 'internal_run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/example.cr:38:16 in 'run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/context.cr:20:23 in 'internal_run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/context.cr:346:14 in 'run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/context.cr:20:23 in 'internal_run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/context.cr:155:7 in 'run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/dsl.cr:237:7 in 'execute_examples'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/spec/dsl.cr:220:13 in '->'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/crystal/at_exit_handlers.cr:14:19 in 'run'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/crystal/main.cr:53:14 in 'exit'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/crystal/main.cr:48:5 in 'main'
from /opt/homebrew/Cellar/crystal/1.14.0_1/share/crystal/src/crystal/main.cr:130:3 in 'main'
test_insert_and_read "integer[]", [] of Int32 | ||
test_insert_and_read "integer[]", [1, 2, 3] | ||
test_insert_and_read "integer[]", [[1, 2], [3, 4]] | ||
test_insert_and_read "integer[][]", [[1, 2], [3, 4]] | ||
test_insert_and_read "integer[][][]", [[[1, 2], [3, 4]], [[5, 6], [7, 8]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually surprised that the integer[]
example worked before with a nested array. It still does when encoded as binary, too, but I figured I would make this more explicit.
@@decoders = Hash(Int32, PG::Decoders::Decoder).new(ByteaDecoder.new) | ||
# :nodoc: | ||
class_getter decoders = Hash(Int32, PG::Decoders::Decoder).new(ByteaDecoder.new) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this because I was going to use decoders as the source for oids, but I didn't end up using it. I can probably undo this change.
binary Bytes.empty, -1 | ||
end | ||
|
||
def self.encode(val : Bool, into slice : Bytes = Bytes.new(1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being able to encode a value into an existing slice is important for arrays and also allowed me to encode things like intervals and geo types in terms of numeric types.
end | ||
|
||
{% for type in %w[Int16 Int32 Int64 Float32 Float64] %} | ||
def self.encode(val : {{type.id}}, into slice : Bytes = Bytes.new(sizeof(typeof(val)))) | ||
IO::ByteFormat::NetworkEndian.encode val, slice |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoever added the ability encode numeric values directly into slices without an intermediate IO
gets the world's biggest high-five from me.
private OID_MAP = { | ||
Bool.name => 16, # boolean | ||
Bytes.name => 17, # bytea | ||
Char.name => 18, # char | ||
Int16.name => 21, # int2 | ||
Int32.name => 23, # int4 | ||
Int64.name => 20, # int8 | ||
String.name => 25, # text | ||
Float32.name => 700, # float4 | ||
Float64.name => 701, # float8 | ||
UUID.name => 2950, # uuid | ||
PG::Geo::Point.name => 600, # point | ||
PG::Geo::Path.name => 602, # path | ||
PG::Geo::Polygon.name => 604, # polygon | ||
PG::Geo::Box.name => 603, # box | ||
PG::Geo::LineSegment.name => 601, # lseg | ||
PG::Geo::Line.name => 628, # line | ||
PG::Geo::Circle.name => 718, # circle | ||
JSON::Any.name => 3802, # jsonb | ||
Time.name => 1184, # timestamptz | ||
Time::Span.name => 1186, # interval | ||
PG::Interval.name => 1186, # interval | ||
} of String => Int32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I was hoping to avoid when making decoders accessible from the PQ
namespace. I used type names as keys rather than the types themselves because I wanted to make it extensible, but I don't know how realistic that is.
I think that, in order to do this extensibly, it would require writing both encoders and decoders, because on top of the OID mapping, it also has to know the size of a given value to encode it, and I ended up making a SIZE_MAP
below. I don't like that three different data structures have to know about the types available.
This is part of why I don't know whether this PR was a good idea. I may have bitten off more than I had the motivation to chew here and I'm not sure it's worth restructuring the shard to support binary encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yeah I see what you're saying but also 44% less heap can make a lot of things worth it though
This file tests functionality that no lonfer applies
I like this in general. Not only for the less allocations on the crystal side, but I would have to imagine that postgres not having to parse as much text on the database side can be significant with write heavy workloads. |
That's a solid point. Scaling Postgres (especially for writes) is far more challenging than scaling out app servers when you hit a CPU threshold in Crystal app code. |
Been messing with this the past couple days. I'm not sure yet, but this may actually not be a good idea. Encoding values as binary doesn't seem to get the benefit of autocasting, such as from
int4
up toint8
. Either that or I've screwed something up.I'll annotate some of the weirdness I saw in review comments. Maybe someone knows how to fix/work around those things.
It did cut down pretty significantly on the amount of heap memory allocated per query with bind args, though. It was a lot more than I expected for a simple query — about 44% fewer bytes allocated across the complete query execution. There are still some places I want to optimize heap allocations (like in serializing array types), but that was pretty nice to see.
Benchmark code
On the
master
branch:With this PR:
Fixes #294