Enhancement: support custom serializers for native types #3914

Open
umarbutler opened this issue Dec 27, 2024 · 0 comments
Labels
Enhancement This is a new feature or request


Summary

From my understanding, if we define a custom encoder or decoder in type_encoders or type_decoders for a native type like list, it won't be used. I assume this because Litestar is built on msgspec, which has the same behaviour, and because I have only noticed performance gains from my custom serializer when a handler outputs bytes directly after serialization, rather than outputting a native type and relying on a custom serializer.

I would like to suggest adding support for overriding the use of msgspec for particular native types, without having to output bytes or accept bytes as input.
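Purely as an illustration (this is hypothetical, not how Litestar behaves today), the kind of configuration I have in mind would look roughly like the following, where type_encoders/type_decoders entries for a native type such as list are actually honoured and the handler keeps its semantic return type:

import ormsgpack
from litestar import Litestar, get
from litestar.enums import MediaType

# Hypothetical sketch of the requested enhancement: let type_encoders / type_decoders
# entries for native types like list take effect, dispatching list <-> bytes to ormsgpack.
@get("/embedding", media_type=MediaType.MESSAGEPACK)
async def get_embedding() -> list[float]:  # semantic type hint kept, so OpenAPI stays accurate
    return [0.028076171875, 0.002166748046875]

app = Litestar(
    route_handlers=[get_embedding],
    # Today, entries for native types like list are ignored; the request is for them to be used.
    type_encoders={list: ormsgpack.packb},
    # Sketched as (predicate, decoder) pairs; the exact signature aside, the point is
    # being able to route native-type decoding to ormsgpack as well.
    type_decoders=[(lambda t: t is list, lambda t, value: ormsgpack.unpackb(value))],
)

With something like this, the handler body stays a plain list[float], the OpenAPI schema is generated from that hint, and the actual byte-level work is delegated to ormsgpack.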

My reasoning is that for very high-performance use cases (e.g. using Litestar for IPC; you'd be surprised to learn that Litestar can in fact be faster than gRPC in Python, probably because Python gRPC is often regarded as less optimised than its counterparts in other languages), dispatching a simple list -> bytes conversion to ormsgpack or orjson instead of msgspec can improve speed.

msgspec is quite fast to be sure, but it's not the fastest all round.

For decoding a simple list of two floats, ormsgpack shaves roughly 10 ns off a ~96 ns msgspec decode:

import msgspec
import ormsgpack

msgpack_encoder = msgspec.msgpack.encode

msgspec_msgpack_decoder = msgspec.msgpack.decode
msgspec_reused_decoder = msgspec.msgpack.Decoder().decode
msgspec_typed_msgpack_output_decoder = msgspec.msgpack.Decoder(list).decode
ormsgpack_msgpack_decoder = ormsgpack.unpackb

output = [0.028076171875, 0.002166748046875]
output_msgpack = msgpack_encoder(output)

%timeit msgspec_msgpack_decoder(output_msgpack) # 97.7 ns ± 1.77 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
%timeit msgspec_reused_decoder(output_msgpack) # 98.8 ns ± 0.822 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
%timeit msgspec_typed_msgpack_output_decoder(output_msgpack) # 96 ns ± 0.634 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
%timeit ormsgpack_msgpack_decoder(output_msgpack) # 86.6 ns ± 3.21 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
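
The snippet above only measures decoding. The encoding side (the list -> bytes direction mentioned earlier) can be compared the same way; I've left the timings out here since they will vary by machine:

import msgspec
import ormsgpack

# Encoding counterparts of the decoders benchmarked above.
msgspec_msgpack_encoder = msgspec.msgpack.encode
msgspec_reused_encoder = msgspec.msgpack.Encoder().encode
ormsgpack_msgpack_encoder = ormsgpack.packb

output = [0.028076171875, 0.002166748046875]

%timeit msgspec_msgpack_encoder(output)
%timeit msgspec_reused_encoder(output)
%timeit ormsgpack_msgpack_encoder(output)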

I could just do the serialization and deserialization in each endpoint function and output straight to bytes, but as mentioned earlier, I'd prefer to keep my type hints as semantic as possible. My understanding is also that the type hints inform the final OpenAPI output? Feel free to correct me if I'm wrong.
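
For reference, the workaround I'm trying to avoid looks roughly like this (route path and handler name are made up for the example): the handler returns pre-encoded bytes, so the return annotation no longer describes the real payload and OpenAPI can't document it as a list of floats.

import ormsgpack
from litestar import get
from litestar.enums import MediaType

@get("/embedding", media_type=MediaType.MESSAGEPACK)
async def get_embedding() -> bytes:  # bytes, not list[float]: the semantic type hint is lost
    return ormsgpack.packb([0.028076171875, 0.002166748046875])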

Basic Example

No response

Drawbacks and Impact

No response

Unresolved questions

No response
