Failed to reconstruct schema with avrov2 when Serialize() a struct containing interface{} fields #1372

Open
1 of 7 tasks
falau opened this issue Dec 27, 2024 · 0 comments

falau commented Dec 27, 2024

Description

hamba/avro provides a codegen tool, avrogen, that generates Go structs from an Avro schema, and I'm using it to migrate some code away from the v1 avro package. However, for some unions avrogen produces fields of type any (interface{}), and avrov2's Serialize() can't handle those fields unless UseSchemaID, UseLatestVersion, or UseLatestWithMetadata is set.
Specifically, in the following section, the serializer uses StructToSchema to reconstruct the Avro schema of the Go struct, then uses that schema to look up a schema ID in Schema Registry:

	if s.Conf.UseSchemaID == -1 &&
		!s.Conf.UseLatestVersion &&
		len(s.Conf.UseLatestWithMetadata) == 0 {
		msgType := reflect.TypeOf(msg)
		if msgType.Kind() != reflect.Pointer {
			return nil, errors.New("input message must be a pointer")
		}
		avroSchema, err = StructToSchema(msgType.Elem())
		if err != nil {
			return nil, err
		}
		info = schemaregistry.SchemaInfo{
			Schema: avroSchema.String(),
		}
	}

But StructToSchema can't infer the concrete type of an interface{} field generated by avrogen, so it fails with: StructToSchema: unknown type interface.

On the other hand, even if it worked, I doubt it could reconstruct a schema that is logically equivalent to the original one. For example, if the original schema of a field looks like this:

{"name": "foo","type": ["long","int","float","double"]}

avrogen turns it into

Foo any `avro:"foo"`

In this case, I can't think of a reasonable way for StructToSchema to figure out that the field might contain more than one type. I suspect this would cause the schema registry client to fail to find the schema in the registry.
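A rough sketch of the mismatch, using only the standard library. unionBranches and avroTypeOf are hypothetical helpers for illustration, and the Go-to-Avro mapping is an assumed subset. Note that this sketch is more generous than the serializer, because it looks at a runtime value while StructToSchema only receives a type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fieldSchema is the subset of an Avro field definition needed here.
type fieldSchema struct {
	Name string   `json:"name"`
	Type []string `json:"type"`
}

// unionBranches (hypothetical) parses a field whose type is a union of
// primitive type names and returns the branches.
func unionBranches(raw string) ([]string, error) {
	var f fieldSchema
	if err := json.Unmarshal([]byte(raw), &f); err != nil {
		return nil, err
	}
	return f.Type, nil
}

// avroTypeOf (hypothetical) maps one runtime value to one Avro primitive.
// The mapping is an illustrative subset, not the library's.
func avroTypeOf(v any) string {
	switch v.(type) {
	case int64:
		return "long"
	case int32:
		return "int"
	case float32:
		return "float"
	case float64:
		return "double"
	default:
		return "unknown"
	}
}

func main() {
	branches, err := unionBranches(`{"name": "foo","type": ["long","int","float","double"]}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(branches), branches)  // prints "4 [long int float double]"
	fmt.Println(avroTypeOf(int64(1234))) // prints "long"
}
```

A single value can recover at most one branch of the union, so a schema rebuilt this way would not match the four-branch schema that was registered.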

For now I'm working around this code path by configuring the serializer with a fixed schema ID, but that's pretty inflexible and defeats the purpose of having a schema registry for producers.

Maybe an approach similar to the v1 avro specific serializer below, which lets a struct pass a canned schema string to the schema registry client, could be reintroduced?

	var avroMsg SpecificAvroMessage
	switch t := msg.(type) {
	case SpecificAvroMessage:
		avroMsg = t
	default:
		return nil, fmt.Errorf("serialization target must be an avro message. Got '%v'", t)
	}
	var id int
	info := schemaregistry.SchemaInfo{
		Schema: avroMsg.Schema(),

avrogen also has an -encoders option that puts the schema and some helper methods on the generated structs.

How to reproduce

	type MyMessage struct {
		Foo any `avro:"foo"`
	}

	srClient, err := schemaregistry.NewClient(schemaregistry.NewConfig("http://example.com"))
	conf := avrov2.NewSerializerConfig() // UseSchemaID == -1 , UseLatestVersion == false ...
	ser, _ := avrov2.NewSerializer(srClient, serde.ValueSerde, conf)
	msg, err := ser.Serialize("foo", &MyMessage{Foo: int64(1234)})
	// err: StructToSchema: unknown type interface

Checklist

Please provide the following information:

  • confluent-kafka-go and librdkafka version (LibraryVersion()): v2.6.1 / v2.6.1
  • Apache Kafka broker version
  • Client configuration: ConfigMap{...}
  • Operating system
  • Provide client logs (with "debug": ".." as necessary)
  • Provide broker log excerpts
  • Critical issue