Enforce topic -> schema mapping. #221
Comments
@ottomata I'm not sure I understand how what you're suggesting differs from what we do today, since the REST proxy uses the Avro serializers, which require schemas to be registered before the data is sent. Or are the compatibility checks the schema registry enforces not strong enough restrictions for you? Would #122 do the trick? That is, is the real issue that you don't want auto registration at all, and that you'll handle registration via some other process, such as in your build/deployment steps?
Hmm, I think #122 is important, but that's not what I mean here. I'm talking about limiting the 'subjects' (correct term?) that can be produced to any given topic. Let's say:
Subject A has schema ids: 10
We want topic-A to only accept production of schemas in Subject A, and topic-B to only accept production of schemas in Subject B.
This would let you be sure that the schema(-subjects) that you expect to be in a topic are the only subjects there when you consume. You can be sure that some consumer that only consumes from topic-A will only ever get messages in schema Subject A.
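The restriction described above could be sketched as two lookups: one from topic to the subject it accepts, and one from schema id to the subject it belongs to. This is only an illustration. Every name and id below is hypothetical, it assumes each schema id belongs to exactly one subject, and nothing here is an existing REST Proxy or Schema Registry feature.

```python
# Hypothetical sketch: none of these names exist in REST Proxy or Schema Registry.

# Which subject each topic accepts (illustrative data).
TOPIC_TO_SUBJECT = {"topic-A": "Subject A", "topic-B": "Subject B"}

# Which subject each registered schema id belongs to (illustrative ids,
# assuming one subject per schema id).
SCHEMA_ID_TO_SUBJECT = {10: "Subject A", 20: "Subject B"}


def is_produce_allowed(topic: str, schema_id: int) -> bool:
    """Return True only if schema_id belongs to the subject mapped to topic."""
    allowed = TOPIC_TO_SUBJECT.get(topic)
    actual = SCHEMA_ID_TO_SUBJECT.get(schema_id)
    # Unknown topics and unknown schema ids are both rejected.
    return allowed is not None and allowed == actual
```

With this data, `is_produce_allowed("topic-A", 10)` passes while `is_produce_allowed("topic-A", 20)` is rejected, which is the guarantee a consumer of topic-A would rely on.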
I see. Confluent's serializers just bake this into the scheme used for generating subject names. Producers map the key and value when producing to a topic as
Given that you're using JSON, I'm guessing you're looking for the REST proxy to provide this restriction when the serializer does not? Or did you want this to happen via the schema registry somehow? Regardless, I think this would require #220 to start with. Then implementing the equivalent of the Avro serializer's functionality shouldn't be a problem since we'd just integrate it with the JSON serializers (which have been added in #193).
Yes, but I would like this restriction to be in place for any schema-validated data: JSON Schema, Avro, or whatever future schema format. I just want to ensure that consumers only get schemas that they expect. For Avro, this would be anything within a versioned set of schemas (in a subject?). For JSON Schema, I'm not sure. Since JSON Schema doesn't have built-in schema evolution, I'm not sure how it would be versioned, but perhaps that is something consumers would just have to deal with. You could semantically evolve JSON Schemas, but there would be no built-in compatibility support or validation. Anyway...I digress, and am now talking about #220 related stuff. :)
Hm, not sure I follow here. Confluent's serializers...in the REST Proxy? That is, if a schema (value) is auto registered via a produce message, the subject in the Schema registry will be
@ottomata I believe you could still disable auto registration and then just pass the schema id in your produce requests. Would that help?
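As a rough sketch of passing a schema id instead of a full schema, the helper below builds a produce request body. It assumes the REST Proxy v2 Avro produce format, which as far as I know accepts a `value_schema_id` field alongside `records`. The function name is hypothetical and the schema id would be whatever your registry assigned.

```python
import json


def build_avro_produce_body(value_schema_id: int, values: list) -> str:
    """Build a produce request body that references an already registered schema.

    Assumes the v2 Avro produce format with a top-level `value_schema_id`.
    """
    return json.dumps(
        {
            "value_schema_id": value_schema_id,
            # Each record carries only its value; keys are omitted in this sketch.
            "records": [{"value": value} for value in values],
        }
    )
```

The resulting string would be sent as the POST body to the topic endpoint with an Avro content type, so the proxy never needs to register a schema on the producer's behalf.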
Hi, I'm not following this so much anymore. Wikimedia has decided not to use Confluent products here, as they don't support JSONSchema, and building JSONSchema support into the right places was deemed more difficult than just writing a new service. We're currently working on a Node.js-based HTTP POST JSONSchema validation -> Kafka produce service. I'll close this issue, thanks!
Landed here after searching for a way to enforce Avro messages in REST proxy. E.g. I have a TestTopic1 with a schema for it:
curl -s "http://$rest/topics/TestTopic1" | jq # returns topic info
curl -s "http://$schema/subjects/TestTopic1-value/versions/latest/schema" | jq # returns schema
But I can still post JSON messages to this topic:
curl -s -X POST "http://$rest/topics/TestTopic1" -H "Content-Type: application/vnd.kafka.json.v2+json" -H "Accept: application/vnd.kafka.v2+json" -d '{"records": [{"value": {"what": "ever"}}]}' | jq
This is not desired: now there is a message in the topic that is not schema compatible at all, which might break consumers that are not aware of it. At the moment, it seems the fastest way is to configure an nginx reverse proxy that checks that the Content-Type header is Avro and rejects requests otherwise.
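The Content-Type gate suggested above could look roughly like the following check, written here in Python rather than nginx config purely for illustration. It assumes `application/vnd.kafka.avro.v2+json` is the only content type you want to let through. The function and constant names are hypothetical.

```python
from typing import Optional

# Content types allowed through the gate (assumption: v2 Avro only).
AVRO_CONTENT_TYPES = {"application/vnd.kafka.avro.v2+json"}


def is_avro_produce(content_type: Optional[str]) -> bool:
    """Return True if the Content-Type header names an allowed Avro media type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in AVRO_CONTENT_TYPES
```

A proxy in front of REST Proxy would reject any produce request for which this returns False. Note that this only checks the header; whether the payload matches the topic's schema is still left to the Avro serializer.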
It'd be handy if REST proxy could enforce producing only a certain series of a schema to a particular topic. This would require registering a topic to have a particular schema series, perhaps in schema registry.
Thoughts? Is this a bad idea?