Schema Registry and REST Proxy as opt-in folder #102

Merged
merged 37 commits into from
Feb 3, 2018

Conversation

solsson
Contributor

@solsson solsson commented Nov 29, 2017

Replaces #45 with a 3.0+ & k8s 1.8 merge to master.

Upgrades to latest Confluent Platform, 4.0.0.

Note that the current image was built ignoring junit test exception(s): solsson/dockerfiles@209dccb

solsson and others added 30 commits July 29, 2017 14:53
with container configuration, as  Kubernetes creates the env SCHEMA_REGISTRY_PORT
if the service has been created first.
I'm not getting the expected behavior.
Might also be because of low resources on minikube,
but 3.3.0 should only be a few days away.
also the sample is from schema-registry.properties while
kafka-rest had only the stdout appender (and no per-package levels)
both zookeeper and bootstrap addresses.
Was getting connection errors to localhost:9092 broker id -1,
resulting in REST requests never returning.
Use config files and kafka-jre based build for Confluent Platform services
... env added to Landoop's services.
However, I don't want to add these services in production
because I don't understand the license, hence no manifests for them.
@solsson solsson added this to the v3.1 milestone Dec 13, 2017
@solsson
Contributor Author

solsson commented Dec 13, 2017

I might have misunderstood some fundamentals, but currently these two tools add little value in our stack. My expectation was that Schema Registry would simplify schema enforcement, and that REST Proxy would lower the barrier for microservice-write-to-kafka, through the JSON-encoded Avro concept.

Schema lookup at runtime is actually not that valuable. In Java services we want specific deserialization with classes generated from our Avro schemas. This means we have to manage schema files anyway (unless we want to use Schema Registry more like our Docker Registry). Libs like https://www.npmjs.com/package/kafka-avro can benefit from runtime lookup, but we already manage the files so why wait until after build?

REST Proxy fails to lower the barrier for schema-encoded writes, because you must provide either the actual schema or its id (correct me if I'm wrong):

  • If you provide the schema it will override any existing schema for that topic.
  • The ids aren't known at build time (unless, again, we use a central Schema Registry for dev/test/prod).
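
To illustrate the two options, here is a minimal sketch of the request bodies as I understand the REST Proxy v2 API: a `POST /topics/<topic>` with content type `application/vnd.kafka.avro.v2+json` carries either `value_schema` (the schema as a JSON-encoded string) or `value_schema_id`. The helper name `produce_body` and the sample schema are made up for this example, and nothing is sent over the network.

```python
import json

# Content type for Avro-embedded JSON in the v2 API (to the best of my knowledge)
AVRO_CONTENT_TYPE = "application/vnd.kafka.avro.v2+json"


def produce_body(records, schema=None, schema_id=None):
    """Build the JSON body for a produce request (hypothetical helper).

    Exactly one of `schema` (a dict) or `schema_id` (an int) must be given,
    which is the constraint discussed above.
    """
    if (schema is None) == (schema_id is None):
        raise ValueError("provide exactly one of schema or schema_id")
    body = {"records": [{"value": record} for record in records]}
    if schema is not None:
        # The schema travels as a JSON-encoded string, not as a nested object
        body["value_schema"] = json.dumps(schema)
    else:
        # The id is assigned by the registry, so it is only known at runtime
        body["value_schema_id"] = schema_id
    return json.dumps(body)


# Illustrative schema, not one from this repository
example_schema = {
    "type": "record",
    "name": "User",
    "fields": [{"name": "name", "type": "string"}],
}

with_schema = produce_body([{"name": "alice"}], schema=example_schema)
with_id = produce_body([{"name": "alice"}], schema_id=7)
```

Either way the caller needs the schema file at hand or an id obtained from a running registry, which is the barrier described above.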

Might still be useful for cases where creating a producer or consumer is impractical.
Also, this is a clean opt-in folder and I think it deserves its place in master.

@solsson
Contributor Author

solsson commented Dec 15, 2017

@solsson
Contributor Author

solsson commented Dec 17, 2017

@solsson
Contributor Author

solsson commented Dec 18, 2017

I think my disappointment stems from the naming: Schema Registry and REST Proxy sound like generic tools when marketed by the company backing Kafka. In fact they are quite biased tools for quite specific use cases, and based on the (in)activity in issues like json-schema support they will stay that way. These biases and use cases are hard, if not impossible, to spot in the documentation.

For example at Yolean we re-create our development environment many times per day. Schema Registry's id handling is hardly compatible with that. Furthermore, we wanted to use String keys with schema-enforced messages, but REST Proxy requires both to be Avro, for reasons unknown.
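
To make the id problem concrete, here is a rough sketch of the message framing used by Confluent's Avro serializers, as I understand the documented wire format: a zero magic byte, a 4-byte big-endian schema id, then the Avro binary payload. The function names `frame` and `unframe` are hypothetical, and the payload is treated as opaque bytes.

```python
import struct

MAGIC_BYTE = 0      # first byte of every framed message
HEADER = ">bI"      # big-endian: 1-byte magic + 4-byte unsigned schema id
HEADER_SIZE = struct.calcsize(HEADER)  # 5 bytes


def frame(schema_id: int, avro_payload: bytes) -> bytes:
    """Prefix an Avro-encoded payload with the magic byte and schema id."""
    return struct.pack(HEADER, MAGIC_BYTE, schema_id) + avro_payload


def unframe(message: bytes):
    """Split a framed message into (schema_id, avro_payload)."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message too short to contain a schema id header")
    magic, schema_id = struct.unpack(HEADER, message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError("unexpected magic byte")
    return schema_id, message[HEADER_SIZE:]


framed = frame(1, b"\x0aalice")
schema_id, payload = unframe(framed)
```

Because only the id is stored with each record, a message is readable only as long as the registry still maps that id to the same schema. A freshly re-created registry hands out ids again from scratch, so ids baked into test data or configuration stop being reliable.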

I've renamed this feature folder avro-tools. REST Proxy has some uses outside the Avro space, but I don't see any obvious value. KSQL might be a better integration point.

I recommend these tools if your clients are Java, and you want to use Confluent's kafka-streams-avro-serde. But beware of the tradeoffs: conversion between byte streams and attributes is usually synchronous and unit testable. Are you prepared to take it online, i.e. introduce runtime dependencies to which there are asynchronous calls? Such calls look to me like the anti-pattern that event-based architecture ideas argue against. No amount of caching and battle testing (which I'm sure Confluent has done right) can hide this added complexity.

@solsson
Contributor Author

solsson commented Feb 3, 2018

https://www.confluent.io/blog/put-several-event-types-kafka-topic/ is a good read, and hopefully confluentinc/schema-registry#680 will be in next schema registry release.

@solsson solsson merged commit ea1acda into master Feb 3, 2018