This repository has been archived by the owner on Mar 24, 2021. It is now read-only.

Kafka 0.9 Discussion Thread #349

Closed
rduplain opened this issue Nov 12, 2015 · 31 comments

Comments

@rduplain
Contributor

This thread is to discuss Kafka 0.9.0.0 changes in relation to pykafka.

Kafka 0.9.0.0 release candidate 1 is out.
https://people.apache.org/~junrao/kafka-0.9.0.0-candidate1/
https://people.apache.org/~junrao/kafka-0.9.0.0-candidate1/RELEASE_NOTES.html

Voting is in progress:
https://www.mail-archive.com/[email protected]/msg37949.html

Release process:
https://cwiki.apache.org/confluence/display/KAFKA/Release+Process

pykafka status and roadmap as of Nov 20, 2015:
http://pykafka.readthedocs.org/en/latest/roadmap.html

@rduplain
Contributor Author

In our initial smoke tests, BalancedConsumer is not compatible, but SimpleConsumer is.

The birding project is able to run against Kafka 0.9.0.0 release candidate 1 with a SimpleConsumer:
https://github.com/Parsely/birding/tree/kafka-0.9
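For readers who want to repeat a smoke test of this kind, a rough sketch follows. It is not taken from the birding project. It assumes pykafka's `Topic.get_simple_consumer` with the `consumer_timeout_ms` option; the helper name `read_available`, the broker address, and the topic name are placeholders made up for illustration.

```python
def read_available(topic, max_messages=10, timeout_ms=5000):
    """Read up to max_messages from a pykafka Topic with a SimpleConsumer.

    `topic` is expected to be a pykafka Topic (hypothetical helper, not
    part of pykafka itself).
    """
    # consumer_timeout_ms makes iteration end once the topic has been idle
    # for that long, instead of blocking forever on an empty topic.
    consumer = topic.get_simple_consumer(consumer_timeout_ms=timeout_ms)
    messages = []
    try:
        for message in consumer:
            if message is not None:
                messages.append((message.offset, message.value))
            if len(messages) >= max_messages:
                break
    finally:
        # Always release the consumer's connections and worker threads.
        consumer.stop()
    return messages


# Usage against a live broker (host and topic name are placeholders):
#
#   from pykafka import KafkaClient
#   client = KafkaClient(hosts="127.0.0.1:9092")
#   print(read_available(client.topics[b"test"]))
```

Taking the topic as a parameter keeps the helper independent of how the client is configured, so the same function can be pointed at a 0.8.2 and a 0.9.0 cluster in turn.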

@amontalenti
Contributor

Pulling in @emmett9001, @yungchin & @kbourgoin. I did an initial smoke test of 0.9 with a balanced consumer from pykafka, and it failed to consume from either an empty or a filled topic, raising pykafka.exceptions.ConsumerStoppedException. The Kafka logs also showed several errors. I suspect that it's going to take some work to make this upgrade go smoothly, due to all the consumer protocol changes. We should make sure to get a branch of pykafka that works with 0.9 so that we can provide feedback on the RC before it is formally released.

@gth828r

gth828r commented Nov 13, 2015

Is there any estimated timeline on adding in support for the new consumer API? Is the goal to have something ready around the time that the official Kafka 0.9.0.0 release happens?

@rduplain
Contributor Author

@amontalenti and team wrote this roadmap doc which is a great read for all interested in the state of pykafka and plans for the project.

@gth828r started a thread on the mailing list for this topic.

@gth828r

gth828r commented Nov 20, 2015

It sounds like there will be a 0.9.0-compatible balanced consumer which is great news! I think that is the main thing we were hoping for on our end anyway, and I'm guessing that should be available sooner than a new consumer API implementation would be.

One thing that I wasn't clear on from the roadmap -- perhaps because it is more speculative -- is the priority of implementing the new consumer API once you are comfortable with its stability. Since you have a plan for 0.9.0 compatibility, is that something that is going to be punted on for now in terms of planning?

@ijuma

ijuma commented Nov 21, 2015

The 0.9.0 broker is meant to support 0.8.2 clients (the old Java consumers and producers still work fine, for example). The version number changed from 0.8.3 to 0.9.0.0 simply because of the large number of improvements in the release; the compatibility guarantee is the same as it would have been for 0.8.3. Note, though, that 0.9.0 clients don't work with 0.8.2 brokers; see https://issues.apache.org/jira/browse/KAFKA-2845.

@amontalenti
Contributor

@ijuma Excellent -- we're going to kick off some tests next week to confirm if pykafka still works with the 0.9.0 RC. Our initial test showed that it failed to consume a message while succeeding on 0.8.2, but it may be something on our side. We'll provide an update when we have some results.

@amontalenti
Contributor

@gth828r To answer your question:

is the priority of implementing the new consumer API once you are comfortable with its stability?

The answer is "yes and no". Our first priority is to make sure that pykafka can work with the 0.9.0 broker and do balanced consuming the same way as it did in 0.8.2 (via the "old" consumer API and Zookeeper).

Once that is stable and working, pykafka will have a single source that works equally well with 0.8.2 and 0.9.0, which is the immediate goal.

We will then begin exploring the new 0.9.0 consumer APIs. What we aren't sure about right now is whether switching pykafka to those APIs will require breaking compatibility with 0.8.2, and how all of that relates to the parallel work going on in the librdkafka community.

@ijuma

ijuma commented Nov 21, 2015

Great @amontalenti, sounds good. Geoff wrote some upgrade tests to verify the compatibility: https://issues.apache.org/jira/browse/KAFKA-1888 (maybe useful to check them). We haven't yet deprecated any of the old Java clients (or the protocol used by them) to give users time to migrate.

One more note: the new consumer is marked as beta because this is the first release that includes it. As happened with the new producer introduced in 0.8.2 (no longer beta as of 0.9.0.0), we would like to give users a chance to test it in production workloads before we remove the beta label. Naturally, the new consumer solves a number of issues present in the old ones (that's why we introduced it) and, in our testing, it's as good as or better than the old ones.

I believe librdkafka already supports the new consumer (although maybe not in a released version), but @edenhill would be able to say for sure.

@edenhill

The next librdkafka release (0.9.0) will be feature aligned with the new official Java clients.
This includes:

  • New high-level balanced KafkaConsumer (new API)
  • Client/consumer groups with client-side partition assignment (future new API, currently with two builtin assignors: range and roundrobin - compatible with the new Java consumer)
  • Broker based offset storage (config and new API)
  • SSL support (config)
  • SASL support (config)
  • Broker based quota support (included in stats, new throttle_cb API)
  • Native win32 support

The ones marked with "config" are just new configuration properties, they do not require any new code on your side (e.g., upgrading librdkafka on a host will seamlessly provide SSL and SASL support for existing clients and bindings), while the ones marked with new API will require appropriate integration.
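For readers unfamiliar with the two assignors named in the list, the following is a rough pure-Python sketch of how range and roundrobin assignment are commonly described. It is not librdkafka's or the Java client's code. The function names are made up for illustration, and the sketch assumes at least one consumer and that every consumer subscribes to every topic.

```python
from itertools import cycle


def range_assign(consumers, partitions_per_topic):
    """Range: split each topic's partitions into contiguous chunks."""
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for topic in sorted(partitions_per_topic):
        count = partitions_per_topic[topic]
        per_member, extra = divmod(count, len(members))
        start = 0
        for index, member in enumerate(members):
            # The first `extra` members each take one additional partition.
            length = per_member + (1 if index < extra else 0)
            assignment[member].extend(
                (topic, p) for p in range(start, start + length)
            )
            start += length
    return assignment


def roundrobin_assign(consumers, partitions_per_topic):
    """Roundrobin: deal all partitions of all topics out one at a time."""
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    all_partitions = [
        (topic, p)
        for topic in sorted(partitions_per_topic)
        for p in range(partitions_per_topic[topic])
    ]
    for partition, member in zip(all_partitions, cycle(members)):
        assignment[member].append(partition)
    return assignment


if __name__ == "__main__":
    topics = {"a": 3, "b": 3}  # topic name -> number of partitions
    print(range_assign(["c1", "c2"], topics))
    print(roundrobin_assign(["c1", "c2"], topics))
```

With two consumers and two three-partition topics, range gives the first consumer four partitions and the second two, while roundrobin gives each consumer three. That difference is the usual reason to prefer roundrobin when consumers subscribe to several topics.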

Things are currently in an alpha state and development is ongoing, but if you want to take a preliminary look, see the oct15 branch.

emmettbutler added this to the 2.2.0 milestone Nov 24, 2015

@amontalenti
Contributor

@emmett9001 @yungchin @rduplain @kbourgoin The Kafka 0.9 release candidate just passed vote, so the latest RC is going to be released officially. Looks like we need to jump on testing it to make sure pykafka nominally works with the existing 0.8.2-era consumer API.

https://groups.google.com/d/msg/kafka-clients/8jsQQctjbH4/KcnHGodUAwAJ

@emmett9001, can you start testing?

Also, thanks @edenhill & @ijuma for those pointers that will assist us with testing things.

@emmettbutler
Contributor

Looks like the consumer does fail when running against 0.9.0. I'll have to look into it. The producer seems fine, though.

@yungchin
Contributor

With the mini-change currently in #374 I can run our consumer against a 0.9 cluster. The tests pass if you run them individually and tear down the cluster manually in between (the problem being that our test cluster tear-down code is incompatible).

@yungchin
Contributor

yungchin commented Dec 9, 2015 via email

@Arttii

Arttii commented Jan 28, 2016

I was wondering what is the status of integrating the new Consumer API?

@emmettbutler
Contributor

@Arttii The new group membership API is being worked on in a pull request. It won't be done for a bit, but it's coming along just fine.

@ibotty

ibotty commented Feb 4, 2016

I was wondering whether tls client certs or kerberos are supported. I couldn't find it in the api docs, but honestly did not try it out yet.

@amontalenti
Contributor

@ibotty Work on this has begun by @yungchin, but it is not in a usable state just yet.

@edenhill

edenhill commented Feb 4, 2016

Technically it should be possible to get it going using the rdkafka backend since both SSL and SASL are purely configuration-based features.
See here:
https://github.com/edenhill/librdkafka/wiki/Using-SSL-with-librdkafka
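As an illustration of what "purely configuration-based" means here, the sketch below collects the SSL-related librdkafka properties into a dictionary. The property names are the ones listed in librdkafka's configuration documentation; the helper name `ssl_properties` and all file paths are placeholders. How these properties would be handed to pykafka's rdkafka backend is not covered in this thread, so that step is deliberately left out.

```python
def ssl_properties(ca_path, cert_path=None, key_path=None, key_password=None):
    """Build librdkafka SSL properties (hypothetical helper).

    Only the CA location is required for server verification; the
    certificate and key entries are needed when the broker asks for a
    TLS client certificate.
    """
    props = {
        "security.protocol": "ssl",
        "ssl.ca.location": ca_path,
    }
    optional = {
        "ssl.certificate.location": cert_path,
        "ssl.key.location": key_path,
        "ssl.key.password": key_password,
    }
    # Leave out anything the caller did not supply.
    props.update({k: v for k, v in optional.items() if v is not None})
    return props


if __name__ == "__main__":
    # All paths are placeholders.
    for name, value in ssl_properties(
        "/etc/kafka/ca.pem",
        cert_path="/etc/kafka/client.pem",
        key_path="/etc/kafka/client.key",
    ).items():
        print("%s=%s" % (name, value))
```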

@amontalenti
Contributor

@edenhill Indeed, we noticed that. @yungchin's plan is to first integrate SSL via the rdkafka binding, and then look at the pure Python implementation.

emmettbutler modified the milestones: 2.2.1, 2.2.0 Feb 5, 2016

@emmettbutler
Contributor

Information about the operational aspects of TLS/Kerberos support in kafka is available here

@Ormod

Ormod commented Feb 8, 2016

I made a PR for kafka-python to add TLS support, the commit for that can be found at: Ormod/kafka-python@45a0510

Feel free to use the code or ideas from that commit for pykafka as well.

@yungchin
Contributor

@Ormod thanks a lot, that's super-generous. I'll certainly get a lot of use out of it.

@emmettbutler
Contributor

#416 implementing support for the group membership API is now ready for final review.

@brianbruggeman

I am just curious about the ETA before I start implementing yet another Kafka library in Python. In addition, Kafka release 0.10, which includes the desirable KafkaStream, is going to be released in Q2 2016. I realize that it's worth time investment to upgrade and with a production environment already in place for Parse.ly, you may be less hesitant to push forward on a new version of Kafka, but you also risk losing users by not continuing to sync back with the main project.

@amontalenti
Contributor

@brianbruggeman The latest version of pykafka already supports 0.9 and the code in PR #416 adds support for the new 0.9 consumer API, which is now code complete, pending review and production testing.

I would advise against starting your own effort and instead encourage participation in this one, which is happening here in the open.

We are not hesitant to stay current, quite the opposite!

We also have a blog post draft pending review/edits which will advertise pykafka's new 0.9-relevant features to the community. Stay tuned!

@amontalenti
Contributor

cc ^ @brianbruggeman

@emmettbutler
Contributor

Group membership API support has been merged and released in 2.3.0.
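A minimal sketch of how the two balancing modes might be selected from user code follows. It assumes the `managed` keyword on `Topic.get_balanced_consumer` as described in the pykafka documentation for this release; the helper name `make_balanced_consumer`, the group name, and the Zookeeper address are placeholders made up for illustration.

```python
def make_balanced_consumer(topic, group, use_group_api=False,
                           zookeeper_connect="127.0.0.1:2181"):
    """Create a balanced consumer on a pykafka Topic (hypothetical helper)."""
    if use_group_api:
        # Broker-coordinated balancing through the Kafka 0.9 group
        # membership API; no Zookeeper connection is passed in this mode.
        return topic.get_balanced_consumer(consumer_group=group, managed=True)
    # Zookeeper-coordinated balancing, as with the 0.8.2-era consumer.
    return topic.get_balanced_consumer(
        consumer_group=group,
        zookeeper_connect=zookeeper_connect,
    )


# Usage against a live cluster (all names are placeholders):
#
#   from pykafka import KafkaClient
#   client = KafkaClient(hosts="127.0.0.1:9092")
#   consumer = make_balanced_consumer(
#       client.topics[b"test"], b"testgroup", use_group_api=True)
#   for message in consumer:
#       print(message.offset, message.value)
```

Keeping the choice behind one flag lets a deployment that still runs 0.8.2 brokers stay on the Zookeeper path until its cluster is upgraded.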

@emmettbutler
Contributor

I'm closing this thread since the last core 0.9-related effort is being tracked here cc @yungchin

@yudong2015

@amontalenti @yungchin I found that pykafka does not support Kerberos for now. Is there any plan to support Kerberos?

@emmettbutler
Contributor

emmettbutler commented Apr 10, 2018

@yudong2015 There is a ticket open for supporting SASL that we're interested in community contributions on. #651
