From 99aeefea6f2e8c96b873630e2beac0a8445cbb16 Mon Sep 17 00:00:00 2001
From: Ryan Anguiano
Date: Mon, 25 Sep 2017 14:00:42 -0400
Subject: [PATCH] Update Readme and config options

---
 README.md                                   | 136 ++++++++++----------
 lib/logstash/codecs/avro_schema_registry.rb |  54 ++++++--
 2 files changed, 105 insertions(+), 85 deletions(-)

diff --git a/README.md b/README.md
index a75e88d..e649c65 100644
--- a/README.md
+++ b/README.md
@@ -1,86 +1,80 @@
-# Logstash Plugin
+# Logstash Codec - Avro Schema Registry
 
-This is a plugin for [Logstash](https://github.com/elastic/logstash).
+This plugin is used to serialize Logstash events as
+Avro datums, as well as to deserialize Avro datums into
+Logstash events.
 
-It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way.
+Decode/encode Avro records as Logstash events using the
+associated Avro schema from a Confluent schema registry
+(https://github.com/confluentinc/schema-registry).
 
-## Documentation
+## Decoding (input)
 
-Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation are placed under one [central location](http://www.elastic.co/guide/en/logstash/current/).
+When this codec is used to decode the input, you may pass the following options:
+- ``endpoint`` - always required.
+- ``username`` - optional.
+- ``password`` - optional.
 
-- For formatting code or config example, you can use the asciidoc `[source,ruby]` directive
-- For more asciidoc formatting tips, see the excellent reference here https://github.com/elastic/docs#asciidoc-guide
+## Encoding (output)
 
-## Need Help?
+This codec uses the Confluent schema registry to register a schema and
+encode the data in Avro using schema_id lookups.
 
-Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum.
+When this codec is used to encode, you may pass the following options:
+- ``endpoint`` - always required.
+- ``username`` - optional.
+- ``password`` - optional.
+- ``schema_id`` - when provided, no other options are required.
+- ``subject_name`` - required when there is no ``schema_id``.
+- ``schema_version`` - when provided, the schema is looked up in the registry.
+- ``schema_uri`` - when provided, the JSON schema is loaded from a URL or file.
+- ``schema_string`` - required when there is no ``schema_id``, ``schema_version``, or ``schema_uri``.
+- ``check_compatibility`` - when true, checks schema compatibility before encoding.
+- ``register_schema`` - when true, registers the JSON schema if it does not exist.
+- ``binary_encoded`` - when true, outputs the encoded event as a ByteArray.
+  Requires the ``ByteArraySerializer`` to be set in the Kafka output config.
 
-## Developing
+## Usage
 
-### 1. Plugin Developement and Testing
+### Basic usage with Kafka input and output
 
-#### Code
-- To get started, you'll need JRuby with the Bundler gem installed.
-
-- Create a new plugin or clone and existing from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example).
-
-- Install dependencies
-```sh
-bundle install
 ```
 
-#### Test
-
-- Update your dependencies
-
-```sh
-bundle install
+input {
+  kafka {
+    ...
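+    # each incoming record embeds a schema id; the codec fetches that
+    # schema from the endpoint below to decode the Avro payload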
+    codec => avro_schema_registry {
+      endpoint => "http://schemas.example.com"
+    }
+  }
+}
+filter {
+  ...
+}
+output {
+  kafka {
+    ...
+    codec => avro_schema_registry {
+      endpoint => "http://schemas.example.com"
+      subject_name => "my_kafka_subject_name"
+      schema_uri => "/app/my_kafka_subject.avsc"
+      register_schema => true
+    }
+  }
+}
+```
 
-- Run tests
+### Binary encoded Kafka output
 
-```sh
-bundle exec rspec
-```
-
-### 2. Running your unpublished Plugin in Logstash
-
-#### 2.1 Run in a local Logstash clone
-
-- Edit Logstash `Gemfile` and add the local plugin path, for example:
-```ruby
-gem "logstash-codec-awesome", :path => "/your/local/logstash-codec-awesome"
-```
-- Install plugin
-```sh
-bin/logstash-plugin install --no-verify
-```
-- Run Logstash with your plugin
-```sh
-bin/logstash -e 'codec {awesome {}}'
 ```
-At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash.
-
-#### 2.2 Run in an installed Logstash
-
-You can use the same **2.1** method to run your plugin in an installed Logstash by editing its `Gemfile` and pointing the `:path` to your local plugin development directory or you can build the gem and install it using:
-
-- Build your plugin gem
-```sh
-gem build logstash-codec-awesome.gemspec
-```
-- Install the plugin from the Logstash home
-```sh
-bin/logstash-plugin install /your/local/plugin/logstash-codec-awesome.gem
-```
-- Start Logstash and proceed to test the plugin
-
-## Contributing
-
-All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin.
-
-Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here.
-
-It is more important to the community that you are able to contribute.
-
-For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
+output {
+  kafka {
+    ...
+    codec => avro_schema_registry {
+      endpoint => "http://schemas.example.com"
+      schema_id => 47
+      binary_encoded => true
+    }
+    value_serializer => "org.apache.kafka.common.serialization.ByteArraySerializer"
+  }
+}
+```
\ No newline at end of file
diff --git a/lib/logstash/codecs/avro_schema_registry.rb b/lib/logstash/codecs/avro_schema_registry.rb
index 86b3497..51707be 100644
--- a/lib/logstash/codecs/avro_schema_registry.rb
+++ b/lib/logstash/codecs/avro_schema_registry.rb
@@ -12,34 +12,42 @@
 MAGIC_BYTE = 0
 
-# Read serialized Avro records as Logstash events
+# == Logstash Codec - Avro Schema Registry
 #
 # This plugin is used to serialize Logstash events as
 # Avro datums, as well as deserializing Avro datums into
 # Logstash events.
 #
+# Decode/encode Avro records as Logstash events using the
+# associated Avro schema from a Confluent schema registry
+# (https://github.com/confluentinc/schema-registry).
 #
-# ==== Decoding (input)
 #
-# This codec is for deserializing individual Avro records. It looks up
-# the associated avro schema from a Confluent schema registry.
-# (https://github.com/confluentinc/schema-registry)
+# ==== Decoding (input)
 #
-# When this codec is used on the input, only the ``endpoint`` option is required.
+# When this codec is used to decode the input, you may pass the following options
+# (a minimal example follows the list):
+# - ``endpoint`` - always required.
+# - ``username`` - optional.
+# - ``password`` - optional.
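+#
+# A minimal decode example; the endpoint and credentials below are
+# placeholder values for illustration:
+#
+# [source,ruby]
+# ----------------------------------
+# input {
+#   kafka {
+#     ...
+#     codec => avro_schema_registry {
+#       endpoint => "http://schemas.example.com"
+#       username => "registry_user"
+#       password => "registry_pass"
+#     }
+#   }
+# }
+# ----------------------------------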
 #
 # ==== Encoding (output)
 #
 # This codec uses the Confluent schema registry to register a schema and
 # encode the data in Avro using schema_id lookups.
 #
-# You can pass several options:
-# - ``endpoint`` is always required.
-# - If ``schema_id`` is provided, no other options are required.
-# - Otherwise, ``subject_name`` is required.
-# - If ``schema_version`` is provided, the schema will be looked up in the registry.
-# - Otherwise, a JSON schema is loaded from either ``schema_uri``, or ``schema_string``
-# - If ``check_compatibility`` is true, always check compatibility.
-# - If ``register_schema`` is true, register the JSON schema if it does not exist.
+# When this codec is used to encode, you may pass the following options:
+# - ``endpoint`` - always required.
+# - ``username`` - optional.
+# - ``password`` - optional.
+# - ``schema_id`` - when provided, no other options are required.
+# - ``subject_name`` - required when there is no ``schema_id``.
+# - ``schema_version`` - when provided, the schema is looked up in the registry.
+# - ``schema_uri`` - when provided, the JSON schema is loaded from a URL or file.
+# - ``schema_string`` - required when there is no ``schema_id``, ``schema_version``, or ``schema_uri``.
+# - ``check_compatibility`` - when true, checks schema compatibility before encoding.
+# - ``register_schema`` - when true, registers the JSON schema if it does not exist.
+# - ``binary_encoded`` - when true, outputs the encoded event as a ByteArray.
+#   Requires the ``ByteArraySerializer`` to be set in the Kafka output config.
 #
 # ==== Usage
 # Example usage with Kafka input and output.
@@ -69,6 +77,24 @@
 # }
 # }
 # ----------------------------------
+#
+# Binary encoded Kafka output
+#
+# [source,ruby]
+# ----------------------------------
+# output {
+#   kafka {
+#     ...
+#     codec => avro_schema_registry {
+#       endpoint => "http://schemas.example.com"
+#       schema_id => 47
+#       binary_encoded => true
+#     }
+#     value_serializer => "org.apache.kafka.common.serialization.ByteArraySerializer"
+#   }
+# }
+# ----------------------------------
+
 class LogStash::Codecs::AvroSchemaRegistry < LogStash::Codecs::Base
   config_name "avro_schema_registry"
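+
+  # Encoded output follows the Confluent wire format: MAGIC_BYTE (0),
+  # then a 4-byte big-endian schema id, then the Avro-encoded datum.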