raft-tech/kafka-summit-2021

Kafka Summit 2021: Securing the Message Bus with Kafka Streams

Overview

Organizations need to protect Personally Identifiable Information (PII). As Event Streaming Architecture (ESA) becomes ubiquitous in the enterprise, the prevalence of PII within data streams will only increase. Data architects must be aware of where their data pipelines could leak that data. In highly distributed systems, zero-trust networking has become an industry best practice. We can apply the same principle to Kafka by introducing message-level security.

A DevSecOps Engineer with some Kafka experience can leverage Kafka Streams to protect PII by enforcing role-based access control using Open Policy Agent. Rather than implementing a REST API to handle message-level security, Kafka Streams can filter or even transform outgoing messages to redact PII while relying on the native capabilities of Kafka.

In our proposed presentation, we will provide a live demonstration that consists of two consumers subscribing to the same Kafka topic, but receiving different messages based on the rules specified in Open Policy Agent. At the conclusion of the presentation, we will provide attendees with a GitHub repository, so that they can enjoy a sandbox environment for hands-on experimentation with message-level security.

Objective

The goal is to demonstrate how Open Policy Agent can provide fine-grained access control for Kafka topics, and how the Kafka Streams API can filter outgoing PII messages based on the end user. The demo shows three different Kafka users (bobjones, alicesmith, johnhernandez) authenticating to the Kafka broker and receiving different results while consuming from the same topic (pii).
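The per-user filtering and redaction idea can be pictured without Kafka at all. The following is a minimal sketch in plain Python, not the Kafka Streams code from this repository. The record shape, the `username` field, and the names in `PII_FIELDS` are assumptions made for illustration, since the actual message schema is not shown here.

```python
from typing import Dict, Iterable, List

# Hypothetical record shape: the real pii-datagen schema is not shown in this README.
Record = Dict[str, str]

# Assumed names of the sensitive fields, for illustration only.
PII_FIELDS = ("ssn", "phone")


def filter_for_user(records: Iterable[Record], user: str) -> List[Record]:
    """Keep only the records whose (assumed) 'username' field matches the consumer."""
    return [r for r in records if r.get("username") == user]


def redact(record: Record) -> Record:
    """Return a copy of the record with the assumed PII fields masked."""
    return {k: ("***" if k in PII_FIELDS else v) for k, v in record.items()}


# Two made-up messages standing in for the contents of the 'pii' topic.
stream = [
    {"username": "bobjones", "ssn": "123-45-6789", "phone": "555-0100"},
    {"username": "alicesmith", "ssn": "987-65-4321", "phone": "555-0101"},
]

print(filter_for_user(stream, "bobjones"))
print(redact(stream[1]))
```

Filtering drops whole messages, while redaction keeps the message and masks only the sensitive fields. Which of the two fits depends on what the downstream consumer still needs to see.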

Technical Stack

  • Confluent v5.5.2 (ZooKeeper, Broker, Control Center)
  • Open Policy Agent
  • Python 3.8.11
  • Java 11
  • Maven 3.8.1

Running via Docker-Compose

This demo is bootstrapped using the docker-compose.yaml file which starts up the following Docker containers:

  • Confluent (ZooKeeper, Broker, Control Center)
  • Open Policy Agent
  • PII fake data generator using Python
  • Kafka Streams for each of the three Kafka SASL users: bobjones, alicesmith, johnhernandez

To get started, run the following command:

# Prepend 'sudo' if required in your environment
docker-compose up -d

Note that the kafka-streams-johnhernandez container is expected to show Exit 0. This is because demo.rego does not grant johnhernandez access to any topic. See the section labeled Result for kafka-streams-johnhernandez for more information.

Example output when running docker-compose ps:

Screenshot: docker-compose ps

Examining the Logs

To examine the messages received by each of the three Kafka users, run the following command, replacing the [container_name] placeholder with the container you want to inspect:

docker logs -f [container_name]

Result for kafka-streams-bobjones

Screenshot: docker-compose logs -f kafka-streams-bobjones

As expected, when connecting as bobjones, we can only view the messages pertaining to him. The same goes for the other users.

Result for kafka-streams-johnhernandez

Screenshot: docker-compose logs -f kafka-streams-johnhernandez

As mentioned before, since we did not grant johnhernandez access to any of the topics via OPA, we receive a TopicAuthorizationException when attempting to consume messages.

Kafka and Open Policy Agent (OPA)

To integrate Kafka with Open Policy Agent, we created a derivative Docker image based on the confluentinc/cp-server:5.5.2 image. We used this GitHub repository, which provides the custom authorizer code that connects Kafka and OPA. With that in place, we can pass the following properties to the Kafka broker at startup:

# The fully qualified Java class name of the OPA authorizer
KAFKA_AUTHORIZER_CLASS_NAME: tech.goraft.kafka.opa.OpaAuthorizer
# The OPA endpoint for handling authorization
KAFKA_OPA_AUTHORIZER_URL: "http://opa:8181/v1/data/kafka/authz/allow"
KAFKA_OPA_AUTHORIZER_ALLOW_ON_ERROR: "false"
KAFKA_OPA_AUTHORIZER_CACHE_INITIAL_CAPACITY: 100
KAFKA_OPA_AUTHORIZER_CACHE_MAXIMUM_SIZE: 100
KAFKA_OPA_AUTHORIZER_CACHE_EXPIRE_AFTER_MS: 600000
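To make the URL and the allow-on-error setting more concrete, here is a rough Python sketch of how a client could query that OPA endpoint and interpret the answer. It is not the authorizer's actual code. The OPA Data API does accept a POST body of the form `{"input": ...}` and returns `{"result": ...}`, but the fields inside `example_input` are hypothetical, since the authorizer plugin defines its own input shape. Treating a missing `result` as a denial is also a choice made for this sketch.

```python
import json
import urllib.request

# Taken from KAFKA_OPA_AUTHORIZER_URL above.
OPA_URL = "http://opa:8181/v1/data/kafka/authz/allow"


def build_request(url: str, opa_input: dict) -> urllib.request.Request:
    """Wrap the input document in the {"input": ...} envelope used by the OPA Data API."""
    body = json.dumps({"input": opa_input}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decide(response_body: bytes, allow_on_error: bool = False) -> bool:
    """Interpret an OPA response body; fall back to allow_on_error if it is unreadable."""
    try:
        result = json.loads(response_body).get("result")
    except (ValueError, AttributeError):
        return allow_on_error
    # OPA omits "result" when the rule is undefined; this sketch treats that as a denial.
    return result is True


# Hypothetical input document: these field names are assumptions, not the plugin's schema.
example_input = {"user": "bobjones", "operation": "Read", "topic": "pii"}
request = build_request(OPA_URL, example_input)
print(request.get_method(), request.full_url)

# Canned response bodies, so the sketch runs without a live OPA server.
print(decide(b'{"result": true}'))
print(decide(b"{}"))
print(decide(b"not json", allow_on_error=False))
```

With `KAFKA_OPA_AUTHORIZER_ALLOW_ON_ERROR` set to "false", as in the configuration above, an unreachable or misbehaving OPA leads to denial rather than silent access, which is the safer default for PII.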

OPA's Rego Query Language

The Rego query language provides a declarative way to define policies. For this demo, a single Rego file was created to handle access control for the various Kafka users. After a user authenticates to the Kafka broker, the defined OPA policies are checked to see whether the user has access to the requested topic.
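The decision such a policy encodes can be modeled in a few lines of Python. This is only an illustration of the logic, not a translation of demo.rego, whose contents are not reproduced here. The `TOPIC_GRANTS` table is an assumption inferred from the results described in this README (bobjones and alicesmith can read pii, johnhernandez has no topics).

```python
from typing import Dict, Set

# Assumed grants, inferred from the demo results; the real rules live in demo.rego.
TOPIC_GRANTS: Dict[str, Set[str]] = {
    "bobjones": {"pii"},
    "alicesmith": {"pii"},
    "johnhernandez": set(),
}


def allow(user: str, topic: str) -> bool:
    """Default-deny check: a user may access a topic only if it is explicitly granted."""
    return topic in TOPIC_GRANTS.get(user, set())


for user in ("bobjones", "alicesmith", "johnhernandez", "unknownuser"):
    print(user, allow(user, "pii"))
```

The default-deny shape matters here: a user who is missing from the table is treated the same as johnhernandez, so a forgotten entry cannot expose the topic.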

How to Build/Run the Individual Applications

If you want to go beyond what is provided in the demo, the following sections will describe how to build and push the applications individually. Please verify the software defined in the Technical Stack section is properly installed on your machine.

Note: If you want to test your new image with the original demo, be sure to replace the appropriate image in the docker-compose.yaml.

Kafka-OPA

  1. Change directory to kafka-opa:
    cd kafka-opa
  2. Build the Java application using Maven
    mvn clean install
  3. Build the Docker image
    docker build . -t [registry_name]/kafka-opa
  4. Push the Docker image
    docker push [registry_name]/kafka-opa

PII-Datagen

  1. Change directory to pii-datagen:
    cd pii-datagen
  2. Build the Docker image
    docker build . -t [registry_name]/pii-datagen
  3. Push the Docker image
    docker push [registry_name]/pii-datagen

Kafka-Streams-Message-Security

  1. Change directory to kafka-streams-message-security:
    cd kafka-streams-message-security
  2. Build the Docker image
    docker build . -t [registry_name]/kafka-streams-message-security
  3. Push the Docker image
    docker push [registry_name]/kafka-streams-message-security
