"Kafka Connect", an open source component of Apache Kafka, is a framework for connecting Kafka with external systems such as databases, key-value stores, search indexes, and file systems.
Kafka Connect is a framework which enables connectors developed by the open source community around Apache Kafka. It allows developers to easily import data from their data sources directly into Kafka, and then take that data from Kafka and then feed it into other systems like Elastic Search.
Adobe Experience Platform weaves all your critical customer data together in real time. This includes behavioral, transactional, financial, operational, and third-party data, not just CRM or other first-party data.
Bring in your data, standardize it, make it smarter. See what your customer wants right now and build experiences to match.
The Adobe Experience Platform Streaming Connector is based on Kafka Connect. Use this library to stream JSON events from Kafka topics in your datacenter directly into Adobe Experience Platform in real time.
The AEP Sink Connector delivers data from Kafka topics to a registered endpoint of Adobe Experience Platform; a minimal configuration sketch is shown after the feature list below.
- Seamlessly ingest events from your Kafka topic to Adobe Experience Platform
- Authenticated collection of data using Adobe's Identity Management Service
- Batching of messages for reduced network calls & higher throughput
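To give a sense of what running the connector looks like, here is a minimal sketch of registering a sink connector instance through the Kafka Connect REST API. The connector class and the aep.* property names below are illustrative placeholders; check the developer guide for the exact configuration keys supported by this connector.

curl -s -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
        "name": "aep-sink-connector",
        "config": {
          "connector.class": "com.adobe.platform.streaming.sink.impl.AEPSinkConnector",
          "topics": "connect-test",
          "key.converter": "org.apache.kafka.connect.storage.StringConverter",
          "value.converter": "org.apache.kafka.connect.storage.StringConverter",
          "aep.endpoint": "https://dcs.adobedc.net/collection/<Streaming Connection ID>"
        }
      }'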
If you have your own Kafka Connect cluster, you just need to drop in the AEP Streaming Connector, which is a Kafka Connect plugin uber JAR. Refer to the documentation on how to install a Kafka Connect plugin. Once the plugin has been installed in your Kafka Connect cluster, you can run streaming connector instances to send data to Adobe. Refer to the developer guide.
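As a rough sketch (the JAR name, directory, and service name below are assumptions about your environment), installing the plugin usually amounts to copying the uber JAR into a directory listed in the worker's plugin.path and restarting the worker:

# Copy the uber JAR produced by the Gradle build into a plugin.path directory
cp build/libs/streaming-connect-sink-*.jar /usr/share/java/kafka-connect-plugins/
# Restart the Kafka Connect worker so it discovers the new plugin
systemctl restart kafka-connect    # or restart the worker however you manage it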
If you have your own Kafka deployment but no Kafka Connect instance, you can use Docker to talk to your Kafka brokers and send data to Adobe.
docker run ghcr.io/adobe/experience-platform-streaming-connect --props connect.bootstrap.servers=<kafka-brokers:port>
We have included a quick-start script that automates configuration by creating the following artifacts:
- Sample XDM (Experience Data Model) schema
- Dataset to collect streamed records
- Data Collection URL
- Kafka topic on your local machine, along with necessary configuration for the AEP Connector
The following figure illustrates the steps simulated by the setup script.
brew install jq
./gradlew clean build -x test
docker build -t streaming-connect .
docker-compose up -d
Note: Wait for the Docker processes to start, then validate that Kafka Connect is running using the command below.
curl http://localhost:8083/connectors
[]
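You can also confirm that the worker has loaded the AEP sink plugin itself; connector-plugins is a standard Kafka Connect REST endpoint, and the sink connector class should appear in its output once the plugin is installed.

curl http://localhost:8083/connector-plugins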
First, you need to get an API Key and IMS Token for accessing Adobe Cloud Platform APIs. We recommend you start with this tutorial. There's also a super helpful blog post to better guide you through this process.
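For reference, the service-account (JWT) flow typically obtains the IMS token by exchanging a signed JWT for an access token. A minimal sketch of that exchange, following Adobe's JWT authentication documentation (substitute your own credentials):

curl -X POST https://ims-na1.adobelogin.com/ims/exchange/jwt \
  -d "client_id=<Client ID>" \
  -d "client_secret=<Client Secret>" \
  -d "jwt_token=<signed JWT>"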
docker exec -i experience-platform-streaming-connect_kafka-connect_1 ./setup.sh
The resulting output should look similar to the one below
Enter IMS ORG
<IMS-ORG>
Enter Client ID
***
Enter Client Secret
***
Enter JWT Token
***
Enter Schema Name: [default: Streaming_Connect_Schema_20191014074347]
Making call to create schema to https://platform.adobe.io/ with name Streaming_Connect_Schema_20191014074347
Schema ID: https://ns.adobe.com/<tenant>/schemas/<schema ID>
Enter Dataset Name: [default: Streaming_Ingest_Test_20191014074347]
Making call to create dataset to https://platform.adobe.io/ with name Streaming_Ingest_Test_20191014074347
Data Set: ["@/dataSets/<Dataset ID>"]
Enter Streaming Connection Name: [default: My Streaming Connection-20191014074347]
Enter Streaming Connection Source: [default: My Streaming Source-20191014074347]
Making call to create streaming connection to https://platform.adobe.io/ with name My Streaming Connection-20191014074347 and source My Streaming Source-20191014074347
Streaming Connection: https://dcs.adobedc.net/collection/<Streaming Connection ID>
AEP Sink Connector aep-sink-connector-20191014074347
Enter the number of Experience events to publish
100
Publishing 100 messages for Data set <Dataset ID> and schema https://ns.adobe.com/<tenant>/schemas/<schema ID>
Published 100 messages
The quick-start script saves the values for newly created resources, such as the schema and dataset, in application.conf, making it easier to run the test multiple times. Assuming the resources already exist, you can run the data generation script to send data to Adobe Experience Platform.
docker exec -i experience-platform-streaming-connect_kafka-connect_1 ./generate_data.sh <count>
Example: ./generate_data.sh 500
IMS ORG: XYZ@AdobeOrg
Schema ref: https://ns.adobe.com/<tenant>/schemas/090d01896b3cbd72dc7defff1290eb99
Dataset ID: 5d86d1a29ba7e11648cc3afb
Topic Name: connect-test-20190922211238
Publishing 500 messages for Data set 5d86d1a29ba7e11648cc3afb and schema https://ns.adobe.com/<tenant>/schemas/090d01896b3cbd72dc7defff1290eb99
Published 500 messages
Note: To view logs for debugging, you may use the following command in a different terminal.
docker logs experience-platform-streaming-connect_kafka-connect_1 -f
To verify your data is landing in the platform, log in to AEP and follow the documentation for monitoring your streaming data flows.
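If you prefer the command line, one way to spot-check ingestion is to list recent batches for your dataset via the Catalog API. The endpoint and headers below follow the public Platform API documentation; treat this as a sketch and adjust it to your environment:

curl "https://platform.adobe.io/data/foundation/catalog/batches?dataSet=<Dataset ID>&limit=5" \
  -H "Authorization: Bearer <IMS Token>" \
  -H "x-api-key: <Client ID>" \
  -H "x-gw-ims-org-id: <IMS-ORG>"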
To run experience-platform-streaming-connect locally step by step, refer to the Developer Guide.
Guidelines:
- If you haven't already, familiarize yourself with the semver specs
- Do not change the major version unless there is a breaking change.
- Bump the minor version when adding backward-compatible functionality or enhancing existing functionality.
- Bump the patch version when making changes that are completely isolated from the API.
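For example, starting from version 1.2.0: an isolated bug fix takes you to 1.2.1, a new backward-compatible feature takes you to 1.3.0, and a breaking change takes you to 2.0.0.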