Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP Sink MVP #1013

Merged
merged 6 commits into from
Feb 13, 2024
Merged

HTTP Sink MVP #1013

merged 6 commits into from
Feb 13, 2024

Conversation

davidsloan
Copy link
Collaborator

@davidsloan davidsloan commented Dec 14, 2023

Work to date on a HTTP sink.

Not yet covered:

  • deletion (via tombstones)
  • tests with tls certificates

This is a work in progress and should not be used in a production environment.

Configuration
For speed of development the configuration is provided via a Kafka Connect configuration property (connect.http.config), through the means of a Json template. Here is an example:

{
   "authentication":{
      "username":"user",
      "password":"pass",
      "type":"BasicAuthentication"
   },
   "method":"Put",
   "endpoint":"http://{{clientSubDomain}}.example.com",
   "content":"<cost><costCentre>{{clientCostCentre}}</costCentre><spender>{{employee}}</spender><description>{{costDescription}}</description></body></cost>",
   "headers":[
      [
         "Content-Type",
         "application/xml"
      ],
      [
         "X-Client-Cost-Centre",
         "{{clientCostCentre}}"
      ]
   ]
}

There was a discussion about using KCQL however it was decided that this is not suitable for this use case.

Templating

Two different types of templates can be provided:

XML is used here as an example, in fact any markup language or text content can be supported.

Single Template

For when you want to substitute the contents of a single message into the template content. It could be used for multiple messages but this wouldn't make much sense as there is no looping part.

      <xml>
          <message>
            <topic>{{topic}}</topic>
            <employee>{{value.employeeId}}</employee>
            <order>{{value.orderNo}}</order>
            <groupDomain>{{value.groupDomain}}</groupDomain>
          </message>
      </xml>

Multi Template

For when you want to merge multiple messages inside a single HTTP request. It loops through every message.

      <xml>
        <messages>
          {{#message}}
          <message>
            <topic>{{topic}}</topic>
            <employee>{{value.employeeId}}</employee>
            <order>{{value.orderNo}}</order>
            <groupDomain>{{value.groupDomain}}</groupDomain>
          </message>
          {{/message}}
        </messages>
      </xml>

@davidsloan davidsloan force-pushed the feat/http-sink branch 3 times, most recently from 5939a47 to a46b520 Compare December 14, 2023 16:52

class KafkaConnectKeyBinding(name: Option[String]) extends KafkaConnectBaseBinding() {
override def get(sinkRecord: SinkRecord): AnyRef = KafkaConnectExtractor.extractFromKey(sinkRecord, name).leftMap(e =>
throw new ConnectException(s"unable to extract field $name for template, ", e),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a nit pick :)
I would simply pattern match on the result of extractFromKey and throw on the left branch. It's a bit more explicit behavior wise

@davidsloan davidsloan force-pushed the feat/http-sink branch 4 times, most recently from 215bb61 to dea8b29 Compare January 4, 2024 14:39
@davidsloan davidsloan changed the title HTTP Sink WTD HTTP Sink MVP Feb 9, 2024
* Regex templating
* Commit Policies
* Integration testing
* Fix logging
* Error handling
* Add configuration for error threshold and upload sync period
@davidsloan davidsloan marked this pull request as ready for review February 9, 2024 11:27
@@ -246,6 +246,28 @@ lazy val elastic7 = (project in file("kafka-connect-elastic7"))
.configureFunctionalTests()
.enablePlugins(PackPlugin)

lazy val http = (project in file("kafka-connect-http"))
.dependsOn(common)
//.dependsOn(`test-common` % "fun->compile")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the functional tests are to be added later? Do we need that line then?

.configureAssembly(false)
.configureTests(baseTestDeps ++ kafkaConnectHttpTestDeps)
.configureIntegrationTests(baseTestDeps ++ kafkaConnectHttpTestDeps)
//.configureFunctionalTests(kafkaConnectS3FuncTestDeps)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the functional tests are to be added later? Do we need that line then?

import io.lenses.streamreactor.connect.cloud.common.sink.config.FlushSettings.defaultFlushInterval
import io.lenses.streamreactor.connect.cloud.common.sink.config.FlushSettings.defaultFlushSize

object CloudCommitPolicy extends LazyLogging {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to extend LazyLogging here right ?

}.unsafeRunSync()
()
}
_ <- IO.delay(scheduler.scheduleAtFixedRate(runnable, 0, uploadSyncPeriod.toLong, TimeUnit.SECONDS))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's have a look at this on Monday maybe.

Unfortunately, doing it like this means that it will be ever running and cannot be stopped.
Running the test from HttpSinkTaskIT you can see that even tho the test terminates this process keeps running until the jvm is terminated. Ideally we would want to have a "stop" method that could interrupt this scheduling.

@davidsloan davidsloan merged commit e4e11e6 into master Feb 13, 2024
138 checks passed
@davidsloan davidsloan deleted the feat/http-sink branch February 13, 2024 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants