-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] enhancement(sinks clickhouse): support native protocol #14787
Conversation
… into feature-native-ch-proto
✅ Deploy Preview for vector-project ready!
To edit notification comments on pull requests, go to your Netlify site settings. |
…feature-native-ch-proto
…feature-native-ch-proto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey thanks for the contribution!!
I left some initial comments below but it looks good overall.
In addition to the below comments, further steps for this PR would be:
- Adding an integration test
- Updating the cue docs in
website/cue/reference/components/sinks/clickhouse.cue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution, @caibirdme The basic form of this looks sound. Props on using nom
to parse the type descriptions and the unit tests on that parsing. This will need to be run through fmt
and clippy
to pass our checks, and I left a number of comments below to possibly tighten up the implementation.
✅ Deploy Preview for vrl-playground canceled.
|
Soak Test ResultsBaseline: 9a7c16a ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%: Fine details of change detection per experiment.
|
Soak Test ResultsBaseline: 9a7c16a ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%: Fine details of change detection per experiment.
|
Please let us know when this is ready to re-review. This appears to be failing the clippy checks. You can run these in your dev environment with |
I'll fix all the style issues few in a few hours and comment here, then you can continue re-review |
Soak Test ResultsBaseline: 6cafa53 ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%: Fine details of change detection per experiment.
|
Soak Test ResultsBaseline: 6cafa53 ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%: Fine details of change detection per experiment.
|
Soak Test ResultsBaseline: cb996f3 ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%: Fine details of change detection per experiment.
|
…feature-native-ch-proto
…feature-native-ch-proto
@bruceg I fixed all the issues from your previous review, please re-review now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is coming along nicely! I have a couple follow-up comments below. Also:
-
It looks like there is a cue formatting error, probably related to whitespace. You can check this locally with
make check-docs
. What I sometimes do is copy a line in the same file I haven't modified, which has the same level of indentation I need, and edit the text of that line 🤷♂️ I can take a look at that if needed. -
I think there is enough differences in these changes from the HTTP clickhouse sink, to warrant adding some integration test cases for it. We already have integration test module and docker-compose file for the HTTP clickhouse sink that this would extend. From the looks of the docker-compose file for that, I believe you could just copy-past the
src/sinks/clickhouse/integration_tests.rs
file intosrc/sinks/clickhouse/native/
, and replace the file's contents some native-specific integration tests there. The most basic being something like theinsert_events()
test case, but for the native protocol, which would verify the "happy path".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code is looking pretty good. I have a few more suggestions, but none of the code questions are correctness issues. Note also that the docs are failing the cue format tool:
https://github.com/vectordotdev/vector/actions/runs/3349954520/jobs/5550414838#step:14:30
@@ -236,13 +236,16 @@ bollard = { version = "0.13.0", default-features = false, features = ["ssl", "ch | |||
bytes = { version = "1.2.1", default-features = false, features = ["serde"] } | |||
bytesize = { version = "1.1.0", default-features = false } | |||
chrono = { version = "0.4.22", default-features = false, features = ["serde"] } | |||
chrono-tz = { version = "0.6", default-features = false, features = ["serde"], optional = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to avoid chrono-tz
version 0.7 that we are using elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is due to the issue in clickhouse-rs
: suharev7/clickhouse-rs#158
I suggest for now leaving it as is before fixing it in the upstream. Then, we will be able to (hopefully easily) update the chrono-tz
library. According to the linked issue, I do not know, how easily fix the current code to work with the latest chrono-tz
(and I support clickhouse-rs
does not work with it well anyway).
@caibirdme Do you plan to continue work on the PR? Would be really awesome to get this PR merged :) |
@zamazan4ik Yeah, but I'm too busy these days, always get off work at 23:00... I'll refocus on this pr maybe next week |
@caibirdme Did you able to find free time for the PR? :) |
@neuronull @bruceg @jszwedko I guess I could help finish this PR and resolve the left issues. What do you think about the idea? |
👍 that seems fine since it seems like @caibirdme may not have the time to push this forward now. I'd suggest creating a new PR based on this branch to address the feedback. |
Cue formatting fixed. |
Closing in-lieu of #15660 |
Since this pr is quite big, so I make this pr in advance, and want to get some suggestions from you.
example: