HPCC-32734 Add ElasticSearch metric sink configuration #19247
base: master
Added configuration options for the ElasticSearch sink.
Updated metrics readme with information on adding the sink to the cluster.
Added example ElasticSearch configuration yaml file.
Signed-Off-By: Kenneth Rowland [email protected]
Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32734
@kenrowland looks good, but I do have some comments for you to ponder. They're all meant to be constructive; let me know if I should clarify any points. Thanks.
data types to be stored in the index in their native types. Without dynamic mapping, the ElasticSearch
default mapping does not properly map value to unsigned 64-bit integers.

To create an index with dynamic mapping, use the following object when creating the index:
this write up seems very useful.
Is the goal still to handle the creation of the indices programmatically in later versions of the sink?
Currently there are no plans to create the index from the sink. If we added that, several considerations would have to be taken into account:
- If the cluster is using the same index for all metric reporting, creating and configuring the index must be coordinated somehow. If each component creates its own, there still needs to be coordination.
- The sink would also need to add mapping templates in order to map values to the correct type. This could still use wildcard dynamic templates, or could be specific to the metrics registered for each metric framework instance.
- If the sink creates the index, we need to worry about life cycle management.
I felt like just telling the sink what index to use allows the policy of the ElasticSearch instance to remain there instead of pushing it down to the sink.
If we decide to push that responsibility to the sink, we can add it as an improvement addressing the concerns from above
helm/examples/metrics/README.md
Outdated
<Environment>
  <Software>
    <metrics name="mymetricsconfig">
      <sinks name="elastic" type="elastic">
Trivial, but changing the name to 'myelasticsink' would drive the point home further and would follow the convention from the metrics name attribute above.
Sounds reasonable. I'll change that.
helm/examples/metrics/README.md
Outdated
  <Software>
    <metrics name="mymetricsconfig">
      <sinks name="elastic" type="elastic">
        <settings period="30"
The challenge of adding the configuration before the logic is a higher potential for downstream restructuring of the configuration, which can be very disruptive to end users.
It's obviously hard to know what we don't yet know we'll need to support in later versions, but let's scrutinize each element of your config layout more than usual.
Take indexName, for instance: if we're planning on expanding into index creation, we might need a pattern rather than a name. With that in mind, it might be worth supporting a structured element for the index. For now the only active attribute of the index element could be "name", with more attributes added under the index element later.
Index config changed accordingly to allow sub items.
# type - sink type (must be elastic for ElasticSearch support)
# name - name for the sink instance
# settings.elasticHost - url for the ElasticSearch instance
# settings.indexName - ElasticSearch index name where metrics are indexed
to illustrate what I meant above, this could be settings.index.name
Reasonable. Change made. It also would allow keeping the index configuration separate in case we decide to push more policy into the sink.
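As a rough illustration of the structured layout discussed here, the sketch below resolves the index name from a nested `index/@name` key and falls back to the flat `@indexName` key. It is only a sketch under stated assumptions: the `Settings` map is a stand-in for the real settings tree (it is not the `IPropertyTree` API), and both the key spelling `index/@name` and the fallback behavior are hypothetical, not something this PR is known to implement.

```cpp
#include <map>
#include <string>

// Stand-in for a settings tree: a flat map from a path-like key to a value.
// This is NOT the IPropertyTree API; it only mimics attribute lookup.
using Settings = std::map<std::string, std::string>;

// Return the value stored under key, or an empty string when it is absent.
std::string lookup(const Settings &settings, const std::string &key)
{
    auto it = settings.find(key);
    return it == settings.end() ? std::string() : it->second;
}

// Prefer the structured "index/@name" key (hypothetical spelling) and fall
// back to the flat "@indexName" key (hypothetical compatibility behavior).
// Returns an empty string when neither key is configured.
std::string resolveIndexName(const Settings &settings)
{
    std::string name = lookup(settings, "index/@name");
    if (name.empty())
        name = lookup(settings, "@indexName");
    return name;
}
```

Keeping the lookup in one function means further index attributes could later be read next to the name without touching callers.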
@@ -45,11 +45,26 @@ ElasticMetricSink::ElasticMetricSink(const char *name, const IPropertyTree *pSet
    ignoreZeroMetrics = pSettingsTree->getPropBool("@ignoreZeroMetrics", true);
    pSettingsTree->getProp("@elasticHost", elasticHost);
    pSettingsTree->getProp("@indexName", indexName);

    // Initialize standard suffixes
    countMetricSuffix.append("_count");
is there an advantage to use append rather than set? (I'm obviously assuming these suffix strings are empty before line 50)
Yes they are empty. Use of append is based on code review comments from a previous PR from others on the team.
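To make the append-versus-set question concrete, here is a minimal sketch using `std::string` as a stand-in for the sink's string buffer type. That substitution is an assumption: the PR uses its own buffer class, and this sketch only assumes its append and set operations behave like `std::string::append` and `std::string::assign`. The helper names `withAppend` and `withAssign` are hypothetical.

```cpp
#include <string>

// Append the suffix to whatever the buffer already holds.
std::string withAppend(std::string buffer, const std::string &suffix)
{
    buffer.append(suffix);
    return buffer;
}

// Replace the buffer contents with the suffix.
std::string withAssign(std::string buffer, const std::string &suffix)
{
    buffer.assign(suffix);
    return buffer;
}
```

With an empty buffer both helpers return the bare suffix, so the two calls are interchangeable in the constructor above. They differ only when the buffer already has content: append extends it, while assign discards it.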
- type: elastic
  name: elasticsink
  settings:
    elasticHost: http://localhost:9200,
Perhaps too picky, but "Host" implies the DNS entry only, with no protocol, port, etc.
For this I wanted to provide the full string to which the specific endpoint is appended. This makes it a bit easier than having to read multiple configuration values just to put them together internally. Any individual part can be changed in the string as easily as it could be in a separate config value.
However, thinking ahead, I can see where something like the port or other portion of the route could be a cluster wide parameter. For that reason, separating them into separate values makes sense.
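If the endpoint were split into separate values as suggested, assembling the base URL could look roughly like the sketch below. The `ElasticEndpoint` struct, its field names, and `buildBaseUrl` are all hypothetical names chosen for illustration; the PR does not define them, and the actual configuration keys may differ.

```cpp
#include <string>

// Hypothetical split of the endpoint into separately configurable parts.
struct ElasticEndpoint
{
    std::string protocol; // for example "http" or "https"
    std::string host;     // DNS name or address only
    unsigned port;        // 0 means "omit the port from the URL"
};

// Join the parts into the base URL to which a specific route is appended.
std::string buildBaseUrl(const ElasticEndpoint &endpoint)
{
    std::string url = endpoint.protocol + "://" + endpoint.host;
    if (endpoint.port != 0)
        url += ":" + std::to_string(endpoint.port);
    return url;
}
```

Separate fields let a cluster-wide default (the port, say) be overridden independently of the host, at the cost of reading several values instead of one string.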
Please see comments.