HPCC-32734 Add ElasticSearch metric sink configuration #19247

kenrowland · 2024-10-29T18:07:07Z

- Added configuration options for the ElasticSearch sink
- Updated the metrics README with information on adding the sink to the cluster
- Added an example ElasticSearch configuration YAML file

Signed-Off-By: Kenneth Rowland [email protected]

Type of change:

This change is a bug fix (non-breaking change which fixes an issue).
This change is a new feature (non-breaking change which adds functionality).
This change improves the code (refactor or other change that does not change the functionality)
This change fixes warnings (the fix does not alter the functionality or the generated code)
This change is a breaking change (fix or feature that will cause existing behavior to change).
This change alters the query API (existing queries will have to be recompiled)

Checklist:

Smoketest:

Send notifications about my Pull Request position in Smoketest queue.
Test my draft Pull Request.

Testing:


github-actions · 2024-10-29T19:39:56Z

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32734

Jirabot Action Result:
Workflow Transition To: Merge Pending
Updated PR

rpastrana

@kenrowland looks good, but I do have some comments for you to ponder. They're all meant to be constructive; let me know if I should clarify any points. Thanks.

rpastrana · 2024-10-29T21:28:43Z

helm/examples/metrics/README.md

+data types to be stored in the index in their native types. Without dynamic mapping, the ElasticSearch
+default mapping does not properly map value to unsigned 64-bit integers. 
+
+To create an index with dynamic mapping, use the following object when creating the index:


This write-up seems very useful.
Is the goal still to handle the creation of the indices programmatically in later versions of the sink?

kenrowland · 2024-10-30T13:34:36Z

Currently there are no plans to create the index from the sink. If we added that, several considerations would have to be taken into account:

- If the cluster uses the same index for all metric reporting, creating and configuring the index must be coordinated somehow. If each component creates its own, there still needs to be coordination.
- The sink would also need to add mapping templates in order to map values to the correct type. These could still use wildcard dynamic templates, or could be specific to the metrics registered for each metric framework instance.
- If the sink creates the index, we need to worry about life cycle management.

I felt that just telling the sink which index to use allows the policy of the ElasticSearch instance to remain there instead of being pushed down to the sink.

If we decide to push that responsibility to the sink, we can add it as an improvement that addresses the concerns above.
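
To make the "wildcard dynamic templates" option concrete, here is a minimal sketch of a helper that builds a create-index request body in which every field whose name ends in a given suffix is mapped to Elasticsearch's `unsigned_long` type. This is not the object from the README (the excerpt above does not show it); the function name `buildIndexBody` and the template naming scheme are assumptions made for illustration, and the suffixes are assumed to contain no characters that need JSON escaping.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds a create-index body with one dynamic template
// per metric-name suffix. Each template matches field names by wildcard
// ("*<suffix>") and maps them to the unsigned_long field type.
// Assumes the suffixes need no JSON escaping.
std::string buildIndexBody(const std::vector<std::string> &suffixes)
{
    std::string body = "{\"mappings\":{\"dynamic_templates\":[";
    for (std::size_t i = 0; i < suffixes.size(); ++i)
    {
        if (i > 0)
            body += ",";
        // Template name is illustrative only, e.g. "metric_count"
        body += "{\"metric" + suffixes[i] + "\":{";
        body += "\"match\":\"*" + suffixes[i] + "\",";
        body += "\"mapping\":{\"type\":\"unsigned_long\"}}}";
    }
    body += "]}}";
    return body;
}
```

Calling `buildIndexBody({"_count"})` yields a body with a single template matching `*_count`. If the sink ever took over index creation, something along these lines would have to be coordinated across components, which is the first concern listed above.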

rpastrana · 2024-10-29T21:32:21Z

helm/examples/metrics/README.md

+<Environment>
+    <Software>
+        <metrics name="mymetricsconfig">
+            <sinks name="elastic" type="elastic">


Trivial, but changing the name to 'myelasticsink' would drive the point home and would follow the convention from the metrics name attribute above.

kenrowland · 2024-10-30T13:40:44Z

Sounds reasonable. I'll change that.

rpastrana · 2024-10-30T12:54:10Z

helm/examples/metrics/README.md

+    <Software>
+        <metrics name="mymetricsconfig">
+            <sinks name="elastic" type="elastic">
+                <settings period="30" 


The challenge of adding the configuration before the logic is a higher potential for downstream restructuring of the configuration, which can be very disruptive to end users.

It's obviously hard to know what we don't yet know we'll need to support in later versions, but let's scrutinize each element of your config layout more than usual.
For instance, take indexName: if we're planning on expanding index creation, we might need a pattern rather than a name. With that in mind, it might be worth supporting a structured element for the index. For now the only active attribute of the index element could be "name", with more attributes added under the index element later.

kenrowland

Index config changed accordingly to allow sub-items.

rpastrana · 2024-10-30T12:55:30Z

helm/examples/metrics/elasticsearch_metrics.yaml

+#   type                                - sink type (must be elastic for ElasticSearch support)
+#   name                                - name for the sink instance
+#   settings.elasticHost                - url for the ElasticSearch instance
+#   settings.indexName                  - ElasticSearch index name where metrics are indexed


To illustrate what I meant above, this could be settings.index.name.

kenrowland · 2024-10-31T19:40:05Z

Reasonable. Change made. It would also allow keeping the index configuration separate in case we decide to push more policy into the sink.
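
A minimal sketch of how a structured index element might be read is shown below. It does not use the sink's real `IPropertyTree` settings; a flat `std::map` keyed by dotted path stands in for them, and the names `Settings`, `IndexConfig`, and `readIndexConfig` are hypothetical. The point is only that a dedicated `IndexConfig` struct gives later attributes (a pattern, for example) a place to live without renaming `index.name`.

```cpp
#include <map>
#include <string>

// Stand-in for the sink settings: a flat map keyed by dotted path.
// The real sink reads an IPropertyTree; this is an illustration only.
using Settings = std::map<std::string, std::string>;

// Hypothetical structured index configuration. Only "name" is active;
// further attributes could be added here later without moving "name".
struct IndexConfig
{
    std::string name;
};

// Reads settings.index.name. Returns false when the value is missing or
// empty, so the caller can reject the configuration.
bool readIndexConfig(const Settings &settings, IndexConfig &out)
{
    auto it = settings.find("index.name");
    if (it == settings.end() || it->second.empty())
        return false;
    out.name = it->second;
    return true;
}
```

With this layout, a configuration that still uses the flat `indexName` key is not picked up, so a rename like this is cheapest to make before users depend on the old key.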

rpastrana · 2024-10-30T12:58:25Z

system/metrics/sinks/elastic/elasticSink.cpp

@@ -45,11 +45,26 @@ ElasticMetricSink::ElasticMetricSink(const char *name, const IPropertyTree *pSet
    ignoreZeroMetrics = pSettingsTree->getPropBool("@ignoreZeroMetrics", true);
    pSettingsTree->getProp("@elasticHost", elasticHost);
    pSettingsTree->getProp("@indexName", indexName);
+
+    // Initialize standard suffixes
+    countMetricSuffix.append("_count");


Is there an advantage to using append rather than set? (I'm obviously assuming these suffix strings are empty before line 50.)

kenrowland · 2024-10-30T13:49:35Z

Yes, they are empty. The use of append is based on code review comments from others on the team on a previous PR.
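
The equivalence being discussed can be illustrated with `std::string` as a stand-in for jlib's `StringBuffer` (an assumption; the two classes are not identical, and `viaAppend` and `viaAssign` are hypothetical helper names). On an empty buffer, appending and assigning give the same result; they differ only when the buffer already holds data.

```cpp
#include <string>

// std::string stands in for jlib's StringBuffer (assumption for illustration).

// Appends the suffix to whatever the buffer already contains.
std::string viaAppend(std::string buffer, const char *suffix)
{
    buffer.append(suffix);
    return buffer;
}

// Replaces the buffer contents with the suffix.
std::string viaAssign(std::string buffer, const char *suffix)
{
    buffer.assign(suffix);
    return buffer;
}
```

For an empty starting buffer both helpers return `"_count"`. Starting from `"x"`, `viaAppend` returns `"x_count"` while `viaAssign` returns `"_count"`, which is why the choice only matters if the suffix members could be non-empty when the constructor runs.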

rpastrana · 2024-10-30T13:00:00Z

helm/examples/metrics/elasticsearch_metrics.yaml

+    - type: elastic
+      name: elasticsink
+      settings:
+        elasticHost: http://localhost:9200,


Perhaps too picky, but "Host" implies the DNS entry only: no protocol, port, etc.

kenrowland · 2024-10-30T13:47:46Z

For this I wanted to provide the full string to which the specific endpoint is appended. This makes it a bit easier than having to read multiple configuration values just to put them together internally. Any individual part can be changed in the string as easily as it could be in a separate config value.

However, thinking ahead, I can see where something like the port or another portion of the route could be a cluster-wide parameter. For that reason, splitting them into separate values makes sense.
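
As a sketch of what splitting the value could look like, the struct below holds protocol, domain, and port separately and rebuilds the base URL on demand. The names `ElasticEndpoint` and `buildBaseUrl`, the field names, and the defaults are all hypothetical; the PR excerpt does not show the final layout.

```cpp
#include <string>

// Hypothetical split of the single elasticHost string into separate parts.
// Field names and defaults are illustrative and not taken from the PR.
struct ElasticEndpoint
{
    std::string protocol = "http"; // could come from a cluster-wide setting
    std::string domain;            // DNS name or IP address only
    unsigned port = 9200;          // could come from a cluster-wide setting
};

// Rebuilds the base URL to which the sink appends a specific endpoint path.
std::string buildBaseUrl(const ElasticEndpoint &ep)
{
    return ep.protocol + "://" + ep.domain + ":" + std::to_string(ep.port);
}
```

With `domain` set to `localhost` and the other fields left at their defaults, `buildBaseUrl` returns `http://localhost:9200`, the same string used in the example YAML, while the port or protocol can be overridden independently.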

kenrowland

Please see comments.

HPCC-32734 Add ElasticSearch metric sink configuration

015cf20

kenrowland requested a review from rpastrana October 29, 2024 19:23

rpastrana reviewed Oct 30, 2024

Addressed review comments

885790b

kenrowland commented Oct 31, 2024

kenrowland requested a review from rpastrana October 31, 2024 19:40
