Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client Telemetry: Adds Client Config API support to control the feature #3816

Closed
wants to merge 32 commits into from

Conversation

sourabh1007
Copy link
Contributor

@sourabh1007 sourabh1007 commented Apr 18, 2023

Description

Background

Till now, Client Telemetry feature was controlled by environment variable. It means customer has to set a particular environment variable in order to enable client telemetry on SDK. but ideally, it should be controlled by portal. In portal, there will be a switch to enable/disable client telemetry. And SDK will get this information using a gateway API (similar to database account properties) and act accordingly.

As part of this PR, Gateway API is integrated, and environment variable config is removed. SDK will call gateway API in every 10 minutes to refresh this information.

What is included in this PR?

  1. Removed existing environment variable related to Client Telemetry.
  2. Made sure Client Config API is called during initialization.
  3. Made sure there is a background job running which keep checking the status of client telemetry config every 10min.
  4. Made sure client telemetry job is starting and stopping with the flag without any re-initialization.
  5. There is only one way to enable client telemetry i.e. through portal but it can be disabled by 2 ways:
    a) Instance level: by calling client builder function or setting CosmosClientOptions.EnableClientTelemetry to false.
    b) Machine level : by setting environment variable COSMOS.CLIENT_TELEMETRY_ENABLED to false. As per discussion with team, it is not required we can recommend customer to switch off client telemetry completely.
  6. Fixing references in related projects like Benchmark and CTL.

Public API

Not part of this PR

Design Summary

During initialization of client, starting a Task in DoumentClient which check for Account Client Config and initialize client telemetry job if it is enabled(ref. TelemetryToServiceHelper). This job will dynamically update the clientTelemetryInstance if client telemetry job is running.

Test Coverage

  1. Updated existing ClientTelemetryTest Emulator test, to mock ClientConfig API and always return enabled state of client telemetry
    2. Added a Test in GatewayTests to validate ClientConfig API with gateway end point
    3. Added logic to access ClientConfig API using AAD in CosmosAadTests.cs
  2. Added tests to validate client telemetry is switching off and switching on with API response on refresh. ClientTelemetryConfigurationTest.cs

Type of change

  • [] New feature (non-breaking change which adds functionality)

Closing issues

To automatically close an issue: closes #3427

@sourabh1007 sourabh1007 force-pushed the user/sourabh/clientconfigAPIIntegration branch 2 times, most recently from 99986f5 to 570a79a Compare April 27, 2023 18:57
@sourabh1007 sourabh1007 force-pushed the user/sourabh/clientconfigAPIIntegration branch from 570a79a to d8589f3 Compare May 8, 2023 13:43
@sourabh1007 sourabh1007 marked this pull request as draft May 8, 2023 19:23
@sourabh1007 sourabh1007 force-pushed the user/sourabh/clientconfigAPIIntegration branch 8 times, most recently from fa8bc3a to 31cd5b0 Compare May 11, 2023 18:00
@sourabh1007 sourabh1007 force-pushed the user/sourabh/clientconfigAPIIntegration branch from 31cd5b0 to 28d5218 Compare May 16, 2023 19:05
@sourabh1007 sourabh1007 marked this pull request as ready for review May 16, 2023 19:07
@sourabh1007 sourabh1007 force-pushed the user/sourabh/clientconfigAPIIntegration branch from 28d5218 to 0dc371c Compare May 18, 2023 15:11
Microsoft.Azure.Cosmos/src/DocumentClient.cs Outdated Show resolved Hide resolved
Microsoft.Azure.Cosmos/src/DocumentClient.cs Outdated Show resolved Hide resolved
Microsoft.Azure.Cosmos/src/DocumentClient.cs Outdated Show resolved Hide resolved
Microsoft.Azure.Cosmos/src/DocumentClient.cs Outdated Show resolved Hide resolved
Microsoft.Azure.Cosmos/src/Handler/TelemetryHandler.cs Outdated Show resolved Hide resolved
Microsoft.Azure.Cosmos/src/Handler/TelemetryHandler.cs Outdated Show resolved Hide resolved
Microsoft.Azure.Cosmos/src/Handler/TelemetryHandler.cs Outdated Show resolved Hide resolved
Microsoft.Azure.Cosmos/src/IDocumentClient.cs Outdated Show resolved Hide resolved
@sourabh1007 sourabh1007 force-pushed the user/sourabh/clientconfigAPIIntegration branch from 1c92a2f to 0b8c275 Compare August 7, 2023 01:10
@kirankumarkolli
Copy link
Member

kirankumarkolli commented Aug 7, 2023

Summary of offline sync-up:

Functional:

  • Post disable detection: near time (possible some delay) stop collection
  • The workload characteristics to be uniform: Spiky exceptions (possibel high CPU)
    • Few exception: Is okey
    • Exceptions of order of requests: Is not okey (10K rps)

C# model:

  • Ideal: Not to have pattern accessing Disposed Object

GoodToHave:

  • [Dispose] ClientTelemetry object ASAP (assuming limited instances or single instance)

Option1:
- RealCollector & NoOpCollector
- this.clientTelemetry: Keep switching between them?
- Inflight ones: using any will get drained out

@sourabh1007 sourabh1007 force-pushed the user/sourabh/clientconfigAPIIntegration branch from a48b214 to 241db02 Compare August 8, 2023 14:05
@sourabh1007 sourabh1007 added Do Not Review Marks a PR in "work in progress" state. and removed auto-merge Enables automation to merge PRs labels Aug 14, 2023
@sourabh1007
Copy link
Contributor Author

As @kirankumarkolli suggested splitting this PR into small PRs:

  1. Collector abstraction (it seems this is independent separate change) right?
  2. Making Telemetry dynamically turn ON/OFF (handling the concurrency/dispose aspects) - Default OFF.
  3. Integrating the Account config (With default still OFF, i.e. customer opt-in)

@sourabh1007
Copy link
Contributor Author

CLosing this PR as it is covered as part of attached PRs

@sourabh1007 sourabh1007 closed this Sep 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Do Not Review Marks a PR in "work in progress" state. Telemetry
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Cosmos account: Telemetry context (endpoint, enabled/disabled)
5 participants