High cpu usage on clickhouse pod #381

Open
sati-max opened this issue Jan 17, 2024 · 9 comments

Comments

@sati-max

Hi

We are running SigNoz in a k8s environment and we see high CPU utilization on the clickhouse pod that doesn't correspond to any increase in the data being sent to SigNoz. The data flow is fairly constant during the periods when clickhouse CPU is very high. The pod can reach 100% CPU; the k8s worker has 8 vCPUs, and because of the clickhouse pod the whole worker also goes up to 100% of available CPU.

How can I debug what is "eating" that CPU? Is it a query from a dashboard, something being processed in the background by clickhouse, or a rogue query that can't finish and doesn't get properly terminated? This is the output of the top command from the clickhouse pod:
image

We are using the default values from the values.yaml file you provide.

Or should I direct this issue to the ClickHouse developers?
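
One possible way to narrow down the three suspects named above (dashboard queries, background work, a stuck query) is to look at ClickHouse's own system tables. The sketch below only builds the SQL text and a `clickhouse-client` command line; it does not connect to anything. The function and constant names are made up for illustration, and the column names (`normalized_query_hash`, the `ProfileEvents` map) are assumed to match the server version in use, so they should be checked against it before relying on the results.

```python
"""Build diagnostic SQL for a busy ClickHouse server (illustrative sketch)."""

# Queries executing right now, longest-running first.
# A query with a very large `elapsed` is a candidate "rogue" query.
RUNNING_QUERIES_SQL = """
SELECT query_id, user, elapsed, read_rows, memory_usage, query
FROM system.processes
ORDER BY elapsed DESC
""".strip()

# Background merges currently in progress.
ACTIVE_MERGES_SQL = """
SELECT database, table, elapsed, progress, num_parts
FROM system.merges
ORDER BY elapsed DESC
""".strip()


def top_cpu_queries_sql(hours: int = 1, limit: int = 20) -> str:
    """Return SQL ranking finished queries by CPU time over the last `hours`.

    Assumption: system.query_log is enabled and exposes ProfileEvents as a Map.
    """
    hours, limit = int(hours), int(limit)  # keep only integers in the SQL text
    return f"""
SELECT
    normalized_query_hash,
    any(query) AS sample_query,
    count() AS runs,
    sum(ProfileEvents['UserTimeMicroseconds']
        + ProfileEvents['SystemTimeMicroseconds']) / 1e6 AS cpu_seconds
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time > now() - INTERVAL {hours} HOUR
GROUP BY normalized_query_hash
ORDER BY cpu_seconds DESC
LIMIT {limit}
""".strip()


def client_argv(sql: str) -> list[str]:
    """Return an argument list that runs `sql` through clickhouse-client."""
    return ["clickhouse-client", "--query", sql]
```

The list returned by `client_argv` could be handed to `subprocess.run` from inside the clickhouse pod, or the SQL could simply be pasted into an interactive `clickhouse-client` session. If the top entries turn out to be dashboard queries, that points at the panels; if `system.merges` is busy while `system.processes` is quiet, background work is the more likely consumer.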

@srikanthccv
Member

How many dashboards and alerts do you have configured?

@sati-max
Author

13 dashboards (5, 6, 4, 4, 4, 5, 5, 5, 14, 14, 14, 13 individual charts)
0 alerts

@srikanthccv
Member

Can you share details about the ingestion volume?

@sati-max
Author

sati-max commented Feb 3, 2024

Yes. Can you tell me how to check it, so that the numbers are accurate rather than just my guess?

@srikanthccv
Member

@sati-max
Author

sati-max commented Feb 6, 2024

I imported the dashboard. What data should I paste here?

@srikanthccv
Member

Share the accepted spans/metrics/logs rate details and the CPU usage of ClickHouse.

@sati-max
Author

Hi

1st SigNoz setup (no spans)

CPU
image

Metrics
image

Logs
image

Exporter DB write latency
image
image

Exporter DB writes/s
image
image

2nd SigNoz setup (no spans)

CPU
image

Metrics
image

Logs
image

Exporter DB write latency
image
image

Exporter DB writes/s
image
image

I don't see an option to get clickhouse CPU usage from this dashboard, so I added Exporter DB write latency and Exporter DB writes/s instead. These are the options in $component:
image

Thank you, have a good day

@sati-max
Author

sati-max commented Mar 8, 2024

Hi

Any update on this issue?

Additional info:
We also noticed that around 1AM our HV shows a drastic drop in CPU usage, which then grows gradually until it drops again around 1AM the next day.

image

Looking at the clickhouse logs, there is nothing indicating that any process is being restarted; there are only messages about dropping empty parts:

One clickhouse pod:
2024.03.07 23:59:44.294105 [ 621 ] {} <Information> signoz_metrics.samples_v2 (5524edce-ebfa-4ce2-8a5b-2024fe7384d2): Will drop empty part 20240206_16711965_16763541_528
2024-03-08T01:00:00.837317172+01:00 2024.03.08 00:00:00.837215 [ 623 ] {} <Information> signoz_metrics.time_series_v4_1day (42aa0288-cee9-4ab8-b539-983eae65956b): Will drop empty part 20240207_1235706_1246713_848
2024-03-08T01:00:02.255293820+01:00 2024.03.08 00:00:02.255198 [ 622 ] {} <Information> system.session_log (504a6b73-b33d-43ee-b7bb-6d18a7fb5b1f): Will drop empty part 20240207_59628_59725_18
2024-03-08T01:00:02.410201796+01:00 2024.03.08 00:00:02.410120 [ 623 ] {} <Information> system.trace_log (fd303ced-2c16-4a40-a7c3-3bc322f1a7ba): Will drop empty part 20240301_134614_146127_116
2024-03-08T01:13:48.462419143+01:00 2024.03.08 00:13:48.459673 [ 621 ] {} <Information> system.zookeeper_log_0 (6fdf0680-73d1-43ba-80e6-c67aa22b3ffe): Will drop empty part 20240207_79658_79729_19
2024-03-08T01:14:51.143096264+01:00 2024.03.08 00:14:51.142997 [ 620 ] {} <Information> system.asynchronous_metric_log_2 (e90b1311-448a-4247-880e-e6ebcff51e4e): Will drop empty part 20240207_106338_106418_24
2024.03.08 00:17:16.594846 [ 620 ] {} <Information> system.metric_log_3 (54a0f04c-55c9-4888-bb2b-4df991dabcc1): Will drop empty part 20240207_106007_106100_24
2024-03-08T01:18:59.543631770+01:00 2024.03.08 00:18:59.543536 [ 605 ] {} <Information> system.query_log_2 (22821b2a-0dd6-4380-be94-ad1141c9fe96): Will drop empty part 20240207_254700_254761_5
2024-03-08T01:19:16.422328170+01:00 2024.03.08 00:19:16.422233 [ 624 ] {} <Information> system.part_log_2 (df835ac4-1326-4f2e-94ca-f4a9e9226303): Will drop empty part 20240207_256730_256817_18

Second clickhouse pod (different signoz)

2024-03-08T00:59:52.437603466+01:00 2024.03.07 23:59:52.437502 [ 624 ] {} <Information> signoz_metrics.samples_v2 (f23e59e9-9a75-49cb-81e2-4a63b74ccc86): Will drop empty part 20240107_5078376_5192579_1474
2024-03-08T00:59:52.722735845+01:00 2024.03.07 23:59:52.722630 [ 624 ] {} <Information> signoz_metrics.samples_v4 (84e67471-6a73-4c25-b1d3-05755a205559): Will drop empty part 20240206_1274890_1385727_1040
2024-03-08T01:00:01.742916974+01:00 2024.03.08 00:00:01.742823 [ 623 ] {} <Information> signoz_metrics.time_series_v4_1day (367054a7-c948-4954-b45c-0fce07e5a765): Will drop empty part 20240207_1385665_1496389_22689
2024-03-08T01:00:02.162485336+01:00 2024.03.08 00:00:02.162389 [ 621 ] {} <Information> system.trace_log (05b3f27f-dc0c-4edb-b857-c485efa6e666): Will drop empty part 20240301_408281_419793_63
2024-03-08T01:00:02.575463472+01:00 2024.03.08 00:00:02.575374 [ 623 ] {} <Information> system.query_views_log (8c181ce3-a41c-4510-8149-7fef6c5d7c00): Will drop empty part 20240222_315678_327145_649
2024-03-08T01:00:02.772456957+01:00 2024.03.08 00:00:02.772357 [ 621 ] {} <Information> system.session_log (2598f85e-8191-4879-91db-b313b386180f): Will drop empty part 20240207_80155_88510_5196
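
To see which tables these messages concentrate on and at what hour, the pasted lines can be tallied with a small script. This is a minimal sketch: the regular expression is written against the line shape shown above (an optional Kubernetes timestamp prefix followed by the ClickHouse timestamp), and the function names are illustrative only. Note that in the pasted lines the ClickHouse timestamp is one hour behind the `+01:00` prefix, so 1AM local time lands at 00:00 in the server's clock; that is an observation from the logs, not a confirmed cause of the CPU pattern.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the ClickHouse portion of a line; any Kubernetes RFC 3339
# prefix in front of it is skipped because we use search(), not match().
LINE_RE = re.compile(
    r"(?P<ts>\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"\[ *\d+ *\] \{[^}]*\} <Information> "
    r"(?P<table>[\w.]+) \([0-9a-f-]+\): "
    r"Will drop empty part (?P<part>\S+)"
)


def parse_drop_events(lines):
    """Return (timestamp, table, part) tuples for 'Will drop empty part' lines."""
    events = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a drop-empty-part message
        ts = datetime.strptime(m["ts"], "%Y.%m.%d %H:%M:%S.%f")
        events.append((ts, m["table"], m["part"]))
    return events


def drops_per_table(events):
    """Count events per fully qualified table name."""
    return Counter(table for _, table, _ in events)


def drops_per_hour(events):
    """Count events per hour, using the ClickHouse-side timestamp."""
    return Counter(ts.strftime("%Y-%m-%d %H:00") for ts, _, _ in events)
```

Feeding it the full pod log (for example, the saved output of `kubectl logs`) rather than a short excerpt would show whether the drops really cluster around the server's midnight or are spread through the day.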

I don't see anything in the crontabs of the clickhouse pod (/etc/crontabs, /etc/periodic).

The clickhouse pod also isn't being restarted by kubernetes.

Cheers
