fix: add settings to allow distributed_product_mode for trace panel #6517

Merged: nityanandagohain merged 4 commits into develop from fix/issue_6515 on Nov 22, 2024

Member

@nityanandagohain commented on Nov 22, 2024

Temporary fix for #6515


Important

Adds settings distributed_product_mode='allow' to SQL queries in buildTracesQuery() to handle distributed product mode errors and updates tests accordingly.

  • Behavior:
    • Adds settings distributed_product_mode='allow' to SQL query in buildTracesQuery() in query_builder.go to prevent distributed product mode errors.
  • Tests:
    • Updates expected SQL query strings in Test_buildTracesQuery in query_builder_test.go to include settings distributed_product_mode='allow'.

This description was created by Ellipsis for 60d663e. It will automatically update as commits are pushed.
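As a rough illustration of the change described above, here is a minimal Go sketch of appending the settings clause to a finished query string. The helper name `withDistributedProductAllow` is hypothetical and this is not the actual code in buildTracesQuery(); only the clause text is taken from the description.

```go
package main

import "strings"

// allowDistributedProduct is the settings clause quoted in the PR description.
const allowDistributedProduct = "settings distributed_product_mode='allow'"

// withDistributedProductAllow is a hypothetical helper, not the SigNoz
// implementation. It appends the settings clause to a finished SELECT
// statement unless the setting is already mentioned in the query.
func withDistributedProductAllow(query string) string {
	// Drop surrounding whitespace and any trailing semicolon so the
	// clause can be appended cleanly.
	trimmed := strings.TrimRight(strings.TrimSpace(query), ";")
	if strings.Contains(strings.ToLower(trimmed), "distributed_product_mode") {
		return trimmed
	}
	return trimmed + " " + allowDistributedProduct
}
```

Calling `withDistributedProductAllow("SELECT 1")` would yield the query followed by the clause. Checking for an existing clause keeps the helper idempotent, which matters if more than one code path decorates the same query.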

@github-actions bot added the bug label ("Something isn't working") on Nov 22, 2024
Contributor

@ellipsis-dev ellipsis-dev bot left a comment


❌ Changes requested. Reviewed everything up to 119d217 in 47 seconds

More details
  • Looked at 41 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. pkg/query-service/app/traces/v4/query_builder.go:259
  • Draft comment:
    Ensure that the distributed_product_mode='allow' setting is consistently applied across all relevant queries to avoid potential inconsistencies or errors. This setting is added here as a temporary fix for a specific issue.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The addition of the setting distributed_product_mode='allow' is a temporary fix for a specific issue. However, it is important to ensure that this setting is applied consistently across all relevant queries to avoid potential inconsistencies or errors.

Workflow ID: wflow_g46GYFmfegDzdtKQ



pkg/query-service/app/traces/v4/query_builder.go (outdated review thread, resolved)
Contributor

@ellipsis-dev ellipsis-dev bot left a comment


👍 Looks good to me! Incremental review on 8eab874 in 28 seconds

More details
  • Looked at 13 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. pkg/query-service/app/traces/v3/query_builder_test.go:1172
  • Draft comment:
    The ExpectedQuery for the test case "Test Noop trace view" should include settings distributed_product_mode='allow' to reflect the changes made in the PR. Ensure that all relevant test cases are updated accordingly.
  • Reason this comment was not posted:
    Comment did not seem useful.

Workflow ID: wflow_AMedHWBAr5wxaeaz



@srikanthccv
Member

srikanthccv commented Nov 22, 2024

Please help me with the complete query log info for one query. Run the prepared query and share all the query_log entries associated with the query_id on all shards. You can use saasmonitor.

@nityanandagohain
Member Author

I ran it on shard 2, and it is present only in the shard 2 query log.

SELECT
    subQuery.serviceName,
    subQuery.name,
    count() AS span_count,
    subQuery.durationNano,
    subQuery.traceID AS traceID
FROM signoz_traces.distributed_signoz_index_v3
INNER JOIN
(
    SELECT *
    FROM
    (
        SELECT
            traceID,
            durationNano,
            serviceName,
            name
        FROM signoz_traces.signoz_index_v3
        WHERE (parentSpanID = '') AND ((timestamp >= '1732281648000000000') AND (timestamp <= '1732287658000000000')) AND ((ts_bucket_start >= 1732281648) AND (ts_bucket_start <= 1732287658)) AND (resource_fingerprint GLOBAL IN (
            SELECT fingerprint
            FROM signoz_traces.distributed_traces_v3_resource
            WHERE (seen_at_ts_bucket_start >= 1732026658) AND (seen_at_ts_bucket_start <= 1732287658) AND (simpleJSONExtractString(labels, 'service.name') IN ['nginx-ui']) AND (labels LIKE '%"service.name":"nginx-ui"%')
        ))
        ORDER BY durationNano DESC
        LIMIT 1 BY traceID
        LIMIT 10
    ) AS inner_subquery
) AS subQuery ON signoz_traces.distributed_signoz_index_v3.traceID = subQuery.traceID
WHERE ((timestamp >= '1732281648000000000') AND (timestamp <= '1732287658000000000')) AND ((ts_bucket_start >= 1732281648) AND (ts_bucket_start <= 1732287658))
GROUP BY
    subQuery.traceID,
    subQuery.durationNano,
    subQuery.name,
    subQuery.serviceName
ORDER BY subQuery.durationNano DESC
LIMIT 1 BY subQuery.traceID
SETTINGS distributed_product_mode = 'allow'
Row 1:
──────
hostname:                              chi-saasmonitor-us-clickhouse-cluster-2-0-0.chi-saasmonitor-us-clickhouse-cluster-2-0.saasmonitor.svc.cluster.local
type:                                  QueryFinish
event_date:                            2024-11-22
event_time:                            2024-11-22 16:08:37
event_time_microseconds:               2024-11-22 16:08:37.581536
query_start_time:                      2024-11-22 16:08:35
query_start_time_microseconds:         2024-11-22 16:08:35.328400
query_duration_ms:                     2253
read_rows:                             78570421
read_bytes:                            3799664488
written_rows:                          0
written_bytes:                         0
result_rows:                           30
result_bytes:                          232872
memory_usage:                          84031909
current_database:                      default
query:                                 SELECT
    subQuery.serviceName,
    subQuery.name,
    count() AS span_count,
    subQuery.durationNano,
    subQuery.traceID AS traceID
FROM signoz_traces.distributed_signoz_index_v3
INNER JOIN
(
    SELECT *
    FROM
    (
        SELECT
            traceID,
            durationNano,
            serviceName,
            name
        FROM signoz_traces.signoz_index_v3
        WHERE (parentSpanID = '') AND ((timestamp >= '1732281648000000000') AND (timestamp <= '1732287658000000000')) AND ((ts_bucket_start >= 1732281648) AND (ts_bucket_start <= 1732287658)) AND (resource_fingerprint GLOBAL IN (
            SELECT fingerprint
            FROM signoz_traces.distributed_traces_v3_resource
            WHERE (seen_at_ts_bucket_start >= 1732026658) AND (seen_at_ts_bucket_start <= 1732287658) AND (simpleJSONExtractString(labels, 'service.name') IN ['nginx-ui']) AND (labels LIKE '%"service.name":"nginx-ui"%')
        ))
        ORDER BY durationNano DESC
        LIMIT 1 BY traceID
        LIMIT 10
    ) AS inner_subquery
) AS subQuery ON signoz_traces.distributed_signoz_index_v3.traceID = subQuery.traceID
WHERE ((timestamp >= '1732281648000000000') AND (timestamp <= '1732287658000000000')) AND ((ts_bucket_start >= 1732281648) AND (ts_bucket_start <= 1732287658))
GROUP BY
    subQuery.traceID,
    subQuery.durationNano,
    subQuery.name,
    subQuery.serviceName
ORDER BY subQuery.durationNano DESC
LIMIT 1 BY subQuery.traceID
SETTINGS distributed_product_mode = 'allow'
formatted_query:
normalized_query_hash:                 11024418275330164455
query_kind:                            Select
databases:                             ['signoz_traces']
tables:                                ['signoz_traces.distributed_signoz_index_v3','signoz_traces.distributed_traces_v3_resource','signoz_traces.signoz_index_v3','signoz_traces.traces_v3_resource']
columns:                               ['signoz_traces.distributed_signoz_index_v3.timestamp','signoz_traces.distributed_signoz_index_v3.trace_id','signoz_traces.distributed_signoz_index_v3.ts_bucket_start','signoz_traces.distributed_traces_v3_resource.fingerprint','signoz_traces.distributed_traces_v3_resource.labels','signoz_traces.distributed_traces_v3_resource.seen_at_ts_bucket_start','signoz_traces.signoz_index_v3.`resource_string_service$$name`','signoz_traces.signoz_index_v3.duration_nano','signoz_traces.signoz_index_v3.name','signoz_traces.signoz_index_v3.parent_span_id','signoz_traces.signoz_index_v3.resource_fingerprint','signoz_traces.signoz_index_v3.timestamp','signoz_traces.signoz_index_v3.trace_id','signoz_traces.signoz_index_v3.ts_bucket_start','signoz_traces.traces_v3_resource.fingerprint','signoz_traces.traces_v3_resource.labels','signoz_traces.traces_v3_resource.seen_at_ts_bucket_start']
partitions:                            ['signoz_traces.signoz_index_v3.20241122','signoz_traces.traces_v3_resource.19700121']
projections:                           []
views:                                 []
exception_code:                        0
exception:
stack_trace:
is_initial_query:                      1
user:                                  default
query_id:                              39c5f0ed-429e-47aa-8fca-9b43a5ef6a30
address:                               ::ffff:127.0.0.1
port:                                  43740
initial_user:                          default
initial_query_id:                      39c5f0ed-429e-47aa-8fca-9b43a5ef6a30
initial_address:                       ::ffff:127.0.0.1
initial_port:                          43740
initial_query_start_time:              2024-11-22 16:08:35
initial_query_start_time_microseconds: 2024-11-22 16:08:35.328400
interface:                             1
is_secure:                             0
os_user:
client_hostname:                       chi-saasmonitor-us-clickhouse-cluster-2-0-0.chi-saasmonitor-us-clickhouse-cluster-2-0.saasmonitor.svc.cluster.local
client_name:                           ClickHouse client
client_revision:                       54466
client_version_major:                  24
client_version_minor:                  1
client_version_patch:                  2
http_method:                           0
http_user_agent:
http_referer:
forwarded_for:
quota_key:
distributed_depth:                     0
revision:                              54482
log_comment:
thread_ids:                            [741,1245,761,1243,740,903,848,1242,886,689,850,744,734,931,834,1327,736,914,717,814,1257,666,842,645,893,665,817,882,885,1258,864,667,764,770,929,872,678,857,827,730,769,2442,669,866,894,1336,745,648,751,659,852,658,846,752,721,918,724,713,765,771,1856,674,671,868,1866,702,984,828,925,731,825,909,1588,1259,758,1340,711,908,1296,879,1584,1839,657,1589,878,831,773,919,822,897,738,1320,912,1591,1585,1084,690,884,1854,874,777,824,662,1341,987]
peak_threads_usage:                    34
ProfileEvents:                         {'Query':1,'SelectQuery':1,'QueriesWithSubqueries':11,'SelectQueriesWithSubqueries':11,'FileOpen':34,'ReadBufferFromFileDescriptorReadBytes':462576798,'ReadCompressedBytes':455090799,'CompressedReadBufferBlocks':6845,'CompressedReadBufferBytes':1259690948,'UncompressedCacheMisses':5,'UncompressedCacheWeightLost':133792,'OpenedFileCacheHits':65,'OpenedFileCacheMisses':34,'OpenedFileCacheMicroseconds':194,'IOBufferAllocs':205,'IOBufferAllocBytes':26577188,'ArenaAllocChunks':80,'ArenaAllocBytes':327680,'FunctionExecute':8399,'MarkCacheHits':95,'MarkCacheMisses':2,'CreatedReadBufferOrdinary':99,'DiskReadElapsedMicroseconds':20909423,'NetworkReceiveElapsedMicroseconds':3083,'NetworkSendElapsedMicroseconds':2980,'NetworkReceiveBytes':251424,'NetworkSendBytes':381338,'DistributedConnectionTries':4,'DistributedConnectionUsable':4,'SuspendSendingQueryToShard':4,'SelectedParts':5,'SelectedRanges':35,'SelectedMarks':3340,'SelectedRows':78570421,'SelectedBytes':3799664488,'WaitMarksLoadMicroseconds':1168,'LoadedMarksCount':7,'LoadedMarksMemoryBytes':208,'ContextLock':418,'RWLockAcquiredReadLocks':16,'PartsLockHoldMicroseconds':158,'RealTimeMicroseconds':60807182,'UserTimeMicroseconds':3097331,'SystemTimeMicroseconds':635124,'SoftPageFaults':9705,'OSCPUWaitMicroseconds':30559,'OSCPUVirtualTimeMicroseconds':3730683,'OSReadBytes':298467328,'OSReadChars':464093517,'OSWriteChars':22592,'QueryProfilerRuns':62,'ThreadPoolReaderPageCacheHit':1464,'ThreadPoolReaderPageCacheHitBytes':148541648,'ThreadPoolReaderPageCacheHitElapsedMicroseconds':54368,'ThreadPoolReaderPageCacheMiss':2404,'ThreadPoolReaderPageCacheMissBytes':314035150,'ThreadPoolReaderPageCacheMissElapsedMicroseconds':20855055,'SynchronousReadWaitMicroseconds':21354140}
Settings:                              {'max_threads':'16','connect_timeout_with_failover_ms':'1000','distributed_aggregation_memory_efficient':'1','log_queries':'1','distributed_product_mode':'allow','parallel_view_processing':'1','allow_nondeterministic_mutations':'1','allow_experimental_window_functions':'1','default_database_engine':'Ordinary'}
used_aggregate_functions:              ['count']
used_aggregate_function_combinators:   []
used_database_engines:                 []
used_data_type_families:               ['Int64','FixedString','AggregateFunction','DateTime','DateTime64','String','LowCardinality','UInt64','Enum8']
used_dictionaries:                     []
used_formats:                          []
used_functions:                        ['equals','like','simpleJSONExtractString','greaterOrEquals','globalIn','in','lessOrEquals','globalInIgnoreSet','divide','and']
used_storages:                         []
used_table_functions:                  []
used_row_policies:                     []
transaction_id:                        (0,0,'00000000-0000-0000-0000-000000000000')
query_cache_usage:                     None
asynchronous_read_counters:            {}

@srikanthccv
Member

FYI: You should use initial_query_id when searching across the shards.
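A sketch of what such a cross-shard lookup could look like, written as a Go helper that builds the SQL. The function name `queryLogLookup`, the selected columns, and the cluster name passed in are assumptions for illustration; `clusterAllReplicas` and the `system.query_log` columns used are standard ClickHouse. Production code should prefer driver-level parameter binding over string formatting.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteEscaper escapes backslashes and single quotes so a value can be
// embedded in a ClickHouse string literal.
var quoteEscaper = strings.NewReplacer(`\`, `\\`, `'`, `\'`)

// queryLogLookup is a hypothetical helper. It builds a query that reads
// system.query_log on every replica of the given cluster and keeps only
// the entries spawned by one initial query, identified by initial_query_id.
func queryLogLookup(cluster, initialQueryID string) string {
	return fmt.Sprintf(
		"SELECT hostname, type, query_id, is_initial_query, query_duration_ms, read_rows "+
			"FROM clusterAllReplicas('%s', system.query_log) "+
			"WHERE initial_query_id = '%s' "+
			"ORDER BY event_time_microseconds",
		quoteEscaper.Replace(cluster), quoteEscaper.Replace(initialQueryID),
	)
}
```

For the entry shown above, the call would be `queryLogLookup("cluster", "39c5f0ed-429e-47aa-8fca-9b43a5ef6a30")`, where `"cluster"` stands in for the real cluster name. Filtering on `initial_query_id` rather than `query_id` matters because each shard logs its sub-query under its own `query_id`, while `initial_query_id` is shared by all of them.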

Contributor

@ellipsis-dev ellipsis-dev bot left a comment


👍 Looks good to me! Incremental review on 60d663e in 29 seconds

More details
  • Looked at 27 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. pkg/query-service/app/traces/v4/query_builder_test.go:534
  • Draft comment:
    The test case for PanelTypeTrace includes the new setting max_memory_usage=10000000000, but ensure that similar settings are applied to other relevant test cases, such as for PanelTypeList, if applicable.
  • Reason this comment was not posted:
    Comment did not seem useful.

Workflow ID: wflow_LJp5bHyskzDBi4yd



@nityanandagohain merged commit 0c2a15d into develop on Nov 22, 2024
15 checks passed
@nityanandagohain deleted the fix/issue_6515 branch on November 22, 2024 17:18