diff --git a/doc/bulk_counter/bulk_counter.md b/doc/bulk_counter/bulk_counter.md
index c3e963ad1e..fd63831d03 100644
--- a/doc/bulk_counter/bulk_counter.md
+++ b/doc/bulk_counter/bulk_counter.md
@@ -7,6 +7,7 @@
 | Rev | Date | Author | Change Description |
 |:---:|:-----------:|:------------------:|-----------------------------------|
 | 0.1 | | Junchao Chen | Initial version |
+| 0.2 | Dec 6, 2024 | Stephen Sun | Support setting bulk chunk size |

 ### Scope

@@ -35,18 +36,34 @@ SONiC flex counter infrastructure shall utilize bulk stats API to gain better pe
 - In phase 1, the change is limited to syncd only, no CLI/swss change. Syncd shall deduce the bulk stats mode according to the stats mode defined in FLEX DB:
   - SAI_STATS_MODE_READ -> SAI_STATS_MODE_BULK_READ
   - SAI_STATS_MODE_READ_AND_CLEAR -> SAI_STATS_MODE_BULK_READ_AND_CLEAR
+- Support setting the bulk chunk size for the whole counter group or for a subset of counters.
+
+  Polling a group of counters for all ports in a single bulk counter polling API call can be time consuming. This can cause the polling of a time-sensitive counter group to miss its deadline when both counter groups compete for a critical section in the vendor's SAI/SDK.
+  To address the issue, the bulk counter polling can be split into smaller chunks. Furthermore, different counters within the same counter group can use different chunk sizes.
+
+  All the counters of all ports are still polled in each interval, but through many smaller bulk counter polling API calls. Each call completes faster, which gives the time-sensitive counter group a better chance to be scheduled on time.
+- Provide an accurate timestamp when counters are polled.
+
+  Currently, the timestamps are collected in the Lua plugin for time-sensitive counter groups, like PFC watchdog. However, there can be a gap between the time when the counters were polled and the time when the timestamps were collected.
+  We can collect timestamps immediately after polling counters in sairedis and push them into the COUNTER_DB.

 ### Architecture Design

-For each counter group, different statistic type is allowed to choose bulk or non-bulk API based on vendor SAI implementation.
+For each counter group
+
+1. each statistic type may use the bulk or the non-bulk API, depending on the vendor SAI implementation.
+2. the bulk chunk size can be configured for the whole group or for a set of counters in the group.

-![architecture](/doc/bulk_counter/bulk_counter.svg).
+![architecture](bulk_counter.svg).

-> Note: In the picture, pg/queue watermark statistic use bulk API and buffer watermark statistic uses non-bulk API. This is just an example to show the design idea.
+> Note: In the picture,
+> 1. PG/queue watermark statistics use the bulk API and buffer watermark statistics use the non-bulk API.
+> 2. Port statistic counters are split into smaller chunks: the IF_OUT_QLEN counter is polled for all ports in one call, and the remaining counters are polled per 32-port group.
+> 3. This is just an example to show the design idea.

 ### High-Level Design

-Changes shall be made to sonic-sairedis to support this feature. No CLI change. No DB schema change.
+Changes shall be made to sonic-sairedis to support this feature. No CLI change.

 > Note: Code present in this design document is only for demonstrating the design idea, it is not production code.

@@ -73,6 +90,8 @@ struct BulkStatsContext
     std::vector counter_ids;
     std::vector object_statuses;
     std::vector counters;
+    std::string name;
+    uint32_t default_bulk_chunk_size;
 };
 ```
 - object_type: object type.
@@ -81,6 +100,8 @@ struct BulkStatsContext
 - counter_ids: SAI statistic IDs that will be queried/cleared by the bulk call.
 - object_statuses: SAI bulk API return value for each object.
 - counters: counter values that will be filled by vendor SAI.
+- name: name of the context, used when pushing the accurate timestamp into the COUNTER_DB.
+- default_bulk_chunk_size: the bulk chunk size of this context.
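
As an illustration of how a per-context bulk chunk size might be applied, the sketch below splits the objects of one bulk context into index ranges, one range per bulk call. It is a minimal sketch under stated assumptions, not code from sonic-sairedis: the helper name `splitIntoChunks` is hypothetical, and treating a chunk size of 0 as "poll all objects in one call" follows this document's description of a chunk size of 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper, not part of sonic-sairedis.
// Returns half-open index ranges [begin, end) that together cover
// object_count objects; each range would be polled by one bulk call.
std::vector<std::pair<std::size_t, std::size_t>> splitIntoChunks(
    std::size_t object_count, std::uint32_t bulk_chunk_size)
{
    std::vector<std::pair<std::size_t, std::size_t>> ranges;

    // Assumption taken from the design text: a chunk size of 0 means
    // "do not split", so a single call covers every object.
    const std::size_t step = (bulk_chunk_size == 0)
                                 ? object_count
                                 : static_cast<std::size_t>(bulk_chunk_size);

    std::size_t begin = 0;
    while (begin < object_count)
    {
        // The last chunk may be shorter than the configured size.
        const std::size_t end = begin + std::min(step, object_count - begin);
        assert(end > begin); // step is non-zero here, so each chunk makes progress
        ranges.emplace_back(begin, end);
        begin = end;
    }
    return ranges;
}
```

For example, 80 ports with a chunk size of 32 give the ranges [0, 32), [32, 64) and [64, 80), so three bulk calls cover all ports.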
 The flow of how to update the bulk context will be discussed in the following section.

@@ -92,17 +113,47 @@ std::map, BulkStatsContext> m_portBulkContexts;
 ```

+##### Set bulk chunk size for a counter group and per counter IDs
+
+The bulk chunk size can be configured for a counter group. Once configured, each bulk call polls the counters of no more than the configured number of ports.
+
+Furthermore, the bulk chunk size can be configured per set of counter IDs, using a string in the format `<COUNTER_NAME_PREFIX>:<bulk_chunk_size>{,<COUNTER_NAME_PREFIX>:<bulk_chunk_size>}`.
+Each `COUNTER_NAME_PREFIX` defines a set of counter IDs by matching the counter IDs against the prefix. All the counter IDs in a set share one bulk chunk size and are polled in a series of bulk counter polling API calls that use the same set of counter IDs but different sets of ports.
+All such sets of counter IDs form a partition of the counter IDs of the flex counter group. The partition of a flex counter group is represented by the keys of the map `m_portBulkContexts`.
+
+To simplify the logic, changing the partition is not supported: once the counter IDs have been split into subsets, they cannot be split into different subsets.
+
+E.g. `SAI_PORT_STAT_IF_IN_FEC:32,SAI_PORT_STAT_IF_OUT_QLEN:0` means
+
+1. the bulk chunk size of all counter IDs starting with the prefix `SAI_PORT_STAT_IF_IN_FEC` is 32
+2. the bulk chunk size of counter `SAI_PORT_STAT_IF_OUT_QLEN` is 0, which means one bulk call fetches the counter for all ports
+3. the bulk chunk size of the remaining counter IDs is the counter group's bulk chunk size.
+
+The counter IDs are split into a partition consisting of the subsets `{{all FEC counters starting with SAI_PORT_STAT_IF_IN_FEC}, {SAI_PORT_STAT_IF_OUT_QLEN}, {the rest counters}}`.
+The counter IDs in each subset share one bulk chunk size and are polled together.
+
+In the above example, once the bulk chunk size has been set this way, a customer can only change the bulk chunk size of each subset but cannot change the way the subsets are split. E.g.
+
+1. `SAI_PORT_STAT_IF_IN_FEC:16,SAI_PORT_STAT_IF_OUT_QLEN:0` can be used to set the bulk chunk size to 16 for all FEC counters and to 0 for counter `SAI_PORT_STAT_IF_OUT_QLEN`.
+2. `SAI_PORT_STAT_IF_IN_FEC:16,SAI_PORT_STAT_IF_OUT_QLEN:0,SAI_PORT_STAT_ETHER_STATS:64` is not supported because it changes the partition.
+
 ##### Update Bulk Context

 1. New object joins the counter group.

-![Add Object Flow](/doc/bulk_counter/object_join_counter_group.svg).
+![Add Object Flow](object_join_counter_group.svg).

 2. Existing object leaves the counter group; related data shall be removed from the bulk context.

+![Remove Object Flow](object_leave_counter_group.svg).
+
+3. A customer splits the bulk counter polling into smaller chunk sizes per set of counter IDs.
+
+![Set chunk size per counter ID](set_chunk_size_per_counter_ID.svg).
+
 ##### Statistic Collect

-![Collect Counter Flow](/doc/bulk_counter/counter_collect.svg).
+![Collect Counter Flow](counter_collect.svg).

 ### SAI API

@@ -113,7 +164,45 @@ SAI APIs shall be used in this feature:

 ### Configuration and management

-N/A
+#### YANG model Enhancements
+
+##### Yang model of flex counter group
+
+The following new types will be introduced in `container FLEX_COUNTER_TABLE` of the flex counter group
+
+```
+    container sonic-flex_counter {
+        container FLEX_COUNTER_TABLE {
+
+            typedef bulk_chunk_size {
+                type uint32 {
+                    range 0..4294967295;
+                }
+            }
+
+            typedef bulk_chunk_size_per_prefix {
+                type string;
+                description "Bulk chunk size per counter name prefix";
+            }
+
+        }
+    }
+```
+
+In the YANG model, each flex counter group is an independent container. The new leaves are defined in the containers `PG_DROP`, `PG_WATERMARK`, `PORT`, `QUEUE`, `QUEUE_WATERMARK`.
+The update of `PG_DROP` is shown below
+
+```
+    container PG_DROP {
+        /* PG_DROP_STAT_COUNTER_FLEX_COUNTER_GROUP */
+        leaf BULK_CHUNK_SIZE {
+            type bulk_chunk_size;
+        }
+        leaf BULK_CHUNK_SIZE_PER_PREFIX {
+            type bulk_chunk_size_per_prefix;
+        }
+    }
+```

 ### Warmboot and Fastboot Design Impact

@@ -150,3 +239,15 @@ As this feature does not introduce any new function, unit test shall be good eno
 - support bulk with different counter IDs
 - support bulk -> not support bulk
 - not support bulk but counter IDs change
+
+### Appendix
+
+#### Example: how a smaller bulk chunk size helps
+
+![Smaller bulk chunk size](smaller_chunk_size.svg)
+
+This example shows how a smaller bulk chunk size helps the PFC watchdog counter polling thread to be scheduled in time.
+
+In the upper chart, the port counters are polled in a single bulk call, which takes a long time. The PFC watchdog counter polling thread cannot proceed until the long bulk call exits the critical section.
+
+In the lower chart, the port counters are polled in a series of bulk calls with smaller bulk chunk sizes. The PFC watchdog counter polling thread has a better chance of being scheduled in time.
diff --git a/doc/bulk_counter/bulk_counter.svg b/doc/bulk_counter/bulk_counter.svg
index 26023979c9..a3eeccf453 100644
--- a/doc/bulk_counter/bulk_counter.svg
+++ b/doc/bulk_counter/bulk_counter.svg
@@ -1 +1,4 @@
-
[Figure text, previous bulk_counter.svg: Syncd; Counter Thread 1 (pg watermark, queue watermark, buffer watermark); Counter Thread 2; ...; Counter Thread n; Vendor SAI; arrows labelled "bulk api" (twice) and "non bulk api".]
\ No newline at end of file + + + +
[Figure text, new bulk_counter.svg: Syncd; Counter Thread 1 (pg watermark, queue watermark, buffer watermark); Counter Thread 2 (port counter); ...; Counter Thread n; Vendor SAI; arrows labelled "bulk api" (twice) and "non bulk api"; port counter arrows labelled "bulk api on all ports for IF_OUT_QLEN counter", "bulk api on ports 1~32 for rest counters", "bulk api on ports 33~64 for rest counters", "bulk api on ports 65~80 for rest counters".]
\ No newline at end of file
diff --git a/doc/bulk_counter/counter_collect.svg b/doc/bulk_counter/counter_collect.svg
index cc167c4fda..f824020aeb 100644
--- a/doc/bulk_counter/counter_collect.svg
+++ b/doc/bulk_counter/counter_collect.svg
@@ -1 +1,4 @@
-
[Figure text, previous counter_collect.svg (flowchart): For each bulk context; Call sai_object_bulk_get_stats; Success?; Fill Counters DB; Log warning; For each item in object_statuses; Success?; Log error; branches labelled yes/no.]
\ No newline at end of file + + + +
[Figure text, new counter_collect.svg (flowchart): For each bulk context; decision comparing "number of left ports" with "bulk chunk size"; fetch next chunk of ports to port set; fetch all ports to port set; sai_object_bulk_get_stats on the port set; Success?; Fill Counters DB with time stamp; Log warning; For each item in object_statuses; Success?; Log error; All port handled; branches labelled yes/no.]
\ No newline at end of file
diff --git a/doc/bulk_counter/object_join_counter_group.svg b/doc/bulk_counter/object_join_counter_group.svg
index ce5475b421..b2e6b85934 100644
--- a/doc/bulk_counter/object_join_counter_group.svg
+++ b/doc/bulk_counter/object_join_counter_group.svg
@@ -1 +1,4 @@
-
[Figure text, previous object_join_counter_group.svg (flowchart): New entry to FlexCounter table; Extract object ID and counter IDs; Get supported counter IDs; FlexCounter::addCounter; Success?; All Counter IDs support bulk?; Bulk not supported; Find existing bulk context?; Update bulk context; Create bulk context; branches labelled yes/no.]
\ No newline at end of file + + + +
[Figure text, new object_join_counter_group.svg (flowchart): New entry to FlexCounter table; Extract object ID and counter IDs; Get supported counter IDs; FlexCounter::addCounter; Success?; All Counter IDs support bulk?; Bulk not supported; Is bulk chunk size set per counter ID?; Found existing bulk context?; Update bulk context; Create bulk context; split all counter IDs into different subsets; fetch the first subset of counter IDs; does a bulk context exist for the subset?; Update bulk context; Create bulk context; are all subsets of counter IDs handled?; fetch the next subset of counter IDs; Finish; branches labelled yes/no.]
\ No newline at end of file
diff --git a/doc/bulk_counter/object_leave_counter_group.svg b/doc/bulk_counter/object_leave_counter_group.svg
new file mode 100644
index 0000000000..405ae59265
--- /dev/null
+++ b/doc/bulk_counter/object_leave_counter_group.svg
@@ -0,0 +1,4 @@
+
+
+
+
[Figure text, object_leave_counter_group.svg (flowchart): Existing entry leaves FlexCounter table; Extract object ID; FlexCounter::removeCounter; Fetch first bulk context; Does object ID exist in the bulk context?; Remove the object from the bulk context; Have all bulk contexts been handled?; Fetch next bulk context; Finish; branches labelled yes/no.]
\ No newline at end of file
diff --git a/doc/bulk_counter/set_chunk_size_per_counter_ID.svg b/doc/bulk_counter/set_chunk_size_per_counter_ID.svg
new file mode 100644
index 0000000000..499414fd13
--- /dev/null
+++ b/doc/bulk_counter/set_chunk_size_per_counter_ID.svg
@@ -0,0 +1,4 @@
+
+
+
+
[Figure text, set_chunk_size_per_counter_ID.svg (flowchart): New chunk size per counter ID prefix; Is there any bulk context?; Is the partition changed?; Only one subset containing all counter IDs in the partition?; Unsupported; Update bulk chunk size for each bulk context; Fetch first counter in the bulk context; Does the counter ID match any prefix?; Add the counter ID to the default subset; Fetch first counter prefix; Does counter match the prefix?; Does subset of the prefix exist?; Create a new subset; set bulk chunk size of the new subset; Add the counter ID to the subset; Have all prefixes been handled?; Fetch next counter prefix; Have all the counters been handled?; Fetch next counter in the bulk context; Create bulk context for each subset with counter ID list and bulk chunk size as args; Finish; branches labelled yes/no.]
\ No newline at end of file
diff --git a/doc/bulk_counter/smaller_chunk_size.svg b/doc/bulk_counter/smaller_chunk_size.svg
new file mode 100644
index 0000000000..0220369417
--- /dev/null
+++ b/doc/bulk_counter/smaller_chunk_size.svg
@@ -0,0 +1,4 @@
+
+
+
+
[Figure text, smaller_chunk_size.svg: two timing charts (Threads vs. Time) showing the port counter polling thread and the PFC watchdog polling thread.
Legend: thread pends on timer waiting for its interval; thread runs outside the critical section; thread runs inside the critical section; thread pends on the critical section.
Upper chart: PFC watchdog is blocked for a longer time when a single bulk counter polling API call polls all port counters.
t0 Port counter polling thread is scheduled to run
t1 Port counter polling thread enters the critical section
t2 PFC watchdog polling thread is scheduled to run
t3 PFC watchdog polling thread pends on the critical section
t4 Port counter polling thread exits the critical section; PFC watchdog polling thread enters the critical section
t5 Port counter polling thread finishes the current period
t6 PFC watchdog polling thread exits the critical section
t7 PFC watchdog polling thread finishes the current period
PFC watchdog polling thread finish time: t7 - t2
Port counter polling thread finish time: t5 - t0
Lower chart: PFC watchdog is blocked for a much shorter time when a series of bulk counter polling API calls is used.
t0 Port counter polling thread is scheduled to run
t1 Port counter polling thread enters the critical section
t2 PFC watchdog polling thread is scheduled to run
t3 PFC watchdog polling thread pends on the critical section
t4 Port counter polling thread exits the critical section; PFC watchdog polling thread enters the critical section
t5 Port counter polling thread pends on the critical section
t6 PFC watchdog polling thread exits the critical section; port counter polling thread enters the critical section
t7 PFC watchdog polling thread finishes the current period
t8 Port counter polling thread finishes the current period
PFC watchdog polling thread finish time: t7 - t2, much shorter
Port counter polling thread finish time: t8 - t0]
\ No newline at end of file