You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Add frequently used Tag Infos as strongly-typed fields or as part of a bit-field mask
In GR4 we use Tag as a standardised form to annotate streams, indicate trigger, context changes, or settings changes that should be downstream propagated within a graph. The initial design assumption was that Tags are rare compared to the payload stream frequency they are attached to (e.g. one tag every ~1024 samples or more). Pushing the envelope a bit further and publishing, for example (worst case), a tag on every float sample, we noticed that the event handling rate goes down to ~300 kMsgs/s. Semën Lebedev wrote a new performance monitor block that can be inserted anywhere in the graph and track the throughput performance and memory consumption: #358
For the typical use case, the given Tag[Source, Monitor, Sink] graph topology (blocks are non-trivial) can be run up to about 500 MS/s (and Tags roughly every 1k samples) but drops down to about 30 kMsg/s when there is a tag attached to every sample. This is probably acceptable for most real-world use cases. Still, we felt that this could be further improved as the main bottleneck turned out to be the access of Tag meta information in the property_map (which internally is a std::map<std::string, pmt::pmtv>) and often just to check if there is an end-of-stream (EoS) tag in the stream. The frequent map accesses make the processing slow.
Thus, we tinkered and played with the idea of modifying the existing Tagdefinition from:
structalignas(hardware_constructive_interference_size) Tag {
using signed_index_type = std::make_signed_t<std::size_t>;
signed_index_type index{0};
property_map map{};
// ...
};
to a slightly larger structure where certain common fields and key-value pairs are hard-coded, namely:
using property_map = std::map<std::string, pmt::pmtv>;
inlineconstexpr std::size_tkTagEndOfStream = 1UZ;
inlineconstexpr std::size_tkTagTrigger = 2UZ;
inlineconstexpr std::size_tkTagEmptyMap = 4UZ;
inlineconstexpr std::size_tkTagContext = 8UZ;
inlineconstexpr std::size_tkTagAutoForwardParameter = 16UZ;
// ...structalignas(std::hardware_constructive_interference_size) Tag { // 192 bytesusing index_type = std::size_t; // N.B. upcoming change from signed->unsigned (to fix UB)
index_type _index{0};
std::size_t mask; // new: bit-mask storing common states, definitions see above
std::uint64_t trigger_time; // new, was earlier part of map (parsing overhead)float trigger_offset; // new
pmt::pmtv trigger_name; // new alt: std::string
pmt::pmtv ctx; // new, was earlier part of map (parsing overhead)
property_map map{}; // as before
};
The rationale behind the proposed changes is to minimise the access times to the using property_map = std::map<std::string, pmt::pmtv>; that are non-negligible in case there is a tag, for example, on every sample. See for instance, https://quick-bench.com/q/P7GaIXJLAJfnhCUT_29J6J7S-LY
The other update is how to generalise the tag forwarding using meta-programming. We can provide default implementations for the first ones:
enumclassTagPropagationPolicy {
TPP_DONT = 0, /*!< Scheduler doesn't propagate tags from in- to output. The block itself is free to insert tags. */
TPP_ALL_TO_ALL = 1, /*!< Propagate tags from all in- to all outputs. The scheduler takes care of that. */
TPP_ONE_TO_ONE = 2, /*!< Propagate tags from n. input to n. output. Requires same number of in- and outputs */
TPP_CUSTOM = 3/*!< Like TPP_DONT, but signals the block, it should implement application-specific forwarding behaviour. */
};
But we hit too many corner cases for the last one. Meta-programming read: C++ only, and/or it would be hard to wrap/write reasonably performing Python wrappers for it. We can post the exact signature later, but in a nutshell, it's similar to the for-each-port method, but it provides both input and output streaming ports.
I hope the design above is reasonable. There are the following main items to be discussed:
While the bit-mask should be used sparingly (for future upgrade options), are there other high-frequency standard tag types that should be defined right away?
We suggest to also explicitly add the trigger tuple information for type-safety, access speed and to ensure that always all three variants are declared (we discovered that it's easy to miss/mistype/forget one of them).
Sub-question: trigger_name is presently a std::string but we thought of also allowing a pmt::pmtv type for a more economic, for example, int-based comparisons.
We suggest also explicitly hard-coding the context ctx as key, as this drives a lot of the multiplexing and settings updates. Similar open question to the trigger: std::string and/or pmt::pmtv.
The text was updated successfully, but these errors were encountered:
This Tag struct is copyable, movable, destructible, and explicitly cache-aligned (192 bytes on most 64-bit machines). The pmt::pmtv type is used for trigger_name and ctx to facilitate faster comparisons, typically leveraging integer-based operations, rather than minimizing the size of the Tag struct.
There is a test-branch implementing this feature.
Initial perf tests using the one-tag-per-sample didn't improve. The assumed tag-processing bottleneck may be somewhere else.
Will leave this open for some time for others to give it a try/diagnose the real bottlenecks and then close it otherwise.
Add frequently used
Tag
Infos as strongly-typed fields or as part of a bit-field maskIn GR4 we use
Tag
as a standardised form to annotate streams, indicate trigger, context changes, or settings changes that should be downstream propagated within a graph. The initial design assumption was thatTag
s are rare compared to the payload stream frequency they are attached to (e.g. one tag every ~1024 samples or more). Pushing the envelope a bit further and publishing, for example (worst case), a tag on every float sample, we noticed that the event handling rate goes down to ~300 kMsgs/s. Semën Lebedev wrote a new performance monitor block that can be inserted anywhere in the graph and track the throughput performance and memory consumption: #358For the typical use case, the given Tag[Source, Monitor, Sink] graph topology (blocks are non-trivial) can be run up to about 500 MS/s (and Tags roughly every 1k samples) but drops down to about 30 kMsg/s when there is a tag attached to every sample. This is probably acceptable for most real-world use cases. Still, we felt that this could be further improved as the main bottleneck turned out to be the access of Tag meta information in the
property_map
(which internally is astd::map<std::string, pmt::pmtv>
) and often just to check if there is an end-of-stream (EoS) tag in the stream. The frequent map accesses make the processing slow.Thus, we tinkered and played with the idea of modifying the existing
Tag
definition from:to a slightly larger structure where certain common fields and key-value pairs are hard-coded, namely:
The rationale behind the proposed changes is to minimise the access times to the
using property_map = std::map<std::string, pmt::pmtv>;
that are non-negligible in case there is a tag, for example, on every sample. See for instance, https://quick-bench.com/q/P7GaIXJLAJfnhCUT_29J6J7S-LYThe other update is how to generalise the tag forwarding using meta-programming. We can provide default implementations for the first ones:
But we hit too many corner cases for the last one. Meta-programming read: C++ only, and/or it would be hard to wrap/write reasonably performing Python wrappers for it. We can post the exact signature later, but in a nutshell, it's similar to the for-each-port method, but it provides both input and output streaming ports.
I hope the design above is reasonable. There are the following main items to be discussed:
trigger_name
is presently astd::string
but we thought of also allowing apmt::pmtv
type for a more economic, for example, int-based comparisons.ctx
as key, as this drives a lot of the multiplexing and settings updates. Similar open question to the trigger:std::string
and/orpmt::pmtv
.The text was updated successfully, but these errors were encountered: