QC-1248 Document QO->Flag conversion #2481

Merged 3 commits on Nov 25, 2024

3 changes: 3 additions & 0 deletions README.md
@@ -38,6 +38,7 @@ For a general overview of our (O2) software, organization and processes, please
* [Check](doc/ModulesDevelopment.md#check)
* [Configuration](doc/ModulesDevelopment.md#configuration)
* [Implementation](doc/ModulesDevelopment.md#implementation)
* [Results](doc/ModulesDevelopment.md#results)
* [Quality Aggregation](doc/ModulesDevelopment.md#quality-aggregation)
* [Quick try](doc/ModulesDevelopment.md#quick-try)
* [Configuration](doc/ModulesDevelopment.md#configuration-1)
@@ -76,6 +77,8 @@ For a general overview of our (O2) software, organization and processes, please
* [Critical, resilient and non-critical tasks](doc/Advanced.md#critical-resilient-and-non-critical-tasks)
* [QC with DPL Analysis](doc/Advanced.md#qc-with-dpl-analysis)
* [Uploading objects to QCDB](doc/Advanced.md#uploading-objects-to-qcdb)
* [Propagating Check results to RCT in Bookkeeping](doc/Advanced.md#propagating-check-results-to-rct-in-bookkeeping)
* [Conversion details](doc/Advanced.md#conversion-details)
* [Solving performance issues](doc/Advanced.md#solving-performance-issues)
* [Dispatcher](doc/Advanced.md#dispatcher)
* [QC Tasks](doc/Advanced.md#qc-tasks-1)
68 changes: 68 additions & 0 deletions doc/Advanced.md
@@ -22,6 +22,8 @@ Advanced topics
* [Critical, resilient and non-critical tasks](#critical-resilient-and-non-critical-tasks)
* [QC with DPL Analysis](#qc-with-dpl-analysis)
* [Uploading objects to QCDB](#uploading-objects-to-qcdb)
* [Propagating Check results to RCT in Bookkeeping](#propagating-check-results-to-rct-in-bookkeeping)
* [Conversion details](#conversion-details)
* [Solving performance issues](#solving-performance-issues)
* [Dispatcher](#dispatcher)
* [QC Tasks](#qc-tasks-1)
@@ -585,6 +587,72 @@ the directories listed in the logs:
Notice that by default the executable will ignore the directory structure in the input file and upload all objects to one directory.
If you need the directory structure preserved, add the argument `--preserve-directories`.

## Propagating Check results to RCT in Bookkeeping

The framework can propagate Quality Objects (QOs) produced by Checks and Aggregators to the Run Condition Table (RCT) in Bookkeeping.
The synchronisation is done once, at the end of the workflow runtime, i.e. at the End of Run or in the last stage of QC merging on Grid.
Propagation can be enabled by adding the following key-value pair to the Check/Aggregator configuration:
```json
"exportToBookkeeping": "true"
```
Using it for Aggregators is discouraged, as the information on which exact Check failed is lost or at least obfuscated.

Check results are converted into Flags, which are documented in [O2/DataFormats/QualityControl](https://github.com/AliceO2Group/AliceO2/tree/dev/DataFormats/QualityControl).
Information about the object validity is preserved, which allows for time-based flagging of good/bad data.
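
As a rough illustration of time-based flagging, the sketch below decides whether data at a given timestamp is usable from a list of Flags with validity intervals. `TimedFlag` and `isDataUsable` are hypothetical names invented for this example and are not part of the framework. It assumes half-open validity intervals and that "bad" Flags always win, as stated in the conversion details.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a Flag with a validity interval (not a framework class).
struct TimedFlag {
  int id;              // Flag type ID
  bool bad;            // true if this Flag type marks data as not usable
  std::uint64_t start; // validity start, inclusive (assumed convention)
  std::uint64_t end;   // validity end, exclusive (assumed convention)
};

// Data at time t is considered usable unless at least one bad Flag covers t.
// Good Flags never rescue data that is also covered by a bad Flag.
bool isDataUsable(const std::vector<TimedFlag>& flags, std::uint64_t t)
{
  for (const auto& flag : flags) {
    assert(flag.start <= flag.end);
    if (flag.bad && flag.start <= t && t < flag.end) {
      return false;
    }
  }
  return true;
}
```

With a bad Flag valid over `[100, 200)`, `isDataUsable` returns `false` for `t = 150` and `true` for `t = 250`, regardless of any good Flags covering the same period.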

### Conversion details

Below we describe some details of how the conversion is done.
In the figures, Good QOs are marked in green, Medium QOs in orange, Bad QOs in red and Null QOs in purple.

- **Good QOs with no Flags associated are not converted to any Flags.**
According to the preliminary design for Data Tagging, "bad" Flags always win, thus there is no need for explicit "good" Flags.
It also implies that there is no need to explicitly add a Good Flag to a Good Quality.

![](images/qo_flag_conversion_01.svg)

- **Bad and Medium QOs with no Flags are converted to Flag 14 (Unknown).**
This means that Medium Quality data is by default bad for Analysis.

![](images/qo_flag_conversion_02.svg)

- **Null QOs with no Flags are converted to Flag 1 (Unknown Quality).**

![](images/qo_flag_conversion_03.svg)

- **All QOs with Flags are converted to Flags, while the Quality is ignored.**
As a consequence, one can customize the meaning of any Quality (Medium in particular) in terms of data usability.
A warning is printed if a Check associates a good Flag to bad Quality or a bad Flag to good Quality.

![](images/qo_flag_conversion_04.svg)

- **Timespans not covered by a given QO are filled with Flag 1 (Unknown Quality).**
In other words, if an object was missing during a part of the run, we can state that the data quality is not known.

![](images/qo_flag_conversion_05.svg)

- **Overlapping or adjacent Flags with the same ID, comment and source (QO name) are merged.**
This happens even if they were associated with different Qualities, e.g. Bad and Medium.
The order of Flag arrival does not matter.

![](images/qo_flag_conversion_06.svg)
![](images/qo_flag_conversion_07.svg)

- **Flag 1 (Unknown Quality) is overwritten by any other Flag.**
This allows us to return Null Quality when there are not enough statistics to determine data quality; the resulting Flag 1 is then overwritten once we can return Good/Medium/Bad.

![](images/qo_flag_conversion_08.svg)

- **Good and Bad Flags do not affect each other; they may coexist.**

![](images/qo_flag_conversion_09.svg)

- **Flags for different QOs (QO names) do not affect each other.
Flag 1 (Unknown Quality) is added separately for each.**

![](images/qo_flag_conversion_10.svg)
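
The per-QO mapping and the merging rule above can be sketched in a few lines of C++. This is a toy model and not the framework's implementation: `Quality`, `Flag`, `QualityObject`, `convert` and `merge` are hypothetical stand-ins written for this example, and validity intervals are assumed to be half-open so that "adjacent" means one Flag starts exactly where the previous one ends. Filling uncovered timespans with Flag 1 and overwriting Flag 1 with other Flags are left out for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the framework's classes.
enum class Quality { Null, Good, Medium, Bad };

struct Flag {
  int id;              // Flag type ID
  std::string comment; // free-text comment
  std::string source;  // name of the QO the Flag comes from
  std::uint64_t start; // validity start, inclusive (assumed)
  std::uint64_t end;   // validity end, exclusive (assumed)
};

struct QualityObject {
  std::string name;
  Quality quality;
  std::uint64_t start;
  std::uint64_t end;
  std::vector<std::pair<int, std::string>> flags; // (Flag ID, comment)
};

constexpr int kUnknownQuality = 1; // Flag 1 (Unknown Quality)
constexpr int kUnknown = 14;       // Flag 14 (Unknown)

// Converts one QO into Flags which inherit the QO validity.
std::vector<Flag> convert(const QualityObject& qo)
{
  assert(qo.start <= qo.end);
  std::vector<Flag> out;
  if (!qo.flags.empty()) {
    // QOs with Flags: the Flags are taken as they are, the Quality is ignored.
    for (const auto& idAndComment : qo.flags) {
      out.push_back({idAndComment.first, idAndComment.second, qo.name, qo.start, qo.end});
    }
    return out;
  }
  switch (qo.quality) {
    case Quality::Good: // no explicit Flag is needed for Good
      break;
    case Quality::Medium:
    case Quality::Bad:
      out.push_back({kUnknown, "", qo.name, qo.start, qo.end});
      break;
    case Quality::Null:
      out.push_back({kUnknownQuality, "", qo.name, qo.start, qo.end});
      break;
  }
  return out;
}

// Merges overlapping or adjacent Flags which share the ID, comment and source.
// Sorting first makes the result independent of the order of Flag arrival.
std::vector<Flag> merge(std::vector<Flag> flags)
{
  std::sort(flags.begin(), flags.end(), [](const Flag& a, const Flag& b) {
    return std::tie(a.source, a.id, a.comment, a.start) < std::tie(b.source, b.id, b.comment, b.start);
  });
  std::vector<Flag> merged;
  for (const auto& flag : flags) {
    if (!merged.empty()) {
      Flag& last = merged.back();
      const bool sameKey = last.id == flag.id && last.comment == flag.comment && last.source == flag.source;
      if (sameKey && flag.start <= last.end) {
        last.end = std::max(last.end, flag.end);
        continue;
      }
    }
    merged.push_back(flag);
  }
  return merged;
}
```

For example, a Medium QO valid over `[0, 10)` and a Bad QO valid over `[10, 20)` with the same name and no Flags each yield Flag 14, and `merge` combines them into a single Flag 14 valid over `[0, 20)`.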

# Solving performance issues

Problems with performance in message passing systems like QC usually manifest in backpressure seen in input channels of processes which are too slow.
16 changes: 14 additions & 2 deletions doc/ModulesDevelopment.md
@@ -22,6 +22,7 @@
* [Check](#check)
* [Configuration](#configuration)
* [Implementation](#implementation)
* [Results](#results)
* [Quality Aggregation](#quality-aggregation)
* [Quick try](#quick-try)
* [Configuration](#configuration-1)
@@ -300,7 +301,8 @@ A Check is a function (actually `Check::check()`) that determines the quality of
"type": "Task",
"name": "QcTask",
"MOs": ["example", "other"]
}],
"exportToBookkeeping": "false"
},
"QcCheck": {
...
@@ -323,6 +325,7 @@ A Check is a function (actually `Check::check()`) that determines the quality of
* _type_ - currently only _Task_ and _ExternalTask_ are supported
* _name_ - name of the _Task_
* _MOs_ - list of MonitorObject names; it can be omitted to mean that all objects should be taken.
* __exportToBookkeeping__ - if enabled, the results of this Check are propagated to Bookkeeping, where they are visualized as time-based Flags (disabled by default).

### Implementation
After the creation of the module described in the above section, every Check functionality requires a separate implementation. The module might implement several Check classes.
@@ -333,12 +336,21 @@ void beautify(std::shared_ptr<MonitorObject> mo, Quality = Quality::Null) {}

```

The `check()` function is called whenever the _policy_ is satisfied. It gets a map with all declared MonitorObjects.
It is expected to return Quality of the given MonitorObjects.
Optionally one can associate one or more Flags to a Quality by using `addFlag` on it.
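
The idea of attaching a Flag to a returned Quality can be sketched as follows. This is a self-contained toy and not the framework's API: the `Flag` and `Quality` structs, the `addFlag` signature and the `checkMean` function are hypothetical stand-ins written for this example, and the real classes and signatures may differ. Flag ID 14 is used only because it is the "Unknown" Flag mentioned in the conversion details.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; the framework classes differ.
struct Flag {
  int id;              // Flag type ID
  std::string comment; // free-text explanation
};

struct Quality {
  std::string name;        // "Good", "Medium", "Bad" or "Null"
  std::vector<Flag> flags; // Flags associated with this Quality

  // Mimics the idea of `addFlag`; the real signature is not reproduced here.
  Quality& addFlag(int id, std::string comment)
  {
    flags.push_back({id, std::move(comment)});
    return *this;
  }
};

// A toy check: the mean must stay within a tolerance of the expected value.
// A Bad result carries a Flag which explains why the data was rejected.
Quality checkMean(double mean, double expected, double tolerance)
{
  assert(tolerance >= 0.0);
  const double deviation = mean > expected ? mean - expected : expected - mean;
  if (deviation <= tolerance) {
    return Quality{"Good", {}};
  }
  Quality result{"Bad", {}};
  result.addFlag(14, "mean outside of the tolerated range");
  return result;
}
```

Here `checkMean(2.0, 1.0, 0.1)` yields a Bad Quality with one Flag attached, while `checkMean(1.0, 1.05, 0.1)` yields a Good Quality without Flags.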

For each MO or group of MOs, `beautify()` is invoked after `check()` if:
1. `check()` did not raise an exception
2. there is a single `dataSource` in the configuration of the check

### Results

Checks return Qualities with associated Flags.
The framework wraps them in a QualityObject, makes it available to Aggregators (see the next section) and stores it in the repository.
It is also possible to propagate Check results to the Run Condition Table (RCT) in Bookkeeping.
Details are explained at [Propagating Check results to RCT in Bookkeeping](Advanced.md#propagating-check-results-to-rct-in-bookkeeping).

## Quality Aggregation

The _Aggregators_ are able to collect the QualityObjects produced by the checks or other _Aggregators_ and to produce new Qualities. This is especially useful to determine the overall quality of a detector or a set of detectors.