Job metrics retrieval #2031

LuiggiTenorioK · 2024-12-16T10:07:16Z

Purpose: Give the user instant access to user-defined job metrics they want to track when a job finishes in the workflow

Description: As discussed in BSC-ES/autosubmit-api#75, the idea is to give users a way to retrieve metrics that are calculated during the execution of their workflow. Along these lines, the API should provide the user-facing interface, and Autosubmit should handle transferring the metric files from the remote platform to the local machine.

Document with some of the metrics: https://docs.google.com/document/d/12yWDwXsohf4G4MPeP6e3Eil4ZL-YeIN71dBcoWRliEg/edit

Requirements

Autosubmit should get the output file specification from the YAML files
Autosubmit should extract the metric according to the user-defined selector (either the raw content of a text file or a value in a JSON file addressed by a key)
Autosubmit should store the metric read from the output file in the DDBB
Autosubmit should read and store the metric when the job finishes

Acceptance Criteria

Autosubmit correctly identifies the output files, their paths, and how to read the metric value
The metric in the DDBB should be the one expected by the user-defined specification
The metric should be updated once the job finishes

Related issue: BSC-ES/autosubmit-api#75

LuiggiTenorioK · 2024-12-16T10:09:57Z

@mcastril I updated the requirements and acceptance criteria based on what we discussed last Thursday. Feel free to modify it or confirm that we can close the scope of this new feature.

LuiggiTenorioK · 2024-12-16T10:30:59Z

Following the possible design, here is an example of the user flow:

The user will have to define their metrics in the JOBS section like this:

JOBS:
  SETUP:
    METRICS:
      - NAME: model
        PATH: /remote_folder/model.txt
  SIM:
    METRICS:
      - NAME: custom_metric_1
        PATH: /remote_folder/metrics.json
        SELECTOR:
          TYPE: JSON   # Default is TEXT
          KEY: METRIC_1
      - NAME: custom_metric_2
        PATH: /remote_folder/metrics.json
        SELECTOR:
          TYPE: JSON
          KEY: MISC.METRIC_2
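
As a rough sketch of how this specification could be read on the Autosubmit side, the snippet below normalizes the METRICS entries of each job into simple records and applies the TEXT default when no SELECTOR is given. `MetricSpec` and `parse_metric_specs` are hypothetical names used for illustration, not part of Autosubmit, and the dictionary is assumed to be the already-loaded YAML configuration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class MetricSpec:
    """Hypothetical normalized form of one METRICS entry."""
    job_section: str
    name: str
    path: str
    selector_type: str = "TEXT"  # TEXT is the default, as in the YAML comment
    key: Optional[str] = None


def parse_metric_specs(jobs: Dict[str, Any]) -> List[MetricSpec]:
    """Collect the metric specifications of every job section."""
    specs: List[MetricSpec] = []
    for section, job_conf in jobs.items():
        for entry in job_conf.get("METRICS") or []:
            selector = entry.get("SELECTOR") or {}
            specs.append(
                MetricSpec(
                    job_section=section,
                    name=entry["NAME"],
                    path=entry["PATH"],
                    selector_type=str(selector.get("TYPE", "TEXT")).upper(),
                    key=selector.get("KEY"),
                )
            )
    return specs


# Same content as the JOBS example, written as the dict a YAML loader would produce.
jobs_conf = {
    "SETUP": {"METRICS": [{"NAME": "model", "PATH": "/remote_folder/model.txt"}]},
    "SIM": {
        "METRICS": [
            {
                "NAME": "custom_metric_1",
                "PATH": "/remote_folder/metrics.json",
                "SELECTOR": {"TYPE": "JSON", "KEY": "METRIC_1"},
            },
            {
                "NAME": "custom_metric_2",
                "PATH": "/remote_folder/metrics.json",
                "SELECTOR": {"TYPE": "JSON", "KEY": "MISC.METRIC_2"},
            },
        ]
    },
}

specs = parse_metric_specs(jobs_conf)
for spec in specs:
    print(spec)
```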

Once the jobs have finished, /remote_folder/ is expected to contain two files:

model.txt, e.g.:

ICON

metrics.json, e.g.:

{
  "METRIC_1": "HIGH",
  "MISC": {
    "METRIC_2": 640.28
  }
}
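
One possible way to apply the selector to these two files is sketched below: a TEXT selector returns the stripped file content, and a JSON selector walks a dot-separated key such as MISC.METRIC_2. `read_metric` is a hypothetical helper, and treating the dot as a nesting separator is an assumption inferred from this example rather than a confirmed design decision.

```python
import json
from typing import Any, Optional


def read_metric(content: str, selector_type: str = "TEXT", key: Optional[str] = None) -> Any:
    """Extract a metric value from the content of an output file."""
    selector_type = selector_type.upper()
    if selector_type == "TEXT":
        # The whole file is the metric; drop the trailing newline and padding.
        return content.strip()
    if selector_type == "JSON":
        if not key:
            raise ValueError("a JSON selector needs a KEY")
        value = json.loads(content)
        # Assumption: "MISC.METRIC_2" means data["MISC"]["METRIC_2"].
        for part in key.split("."):
            value = value[part]
        return value
    raise ValueError(f"unsupported selector type: {selector_type}")


model_txt = "ICON\n"
metrics_json = '{"METRIC_1": "HIGH", "MISC": {"METRIC_2": 640.28}}'

print(read_metric(model_txt))
print(read_metric(metrics_json, "JSON", "METRIC_1"))
print(read_metric(metrics_json, "JSON", "MISC.METRIC_2"))
```

A missing key raises `KeyError` in this sketch; whether the real implementation should skip the metric or log a warning instead is left open here.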

Later, the DDBB table should look like this:

job_name	metric_name	metric_value
<job_name_prefix>_SETUP	model	ICON
<job_name_prefix>_SIM	custom_metric_1	HIGH
<job_name_prefix>_SIM	custom_metric_2	640.28

At this point, it might be feasible to add a run_id column to version the metrics by run, since that information could be available at retrieval time.
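
A minimal sketch of such a table, using Python's sqlite3 module with an in-memory database. The table name `job_metrics`, the column types, the primary key, and the `store_metric` helper are assumptions for illustration; the example above only fixes the job_name, metric_name, and metric_value columns and proposes run_id as a possible addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE job_metrics (
        run_id       INTEGER NOT NULL,
        job_name     TEXT NOT NULL,
        metric_name  TEXT NOT NULL,
        metric_value TEXT,
        PRIMARY KEY (run_id, job_name, metric_name)
    )
    """
)


def store_metric(conn, run_id, job_name, metric_name, metric_value):
    """Insert a metric, replacing any value already stored for the same run."""
    conn.execute(
        "INSERT OR REPLACE INTO job_metrics "
        "(run_id, job_name, metric_name, metric_value) VALUES (?, ?, ?, ?)",
        (run_id, job_name, metric_name, str(metric_value)),
    )
    conn.commit()


job = "<job_name_prefix>_SIM"  # placeholder, as in the table above
store_metric(conn, 1, job, "custom_metric_2", 640.28)
store_metric(conn, 1, job, "custom_metric_2", 641.0)  # same run: value is replaced
store_metric(conn, 2, job, "custom_metric_2", 700.5)  # new run: new row

rows = conn.execute(
    "SELECT run_id, metric_value FROM job_metrics ORDER BY run_id"
).fetchall()
print(rows)
```

Keying on (run_id, job_name, metric_name) keeps one value per metric and run, so a job that is rerun within the same run overwrites its metric while earlier runs stay available for comparison.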

LuiggiTenorioK self-assigned this Dec 16, 2024

LuiggiTenorioK added the "new feature" label Dec 16, 2024

LuiggiTenorioK linked a pull request Dec 17, 2024 that will close this issue

Feature: Job metrics #2036

Draft
