Added example specification files #4

Closed
wants to merge 1 commit into from


eacharles
Collaborator

examples/config_dc2_test_med.yaml is the top-level specification file and it calls out all the other ones.

examples/notes.txt shows some thoughts on how we might use these to build and run campaigns

@fritzm fritzm force-pushed the tickets/DM-41174 branch 2 times, most recently from e24a21d to 39ce381 Compare October 23, 2023 02:44
Member

@ctslater ctslater left a comment

Added a few comments on naming and clarifying the schema. Functionally it looks good.

# Initial setup of the campaign template. This only happens once per campaign type

# Create a production as a namespace
cm-service add production --name dc2_test_med
Member

I think the user interacts with something like cm-client, and the service is something running elsewhere.

Collaborator Author

Good point. We should rename the CLI that uses the client accordingly.

cm-service add production --name dc2_test_med

# Load the related specification.
cm-service load specification --yaml_file examples/config_dc2_test_med.yaml --production_name dc2_test_med --spec_name v0 --set-as-default
Member

I also think we need a name for the thing referred to here as yaml_file; campaign config? production config? something template?

Collaborator Author

Good point. What we call it is going to depend a bit on the relationship between the blocks in the specification files, the "glue" class that gathers them, and campaigns.

child_config:
  base_query: "instrument='LSSTCam-imSim' and skymap='DC2' and tract in (3828, 3829)"
# Campaign level parameters
data:
Member

Could be more specific, maybe handler_parameters:? extra_handler_parameters? What is the relationship between this and child_config entries?

Collaborator Author

Sorry, do you mean the "data:" block? If so, they are similar, but the idea is that the "data" block contains things that are applicable all the way down the hierarchy, whereas the child_config is only applicable to the next level down.

Collaborator Author

And yes, 'parameters' or 'heritage' or 'parameters_passed_to_children' would be more precise.
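
The distinction drawn in the replies above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Node` class, its `make_child` method, and the nested layout of `child_config` are hypothetical and are not taken from cm-service. The point is that `data` is merged into every descendant, while `child_config` is consumed when building the next level only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Node:
    """Hypothetical hierarchy node (assumption, not the cm-service model)."""

    name: str
    data: Dict[str, Any] = field(default_factory=dict)
    child_config: Dict[str, Any] = field(default_factory=dict)

    def make_child(self, name: str) -> "Node":
        # The parent's child_config says how to build this one child.
        spec = self.child_config.get(name, {})
        # "data" is inherited: parent values first, child overrides on top.
        merged = {**self.data, **spec.get("data", {})}
        # "child_config" is not inherited: the child only receives the
        # entry that was written explicitly for it.
        return Node(name, merged, dict(spec.get("child_config", {})))


campaign = Node(
    "campaign",
    data={"lsst_version": "placeholder_weekly"},  # placeholder value
    child_config={
        "step1": {
            "data": {"pipeline_yaml": "step1.yaml"},  # placeholder value
            "child_config": {"group0": {}},
        }
    },
)
step = campaign.make_child("step1")
group = step.make_child("group0")
print(group.data)          # campaign-level and step-level data both reach the group
print(group.child_config)  # empty: nothing was configured for the group's children
```

Under this reading, renaming `data` to something like `parameters_passed_to_children` would describe the merge step directly.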



# Import the common script templates
- import: "${CM_CONFIGS}/config_standard_scripts.yaml"
Member

If there was a top-level dictionary element like CampaignTemplate, then it could have (dictionary representation just for simplicity):
CampaignTemplate: {"imports": [], "blocks": []}
That would make it so every list contained a consistent type of element.

Collaborator Author

I'm not sure I understand how that would work with multiple levels of imports. Or would the idea be not to have multiple levels of imports?

Member

@ctslater ctslater Nov 7, 2023

I'm not sure what multiple levels of imports refers to.

Collaborator Author

File 1 imports file 2, which imports file 3...
That is hard to do if the imports are defined in some sort of top-level element.
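
A minimal sketch of the chained-import case described above, assuming the mixed-list layout seen in the example files, where each file is a list whose entries are either `import` or `SpecBlock`. The `resolve_imports` function, the file names, and the in-memory `files` dictionary (standing in for parsed YAML) are hypothetical and only illustrate the idea.

```python
def resolve_imports(name, files, _stack=()):
    """Flatten the blocks of `name`, following imports depth-first.

    `files` maps a file name to its already-parsed list of entries.
    """
    if name in _stack:
        raise ValueError("import cycle: " + " -> ".join(_stack + (name,)))
    blocks = []
    for entry in files[name]:
        if "import" in entry:
            # Imported blocks are inserted at the position of the import line.
            blocks.extend(resolve_imports(entry["import"], files, _stack + (name,)))
        else:
            blocks.append(entry)
    return blocks


files = {
    "file1.yaml": [{"import": "file2.yaml"}, {"SpecBlock": {"name": "campaign"}}],
    "file2.yaml": [{"import": "file3.yaml"}, {"SpecBlock": {"name": "step"}}],
    "file3.yaml": [{"SpecBlock": {"name": "group"}}],
}
names = [b["SpecBlock"]["name"] for b in resolve_imports("file1.yaml", files)]
print(names)  # ['group', 'step', 'campaign']
```

With a single top-level `imports` list per file, the same recursion would still work, but every imported file would need its own top-level element for its own imports.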

root: 'cm/dc2_test_med'
# Define the steps, their connections and override parameters as needed
child_config:
  step1:
Member

Is 'step1' here a reference or is it defining a name?

Collaborator Author

It is a reference. It could be
- step:
    spec_block: step1
    otherstuff: stuff
- step:
    spec_block: step2
    ...

child_config:
  step1:
    spec_block: dc2_step1
    child_config:
Member

I'm not sure I understand the schema of child_config; some of the child elements are whole steps and some are kwarg-like parameters? Those seem like different enough purposes that it might be clearer if they were separate dictionaries?

Collaborator Author

Yes, the hierarchy is pretty complicated. If you have a sense of how to split it in a way that might be clearer, we could have a go.

@@ -0,0 +1,9 @@
includeConfigs:
Contributor

Looks familiar!

# setup LSST env.
export WEEKLY='{lsst_version}'
source /cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/${WEEKLY}/loadLSST.bash
setup lsst_distrib
Contributor

Does manifest checking happen in this script or somewhere else?

Collaborator Author

This is the wrapper that sets up the env for that script. That script gets defined by the ManifestReportScriptHandler, here:

class ManifestReportScriptHandler(ScriptHandler):

includes: ['step']
data:
  pipeline_yaml: "${DRP_PIPE_DIR}/pipelines/LSSTCam-imSim/DRP-test-med-1.yaml#step1"
child_config:
Contributor

I see here that the groups are defined as the child_config of the step, and all the group information is below.

data:
  pipeline_yaml: "${DRP_PIPE_DIR}/pipelines/LSSTCam-imSim/DRP-test-med-1.yaml#step1"
child_config:
  spec_block: group
Contributor

What is the difference between SpecBlock and spec_block? I'm concerned about confusion in editing these configs.

Collaborator Author

SpecBlock defines a thing; spec_block refers to that thing by name.

Contributor

Would it perhaps be a little more clear to call spec_block something like parent_spec_block? Or am I misinterpreting the use of it?

Collaborator Author

"parent" is a little overloaded here, in that it also refers to the parent node in the hierarchy. Here and elsewhere, spec_block refers to "the SpecBlock used to build that child".
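
The define-versus-refer relationship discussed in this thread might be sketched like this. The registry dictionary and the `lookup_spec_block` helper are hypothetical, and the entries are trimmed-down stand-ins for the example files. Each `SpecBlock` entry registers a definition under its `name`, and a `spec_block` value is just a key into that registry.

```python
# Parsed entries, standing in for the contents of the specification files.
entries = [
    {"SpecBlock": {"name": "group", "data": {}}},
    {
        "SpecBlock": {
            "name": "dc2_step1",
            "includes": ["step"],
            "child_config": {"spec_block": "group"},
        }
    },
]

# SpecBlock defines: build a name -> definition registry.
registry = {e["SpecBlock"]["name"]: e["SpecBlock"] for e in entries}


def lookup_spec_block(ref, registry):
    """Resolve a spec_block reference to the SpecBlock it names."""
    try:
        return registry[ref]
    except KeyError:
        raise KeyError(f"spec_block {ref!r} does not match any SpecBlock name") from None


# spec_block refers: the step names the SpecBlock used to build its children.
child_ref = registry["dc2_step1"]["child_config"]["spec_block"]
child_template = lookup_spec_block(child_ref, registry)
print(child_template["name"])  # group
```

Failing loudly on an unknown reference is one way to limit the editing confusion raised above, since a mistyped `spec_block` name is reported as soon as the configuration is loaded.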

- SpecBlock:
    name: dc2_step1
    includes: ['step']
    data:
Contributor

What kind of category is data here? I see pipeline_yaml, but what other information might we want in data?

Collaborator Author

Anything that gets passed down the hierarchy that isn't a collection name template (those go in collections), configuration for how to build child nodes (that goes in child_config), or a renaming of which spec blocks to use (that goes in spec_aliases).

prerequisites: ['bps_report']
collections:
  run: "{job_run}"
data:
Contributor

How does rescue: false also fall under data?

Collaborator Author

We could choose to define a job that has rescue: true, which would use the same template but override just that one parameter.
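
The override described in this reply could look roughly like the sketch below. The `with_data_overrides` helper and the exact shape of `job_template` are assumptions made for illustration, loosely modelled on the snippet under review. The rescue job reuses the template and changes a single `data` entry.

```python
import copy


def with_data_overrides(template, **overrides):
    """Return a copy of `template` with selected `data` entries replaced."""
    job = copy.deepcopy(template)  # leave the shared template untouched
    job.setdefault("data", {}).update(overrides)
    return job


# Assumed template shape, not copied from the real specification file.
job_template = {
    "prerequisites": ["bps_report"],
    "collections": {"run": "{job_run}"},
    "data": {"rescue": False},
}

rescue_job = with_data_overrides(job_template, rescue=True)
print(rescue_job["data"])    # {'rescue': True}
print(job_template["data"])  # {'rescue': False}
```

Keeping `rescue` under `data` means it can be overridden by the same mechanism as any other parameter, without a dedicated field.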

  prerequisites: ['step1']
  child_config:
    base_query: "instrument='LSSTCam-imSim' and skymap='DC2' and tract in (3828, 3829)"
step3:
Contributor

This file all looks pretty familiar. I think it's decently straightforward. It's not clear what makes some things fall under child_config. I'm getting the impression that all the spec_blocks here are linked to config_dc2_steps.yaml, which seems good. What does the linking, and how does that work?

Collaborator Author

child_config is pretty much anything you need to know in order to make the child nodes.

The linking happens when you make the child nodes: as part of that process, you look up which spec blocks to use as templates for each child node. You can see an example of that around here:

for script_item in spec_block.scripts:

@eacharles
Collaborator Author

This has been overtaken by events and the comments have been incorporated into the design of the specification files.

@eacharles eacharles closed this Jan 18, 2024
@eacharles eacharles deleted the tickets/DM-41174 branch May 28, 2024 16:21
4 participants