feat: full_chain_test.py config with command-line options #3811

Open
wants to merge 15 commits into
base: main

timadye
Contributor

@timadye timadye commented Nov 3, 2024

full_chain_test.py supports all options from ckf_tracks.py (generic detector), full_chain_odd.py, and full_chain_itk.py, as well as some new additions to help with testing.

The idea is to complement those scripts which provide examples, where full_chain_test.py simplifies running interactive/batch tests with different options specified on the command-line. With all the options, it is not intended to be an easy-to-read example. The existing example scripts could eventually be simplified, since they no longer need to support those test options they do have.

The hope is that, in future, full_chain_test.py can be used to test new features and see the result in (at least) ODD and ITk environments.

The idea for this initial version is:

  • full_chain_test.py exactly matches full_chain_odd.py
    • ODD configuration is the default
  • full_chain_test.py -A -M1 -N2 exactly matches full_chain_itk.py
    • -A selects the ATLAS ITk configuration
    • -M1 -N2 (or --gen-nvertices 1 --gen-nparticles 2) changes to 2 particles per event from full_chain_odd.py's default of 800 (-M 200 -N 4)
    • full_chain_test.py uses writeCovMat=True, whereas full_chain_itk.py doesn't.
  • full_chain_test.py -G is similar to ckf_tracks.py
    • -G selects the generic detector configuration
    • does not exactly match the more rudimentary ckf_tracks.py
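
To make the option mapping above concrete, here is a minimal sketch of how these four options could be parsed with argparse. It is not the code in this PR: the short flags and the --gen-nvertices/--gen-nparticles names come from the description, while pairing -A with --itk, the long name --generic-detector, the mutual exclusion of -A and -G, and the helper function names are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser for a few options named in this PR (not the real script)."""
    parser = argparse.ArgumentParser(prog="full_chain_test.py")

    # Assumption: the detector selectors exclude each other; ODD is the default.
    detector = parser.add_mutually_exclusive_group()
    # -A is from the PR description; pairing it with --itk is an assumption.
    detector.add_argument("-A", "--itk", action="store_true",
                          help="ATLAS ITk configuration")
    # -G is from the PR description; the long name is hypothetical.
    detector.add_argument("-G", "--generic-detector", action="store_true",
                          help="generic detector configuration")

    # Defaults follow the description: full_chain_odd.py uses -M 200 -N 4.
    parser.add_argument("-M", "--gen-nvertices", type=int, default=200)
    parser.add_argument("-N", "--gen-nparticles", type=int, default=4)
    return parser


def detector_name(args: argparse.Namespace) -> str:
    """Map the parsed selector flags to a detector label."""
    if args.itk:
        return "ITk"
    if args.generic_detector:
        return "generic"
    return "ODD"


def particles_per_event(args: argparse.Namespace) -> int:
    """Vertices per event times particles per vertex."""
    return args.gen_nvertices * args.gen_nparticles


default_args = build_parser().parse_args([])
itk_args = build_parser().parse_args(["-A", "-M1", "-N2"])

print(detector_name(default_args), particles_per_event(default_args))  # ODD 800
print(detector_name(itk_args), particles_per_event(itk_args))          # ITk 2
```

With no arguments the sketch selects ODD with 200 × 4 = 800 particles per event; with -A -M1 -N2 it selects ITk with 1 × 2 = 2, matching the two cases listed above.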

Later, if this becomes a useful development test, we can harmonise the detail setup between ODD and ITk.

Also, do we need to support options from full_chain_itk_Gbts.py and full_chain_odd_LRT.py?

@github-actions github-actions bot added the Component - Examples Affects the Examples module label Nov 3, 2024

github-actions bot commented Nov 3, 2024

📊: Physics performance monitoring for f7194cc


coderabbitai bot commented Nov 5, 2024

Important

Review skipped

Auto reviews are limited to specific labels.

🏷️ Labels to auto review (1)
  • coderabbit

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@timadye timadye marked this pull request as ready for review November 5, 2024 15:56
@andiwand
Contributor

andiwand commented Nov 5, 2024

Thinking longer term I believe these full_chain_*.py should all go away and be replaced with a physmon workflow. If that is not possible (for example ITk is not part of ACTS) it should be completely removed IMO.

If we cannot validate the output of these chains they are just another unmonitored script. To try new algorithms and configurations one can modify a standard chain and if that workflow solidifies one can add a physmon workflow for this.

@timadye
Contributor Author

timadye commented Nov 5, 2024

We discussed this at some length at the Developers' meeting, with a variety of opinions.

> Thinking longer term I believe these full_chain_*.py should all go away and be replaced with a physmon workflow. If that is not possible (for example ITk is not part of ACTS) it should be completely removed IMO.

I think these serve different use cases from physmon, but also this new full_chain_test.py is different from the existing full_chain_*.py scripts. Maybe that could be clearer, eg. with a new name or location? Maybe the current location is best until we can simplify the other scripts.

full_chain_itk.py was initially intended as an example that people could use to help set up their own scripts, but it has developed into a test platform for new features, and rather obscures the simple use case. full_chain_odd.py also added command-line options.

full_chain_test.py is intended to take over the interactive testing uses from the other scripts. Later, we can simplify the other scripts back to proper examples.

There is a real need for a command to perform interactive testing with simple command-line options. The evidence being how these scripts keep popping up (and not just written by me 😄). That may not be the way everyone works, but many of us do. I used my private version of this script to do all the tests on your recent CKF PRs - and I hope that was useful. Having private versions of that script is really inconvenient as people have to develop them independently, and keep them up to date.

> If we cannot validate the output of these chains they are just another unmonitored script. To try new algorithms and configurations one can modify a standard chain and if that workflow solidifies one can add a physmon workflow for this.

Since this is a developer testing script, we'll have to provide it as-is with best-effort support. If it breaks when testing, then you fix it when you add/fix the feature you are testing.

I am thinking of adding a big disclaimer to the top of the script and in the help text, eg.

> This script is provided for interactive developer testing only. It is not intended (and not supported) for end user use, automated testing, and certainly should never be called in production. The Python API is the proper way to access the ActsExamples from scripts. The other Examples/Scripts are much better examples of how to do that. physmon in the CI is the proper way to do automated integration tests. This script is only for the case of interactive testing with one-off configuration specified by command-line options.
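
As a rough illustration of putting such a disclaimer into the help text, the sketch below passes a condensed version of the wording to argparse as the parser description, so it is shown in the --help output. The constant and function names are hypothetical, and the condensed wording is only a placeholder for whatever text is finally agreed.

```python
import argparse

# Condensed from the disclaimer proposed above; the final wording may differ.
DISCLAIMER = (
    "This script is provided for interactive developer testing only. "
    "It is not intended (and not supported) for end user use, automated "
    "testing, or production."
)


def build_parser_with_disclaimer() -> argparse.ArgumentParser:
    # argparse prints `description` between the usage line and the option list.
    return argparse.ArgumentParser(
        prog="full_chain_test.py",
        description=DISCLAIMER,
    )


# format_help() returns the same text that --help would print.
help_text = build_parser_with_disclaimer().format_help()
print(help_text)
```

Keeping the text in one constant means the same string could also be reused in a comment block or docstring at the top of the script, so the two copies cannot drift apart.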

@andiwand
Contributor

andiwand commented Nov 5, 2024

> I think these serve different use cases from physmon, but also this new full_chain_test.py is different from the existing full_chain_*.py scripts. Maybe that could be clearer, eg. with a new name or location? Maybe the current location is best until we can simplify the other scripts.

I see that but I would argue that we should also delete all other full_chain_*.py in favor of this one. If this is supposed to be less confusing and better supported, there should not be a second and third version of it.

> There is a real need for a command to perform interactive testing with simple command-line options. The evidence being how these scripts keep popping up (and not just written by me 😄). That may not be the way everyone works, but many of us do. I used my private version of this script to do all the tests on your recent CKF PRs - and I hope that was useful. Having private versions of that script is really inconvenient as people have to develop them independently, and keep them up to date.

We have already had cases where people depended on the full_chain_*.py scripts, which changed unnoticed and silently broke their workflow; they only discovered it hours or days afterwards. I think the correct thing to do is to copy the chain and edit it to suit one's own needs.


Overall I believe we are trying to oversimplify something which is inherently not simple and we just don't have the right abstraction for this right now. One reconstruction chain fulfills one purpose and we modify this by switching stuff in and out with numerous CLI args which cannot be tested and is ultimately bound to break.

IMO modifying a chain is easiest if it is not overloaded with 100 CLI flags so you can just drop things out and copy something in without wiring another flag that potentially conflicts with 10 others. So I would argue the proposed script is not a solution to give a template for later to be modified chains.

As for the interactive testing I can see that this is useful as you can simply play around with the reconstruction chain without knowing all the inner workings of ACTS and its Examples. But I would see this strictly for educational purposes or quick tests/checks for people with experience.

@timadye
Contributor Author

timadye commented Nov 6, 2024

If I understand you correctly, I think you might be confusing the purpose of this script, and what I understood was the original purpose of the existing scripts that I hope we can eventually restore.

full_chain_odd.py etc were originally supposed to be simple examples that users can take and modify for their own purposes. They have grown with the addition of inline or command-line options to provide different tests. I would like to remove this test functionality into a single script (full_chain_test.py) and restore the other full_chain_*.py to simple examples. I thought to do the full_chain_odd.py etc simplification in another PR, assuming people were happy using full_chain_test.py.

full_chain_test.py is intended for interactive developer testing. This can be done at the moment with full_chain_odd.py, but not so easily with full_chain_itk.py.

> IMO modifying a chain is easiest if it is not overloaded with 100 CLI flags so you can just drop things out and copy something in without wiring another flag that potentially conflicts with 10 others. So I would argue the proposed script is not a solution to give a template for later to be modified chains.

It's 32 flags at the moment, and most of them are generic flags for changing the simulated sample (eg. --events 10 --ttbar-pu200), the detector configuration (eg. --itk --bf-constant), the algorithms run (eg. --seeding-algorithm Orthogonal --no-ckf), and how the job is run (eg. --output-dir test3 --loglevel DEBUG). None of these are for turning on and off particular details of the reconstruction, but rather setting up the particular test you want to run.

For most use cases, it would not be necessary to wire in another CLI flag to perform the test. But it is useful to test with different single particle momenta, or with ttbar PU200, to run with different numbers of events, to try with ODD or ITk, to send the output to different directories so as not to overwrite previous tests. Without CLI flags, doing more than one interactive test becomes difficult to manage.
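
For illustration, the four kinds of flag mentioned above could be organised with argparse argument groups roughly as follows. This is only a sketch under stated assumptions: the option names are the ones quoted in this comment, but the group titles, value types, defaults, and the function name are guesses and are not taken from the script in this PR.

```python
import argparse
from pathlib import Path


def build_test_parser() -> argparse.ArgumentParser:
    """Illustrative grouping of the example flags (types and defaults assumed)."""
    parser = argparse.ArgumentParser(prog="full_chain_test.py")

    sample = parser.add_argument_group("simulated sample")
    sample.add_argument("--events", type=int, default=100)  # default is an assumption
    sample.add_argument("--ttbar-pu200", action="store_true")

    detector = parser.add_argument_group("detector configuration")
    detector.add_argument("--itk", action="store_true")
    detector.add_argument("--bf-constant", action="store_true")

    algorithms = parser.add_argument_group("algorithms run")
    # No default is assumed here; None means "use whatever the chain picks".
    algorithms.add_argument("--seeding-algorithm", default=None)
    algorithms.add_argument("--no-ckf", action="store_true")

    job = parser.add_argument_group("how the job is run")
    # Default output location is an assumption.
    job.add_argument("--output-dir", type=Path, default=Path.cwd())
    job.add_argument("--loglevel", default="INFO")
    return parser


# Two interactive tests that differ only in sample and output directory.
first = build_test_parser().parse_args(
    "--events 10 --ttbar-pu200 --output-dir test3 --loglevel DEBUG".split()
)
second = build_test_parser().parse_args(
    "--events 10 --seeding-algorithm Orthogonal --output-dir test4".split()
)
print(first.output_dir, second.output_dir)
```

Because each run names its own output directory, the results of the second test do not overwrite those of the first.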

Sometimes it might be useful to have a CLI flag to turn on and off a feature so one can run with and without the new feature. But that would be a temporary hack for that set of tests. I wondered whether it would sometimes be useful to commit the new CLI flag to GitHub, but your comment convinces me that this would be a bad idea in the long term. So let's not do that.

@andiwand
Contributor

andiwand commented Nov 6, 2024

> If I understand you correctly, I think you might be confusing the purpose of this script, and what I understood was the original purpose of the existing scripts that I hope we can eventually restore.
>
> full_chain_odd.py etc were originally supposed to be simple examples that users can take and modify for their own purposes. They have grown with the addition of inline or command-line options to provide different tests. I would like to remove this test functionality into a single script (full_chain_test.py) and restore the other full_chain_*.py to simple examples. I thought to do the full_chain_odd.py etc simplification in another PR, assuming people were happy using full_chain_test.py.
>
> full_chain_test.py is intended for interactive developer testing. This can be done at the moment with full_chain_odd.py, but not so easily with full_chain_itk.py.

I see that but what I cannot see is how the situation is improved by keeping full_chain_odd.py and full_chain_itk.py as they are in this PR. There is just another chain with numerous CLI flags and nobody knows what to start with.

IMO to move forward with this the PR should also include the proposed changes to the other chains.

> It's 32 flags at the moment, and most of them are generic flags for changing the simulated sample (eg. --events 10 --ttbar-pu200), the detector configuration (eg. --itk --bf-constant), the algorithms run (eg. --seeding-algorithm Orthogonal --no-ckf), and how the job is run (eg. --output-dir test3 --loglevel DEBUG). None of these are for turning on and off particular details of the reconstruction, but rather setting up the particular test you want to run.

To me choosing a seeding algorithm and toggling the CKF is turning on and off particular details of the reconstruction. At the same time there are 3 ambi solvers to choose from. I don't think it is trivial to guarantee that all the combinations work (for example all seeders with/without CKF with ambi solver X).

This is where I think the script does not offer any benefit other than educational purposes, which is a valid one. It is way too complicated to wire in another flag because it can easily break the script with specific options.

@timadye
Contributor Author

timadye commented Nov 6, 2024

> I see that but what I cannot see is how the situation is improved by keeping full_chain_odd.py and full_chain_itk.py as they are in this PR. There is just another chain with numerous CLI flags and nobody knows what to start with.

Those are useful as easy-to-read examples for how to write your own script.

> IMO to move forward with this the PR should also include the proposed changes to the other chains.

OK, I can make the simpler versions of these scripts and include it with this PR.

> To me choosing a seeding algorithm and toggling the CKF is turning on and off particular details of the reconstruction. At the same time there are 3 ambi solvers to choose from. I don't think it is trivial to guarantee that all the combinations work (for example all seeders with/without CKF with ambi solver X).

Specifying --seeding-algorithm Orthogonal is useful to test that algorithm. This option is mostly just passed onto reconstruction.py, so the complexity isn't here. Specifying --no-ckf (which disables the downstream ambi and vertexing) is useful if you are debugging the seeding and running the CKF just slows you down. Once you are done debugging, you should run again with the full chain to see the performance effect.
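
The effect of --no-ckf described here can be sketched as a simple gate on the downstream stages. The function and the stage labels below are purely illustrative; they are not ACTS API calls and do not reproduce how the script in this PR wires the chain.

```python
def enabled_stages(no_ckf: bool) -> list[str]:
    """Return the reconstruction stages that would run (labels are illustrative)."""
    stages = ["seeding"]
    if not no_ckf:
        # Ambiguity resolution and vertexing consume CKF tracks,
        # so they only make sense when the CKF itself runs.
        stages += ["ckf", "ambiguity resolution", "vertexing"]
    return stages


print(enabled_stages(no_ckf=True))   # ['seeding']
print(enabled_stages(no_ckf=False))  # ['seeding', 'ckf', 'ambiguity resolution', 'vertexing']
```

Tying the downstream stages to the single flag keeps a seeding-only debug run fast, while the default still exercises the full chain.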

Your point may perhaps apply better to the --simple-ckf option. I can remove that if you think it would improve things.

As I said before, it isn't necessary to test all the options (or any). This is a developer tool. If something doesn't work for a developer, they can fix it. As I said in my proposed disclaimer, this should never be run in the CI, let alone production.

> This is where I think the script does not offer any benefit other than educational purposes, which is a valid one.

No, that is the purpose of the (simplified) full_chain_odd.py.

> It is way too complicated to wire in another flag because it can easily break the script with specific options.

As I said, wiring in another flag isn't the primary use case. If it's too complicated for someone, then they won't do it.


sonarcloud bot commented Nov 7, 2024

@paulgessinger
Member

I really don't understand how physmon can fail here...
