Simplify SubmitterHTCondor
dachengx committed Sep 2, 2024
1 parent 1789e90 commit 0678e58
Showing 4 changed files with 212 additions and 297 deletions.
12 changes: 6 additions & 6 deletions alea/submitters/README.md
@@ -42,10 +42,10 @@ htcondor_configurations:
pegasus_transfer_threads: 4
max_jobs_to_combine: 100
singularity_image: "/cvmfs/singularity.opensciencegrid.org/xenonnt/montecarlo:2024.04.1"
wf_id: "lq_b8_cevns_30"
workflow_id: "lq_b8_cevns_30"
```
- `template_path`: where you put your input templates. Note that **all files have to have unique names**. All templates inside will be tarred and the tarball will be uploaded to the grid when computing.
-- `cluster_size`: clusters multiple `alea-run_toymc` jobs into a single job. For example, if you expect to run 100 individual `alea-run_toymc` jobs and you specify `cluster_size: 10`, there will be only 10 jobs in the end, each running 10 `alea-run_toymc` computations in sequence. Unless you have a huge number of jobs (>200), I don't recommend changing it from 1.
+- `cluster_size`: clusters multiple `alea_run_toymc` jobs into a single job. For example, if you expect to run 100 individual `alea_run_toymc` jobs and you specify `cluster_size: 10`, there will be only 10 jobs in the end, each running 10 `alea_run_toymc` computations in sequence. Unless you have a huge number of jobs (>200), I don't recommend changing it from 1.
- `request_cpus`: number of CPUs for each job. It should be larger than alea's maximum multi-threading number; otherwise OSG will complain.
- `request_memory`: requested memory for each job, in units of MB. Please don't request more than you need, because it will significantly reduce our available slots.
- `request_disk`: requested disk for each job, in units of KB. Please don't request more than you need, because it will significantly reduce our available slots.
@@ -56,11 +56,11 @@ htcondor_configurations:
- `pegasus_transfer_threads`: number of threads for transfers handled by `Pegasus`. The default of 4 is good, so in most cases you want to keep it.
- `max_jobs_to_combine`: number of toymc jobs to combine when concluding. Be cautious about putting a number larger than 200 here, since it might be too risky.
- `singularity_image`: the jobs will run in this singularity image.
-- `wf_id`: a name of your choice for this workflow. If not specified, the datetime will be used as the `wf_id`. A full example follows this list.
+- `workflow_id`: a name of your choice for this workflow. If not specified, the datetime will be used as the `workflow_id`. A full example follows this list.


### Usage
-Make sure your running config is set up well; then simply pass `--htcondor` to your `alea-submission` command (see the sketch below).
+Make sure your running config is set up well; then simply pass `--htcondor` to your `alea_submission` command (see the sketch below).
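
A minimal sketch of the invocation, assuming your usual `alea_submission` arguments stay unchanged; `your_running_config.yaml` is a hypothetical filename:

```
alea_submission your_running_config.yaml --htcondor
```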

At the end of its output, it should give you something like this:
```
@@ -101,11 +101,11 @@
pegasus-run /scratch/yuanlq/workflows/runs/lq_b8_cevns_30
```
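
Besides `pegasus-run`, Pegasus's generic monitoring tools accept the same run directory; this is standard Pegasus usage rather than anything alea-specific:

```
# Summarize the current state of the workflow's jobs
pegasus-status /scratch/yuanlq/workflows/runs/lq_b8_cevns_30

# Diagnose failed or held jobs after (or during) a run
pegasus-analyzer /scratch/yuanlq/workflows/runs/lq_b8_cevns_30
```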

To collect the final outputs, there are two ways:
-- Check your folder `/scratch/$USER/workflows/outputs/<wf_id>/`. There should be a single tarball containing all toymc files and computation results.
+- Check your folder `/scratch/$USER/workflows/outputs/<workflow_id>/`. There should be a single tarball containing all toymc files and computation results.
- A redundant way is to fetch the files from dCache, which you have to access with `gfal` commands (see the consolidated sketch below). For example, `gfal-ls davs://xenon-gridftp.grid.uchicago.edu:2880/xenon/scratch/yuanlq/lq_b8_cevns_30/` lists the workflow's scratch area, and `gfal-ls davs://xenon-gridftp.grid.uchicago.edu:2880/xenon/scratch/yuanlq/lq_b8_cevns_30/00/00/` shows the files, which include both the final tarball and all `.h5` files before tarballing. To fetch them, do something like `gfal-copy davs://xenon-gridftp.grid.uchicago.edu:2880/xenon/scratch/yuanlq/lq_b8_cevns_30/00/00/lq_b8_cevns_30-combined_output.tar.gz . -t 7200`. Note that these commands also work on Midway/DaLI.
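
The same dCache sequence as a copy-pastable sketch (URLs are taken verbatim from the example above; substitute your own username and `workflow_id`):

```
# List the scratch area for this workflow on dCache
gfal-ls davs://xenon-gridftp.grid.uchicago.edu:2880/xenon/scratch/yuanlq/lq_b8_cevns_30/

# List the output subdirectory (final tarball plus pre-tarball .h5 files)
gfal-ls davs://xenon-gridftp.grid.uchicago.edu:2880/xenon/scratch/yuanlq/lq_b8_cevns_30/00/00/

# Copy the combined tarball locally, with a 7200-second transfer timeout
gfal-copy davs://xenon-gridftp.grid.uchicago.edu:2880/xenon/scratch/yuanlq/lq_b8_cevns_30/00/00/lq_b8_cevns_30-combined_output.tar.gz . -t 7200
```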

### Example Workflow
In the workflow graph below, we only care about the purple jobs; the rest are generated by `Pegasus`.
-- Each individual `run_toymc_wrapper` job computes `alea-run_toymc`. For details on what it does, see `run_toymc_wrapper.sh`.
+- Each individual `run_toymc_wrapper` job computes `alea_run_toymc`. For details on what it does, see `run_toymc_wrapper.sh`.
- The `combine` job collects all outputs from the `run_toymc_wrapper` jobs and tars them into a single tarball as the final output.
<img width="1607" alt="Screen Shot 2024-05-08 at 5 24 51 PM" src="https://github.com/FaroutYLq/alea/assets/47046530/b1136330-2701-4538-b03c-8506383e4e20">
4 changes: 2 additions & 2 deletions alea/submitters/combine.sh
@@ -3,14 +3,14 @@
set -e

# Extract the arguments
-wf_id=$1
+workflow_id=$1

# Sanity check: these are the files in the current directory
ls -lh

# Make output filename
# This file will be used to store the output of the workflow
-output_filename=$wf_id-combined_output.tar.gz
+output_filename=$workflow_id-combined_output.tar.gz

# Tar all the .h5 files into the output file
tar -czf $output_filename *.h5
