Make a first-draft of the README

timcallow · Dec 17, 2024 · 1bfaf7b
1 parent 45bf7c0
commit 1bfaf7b
Showing 1 changed file with 57 additions and 45 deletions.
diff --git a/README.md b/README.md
@@ -1,56 +1,68 @@
-# DataLad extension template
+# datalad-slurm: A DataLad extension for HPC (slurm) systems
 
 [![Build status](https://ci.appveyor.com/api/projects/status/g9von5wtpoidcecy/branch/main?svg=true)](https://ci.appveyor.com/project/mih/datalad-extension-template/branch/main) [![codecov.io](https://codecov.io/github/datalad/datalad-extension-template/coverage.svg?branch=main)](https://codecov.io/github/datalad/datalad-extension-template?branch=main) [![crippled-filesystems](https://github.com/datalad/datalad-extension-template/workflows/crippled-filesystems/badge.svg)](https://github.com/datalad/datalad-extension-template/actions?query=workflow%3Acrippled-filesystems) [![docs](https://github.com/datalad/datalad-extension-template/workflows/docs/badge.svg)](https://github.com/datalad/datalad-extension-template/actions?query=workflow%3Adocs)
 
 
-This repository contains an extension template that can serve as a starting point
-for implementing a [DataLad](http://datalad.org) extension. An extension can
-provide any number of additional DataLad commands that are automatically
-included in DataLad's command line and Python API.
+`datalad-slurm` is an extension to the [DataLad](http://datalad.org) package for high-performance computing (HPC) systems, specifically those running slurm.
 
-For a demo, clone this repository and install the demo extension via
+DataLad is a package that facilitates adherence to the [FAIR](https://www.nature.com/articles/sdata201618) research data management principles.
+
+`datalad-slurm` sits on top of the main DataLad package, and it is designed to improve the DataLad workflow on HPC systems. The package is aimed at slurm systems due to the prominence of slurm in HPC settings, but in the future it may be extended to HPC systems more generally. 
+
+`datalad-slurm` makes it easier for users to manage their research data on HPC systems with DataLad, and also addresses the following problems that arise when using DataLad on HPC systems:
+
+- **Inefficient** sequential sections in highly parallel HPC jobs
+- **Critical** race conditions between git commands in concurrent jobs
+
+## Installation
+
+First, install the main [DataLad](http://datalad.org) package and its dependencies.
+
+Then, clone this repository and install the extension with:
 
     pip install -e .
 
-DataLad will now expose a new command suite with a `hello...` command.
-
-    % datalad --help |grep -B2 -A2 hello
-    *Demo DataLad command suite*
-
-      hello-cmd
-          Short description of the command
-
-To start implementing your own extension, [use this
-template](https://github.com/datalad/datalad-extension-template/generate), and
-adjust as necessary. A good approach is to
-
-- Pick a name for the new extension.
-- Look through the sources and replace `helloworld` with
-  `<newname>` (hint: `git grep helloworld` should find all
-  spots).
-- Delete the example command implementation in `datalad_helloworld/hello_cmd.py`.
-- Implement a new command, and adjust the `command_suite` in
-  `datalad_helloworld/__init__.py` to point to it.
-- Replace `hello_cmd` with the name of the new command in
-  `datalad_helloworld/tests/test_register.py` to automatically test whether the
-  new extension installs correctly.
-- Adjust the documentation in `docs/source/index.rst`. Refer to [`docs/README.md`](docs/README.md) for more information on documentation building, testing and publishing.
-- Replace this README, and/or update the links in the badges at the top.
-- Update `setup.cfg` with appropriate metadata on the new extension.
-- Generate GitHub labels for use by the "Add changelog.d snippet" and
-  "Auto-release on PR merge" workflows by using the code in the
-  `datalad/release-action` repository [as described in its
-  README](https://github.com/datalad/release-action#command-labels).
-
-You can consider filling in the provided [.zenodo.json](.zenodo.json) file with
-contributor information and [meta data](https://developers.zenodo.org/#representation)
-to acknowledge contributors and describe the publication record that is created when
-[you make your code citeable](https://guides.github.com/activities/citable-code/)
-by archiving it using [zenodo.org](https://zenodo.org/). You may also want to
-consider acknowledging contributors with the
-[allcontributors bot](https://allcontributors.org/docs/en/bot/overview).
-
-# Contributing
+## Example usage
+
+To **schedule** a slurm script:
+
+    datalad schedule --output=<output_files_or_dir> <slurm_submission_command>
+
+where `<output_files_or_dir>` are the expected outputs from the job, and `<slurm_submission_command>` is, for example, `sbatch submit_script`. Further optional command-line arguments can be found in the documentation.
+
+To **finish** (i.e. post-process) a job that was previously scheduled and has since finished:
+
+    datalad finish <commit_hash>
+
+where `<commit_hash>` is the commit hash of the previously scheduled job. Alternatively, to post-process all scheduled jobs, or all scheduled jobs since a certain commit, one can run
+
+    datalad finish
+or
+
+    datalad finish --since=<since_commit_hash>
+
+where `<since_commit_hash>` is the hash of the commit that precedes all the `git log` entries you want to consider.
+
+`datalad-slurm` will flag an error for any job that could not be post-processed, either because it is still running or because it failed.
+
+To **reschedule** a previously scheduled job:
+
+    datalad reschedule <schedule_commit_hash>
+
+where `<schedule_commit_hash>` is the commit hash of the previously scheduled job. The original `datalad schedule` must already have a corresponding `datalad finish`; otherwise `datalad reschedule` will throw an error.
+
+In the lingo of the original DataLad package, `datalad schedule` followed by `datalad finish` is similar to `datalad run`, and `datalad reschedule` followed by `datalad finish` is similar to `datalad rerun`. One important difference is that the `datalad-slurm` commands always produce a pair of commits in the git history, whereas `datalad run` produces just one commit, and `datalad rerun` produces one or no commits, depending on whether the outputs have changed.
+
+The git history might look a bit like this after running a few of these commands:
+
+    4992a23 [DATALAD FINISH] Processed batch job 9166291: Complete
+    732264c [DATALAD RESCHEDULE] Submitted batch job 9166291: Pending
+    0c982f3 [DATALAD FINISH] Processed batch job 9163380: Complete
+    4c021f4 [DATALAD SCHEDULE] Submitted batch job 9163380: Pending
+
+## Contributing
+
+The `datalad-slurm` extension is still in the very early stages of development. We welcome contributors and testers of the package. Please report any issues on GitHub and we will try to resolve them.
 
 See [CONTRIBUTING.md](CONTRIBUTING.md) if you are interested in internals or
 contributing to the project.
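
The example history in the new README pairs each `[DATALAD SCHEDULE]` or `[DATALAD RESCHEDULE]` commit with a later `[DATALAD FINISH]` commit for the same job ID. The sketch below shows one way a reader might check that pairing from `git log --oneline` output. It assumes commit subjects follow exactly the format of the four example lines; the real messages written by `datalad-slurm` may differ. The name `pair_jobs` and the regular expression are hypothetical and not part of the extension.

```python
import re

# Assumed subject format, copied from the README's example history:
#   <hash> [DATALAD <ACTION>] <Submitted|Processed> batch job <id>: <status>
LINE_RE = re.compile(
    r"^(?P<commit>[0-9a-f]+) \[DATALAD (?P<action>SCHEDULE|RESCHEDULE|FINISH)\] "
    r"\w+ batch job (?P<job_id>\d+): (?P<status>\w+)$"
)


def pair_jobs(log_lines):
    """Map each job ID to its submitting commit and its finish commit (or None).

    `log_lines` is expected newest-first, as `git log --oneline` prints it.
    """
    jobs = {}
    for line in reversed(log_lines):  # walk from oldest to newest
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a commit in the assumed datalad-slurm format
        job_id = match["job_id"]
        if match["action"] == "FINISH":
            # Only record a finish for a job whose submission was seen earlier.
            if job_id in jobs:
                jobs[job_id]["finished_by"] = match["commit"]
        else:
            jobs[job_id] = {
                "action": match["action"],
                "submitted_by": match["commit"],
                "finished_by": None,
            }
    return jobs


# The four lines from the README's example history.
history = [
    "4992a23 [DATALAD FINISH] Processed batch job 9166291: Complete",
    "732264c [DATALAD RESCHEDULE] Submitted batch job 9166291: Pending",
    "0c982f3 [DATALAD FINISH] Processed batch job 9163380: Complete",
    "4c021f4 [DATALAD SCHEDULE] Submitted batch job 9163380: Pending",
]

for job_id, info in pair_jobs(history).items():
    print(job_id, info["action"], info["submitted_by"], "->", info["finished_by"])
```

A job whose `finished_by` entry is still `None` has been submitted but not yet post-processed, which is the situation in which the README says `datalad reschedule` will throw an error.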