Merge pull request #27 from timcallow/fix_broken_docs
Fix broken docs build
timcallow authored Jan 7, 2025
2 parents e6edcad + 60917d5 commit c51616b
Showing 5 changed files with 73 additions and 15 deletions.
8 changes: 4 additions & 4 deletions docs/Makefile
@@ -87,9 +87,9 @@ qthelp:
@echo
@echo "Build finished; now you can run "qcollectiongenerator" with the" \
".qhcp project file in $(BUILDDIR)/qthelp, like this:"
@echo "# qcollectiongenerator $(BUILDDIR)/qthelp/datalad_helloworld.qhcp"
@echo "# qcollectiongenerator $(BUILDDIR)/qthelp/datalad_slurm.qhcp"
@echo "To view the help file:"
@echo "# assistant -collectionFile $(BUILDDIR)/qthelp/datalad_helloworld.qhc"
@echo "# assistant -collectionFile $(BUILDDIR)/qthelp/datalad_slurm.qhc"

applehelp:
$(SPHINXBUILD) -b applehelp $(ALLSPHINXOPTS) $(BUILDDIR)/applehelp
@@ -104,8 +104,8 @@ devhelp:
@echo
@echo "Build finished."
@echo "To view the help file:"
@echo "# mkdir -p $$HOME/.local/share/devhelp/datalad_helloworld"
@echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/datalad_helloworld"
@echo "# mkdir -p $$HOME/.local/share/devhelp/datalad_slurm"
@echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/datalad_slurm"
@echo "# devhelp"

epub:
4 changes: 3 additions & 1 deletion docs/source/cli_reference.rst
@@ -4,4 +4,6 @@ Command line reference
.. toctree::
:maxdepth: 1

generated/man/datalad-hello-cmd
generated/man/datalad-schedule
generated/man/datalad-finish
generated/man/datalad-reschedule
10 changes: 5 additions & 5 deletions docs/source/conf.py
@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
#
# datalad_helloworld documentation build configuration file, created by
# datalad_slurm documentation build configuration file, created by
# sphinx-quickstart on Tue Oct 13 08:41:19 2015.
#
# This file is execfile()d with the current directory set to its
@@ -24,7 +24,7 @@
)
from os import pardir

import datalad_helloworld
import datalad_slurm

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
@@ -40,7 +40,7 @@
try:
subprocess.run(
args=[setup_py_path, 'build_manpage',
'--cmdsuite', 'datalad_helloworld:command_suite',
'--cmdsuite', 'datalad_slurm:command_suite',
'--manpath', abspath(opj(
dirname(setup_py_path), 'build', 'man')),
'--rstpath', opj(dirname(__file__), 'generated', 'man'),
@@ -89,14 +89,14 @@
master_doc = 'index'

# General information about the project.
project = u'Datalad Extension Template'
project = u'datalad-slurm extension'
copyright = u'2018-{}, DataLad team'.format(datetime.datetime.now().year)
author = u'DataLad team'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
version = datalad_helloworld.__version__
version = datalad_slurm.__version__
release = version

# The language for content autogenerated by Sphinx. Refer to documentation
62 changes: 58 additions & 4 deletions docs/source/index.rst
@@ -1,9 +1,63 @@
DataLad extension template
**************************
datalad-slurm: A DataLad extension for HPC (slurm) systems
**********************************************************

This is a template for creating a `DataLad <http://datalad.org>`__ extension
that equips DataLad with additional functionality.
``datalad-slurm`` is an extension to the `DataLad <http://datalad.org>`_ package for high-performance computing (HPC), specifically slurm systems.

DataLad is a package which facilitates adherence to the `FAIR <https://www.nature.com/articles/sdata201618>`_ research data management principles.

``datalad-slurm`` sits on top of the main DataLad package, and it is designed to improve the DataLad workflow on HPC systems. The package is aimed at slurm systems due to the prominence of slurm in HPC settings, but in the future it may be extended to HPC systems more generally.

``datalad-slurm`` makes it easier for users to manage their research data on HPC systems with DataLad. It also addresses two problems that arise when DataLad is used on HPC systems:

* **Inefficient** sequential sections in highly parallel HPC jobs
* **Critical** race conditions between git commands in concurrent jobs

Installation
------------
First, install the main `DataLad <http://datalad.org>`_ package and its dependencies.

Then, clone this repository and install the extension with::

pip install -e .

Example usage
-------------
To **schedule** a slurm script::

datalad schedule --output=<output_files_or_dir> <slurm_submission_command>

where ``<output_files_or_dir>`` are the expected outputs from the job, and ``<slurm_submission_command>`` is for example ``sbatch submit_script``. Further optional command line arguments can be found in the documentation.

To **finish** (i.e. post-process) a job that was previously scheduled and has since finished::

datalad finish <commit_hash>

where ``<commit_hash>`` is the commit hash of the previously scheduled job. Alternatively, to post-process all scheduled jobs, or all scheduled jobs since a certain commit, one can run::

datalad finish

or::

datalad finish --since=<since_commit_hash>

where ``<since_commit_hash>`` is the commit hash before all the entries in the ``git log`` that you want to consider.

``datalad-slurm`` will flag an error for any job that could not be post-processed, either because it is still running or because it failed.

To **reschedule** a previously scheduled job::

datalad reschedule <schedule_commit_hash>

where ``<schedule_commit_hash>`` is the commit hash of the previously scheduled job. The original ``datalad schedule`` must already have a corresponding ``datalad finish``; otherwise ``datalad reschedule`` will throw an error.
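
The three commands above can also be assembled programmatically, for example from a workflow script. The following is a minimal sketch, not part of ``datalad-slurm``: the helper names (``schedule_args``, ``finish_args``, ``reschedule_args``) are hypothetical, and only the ``--output`` and ``--since`` options shown above are used. It treats a single commit hash and ``--since`` as alternatives, which is an assumption based on how the text presents them.

```python
import shlex
from typing import List, Optional


def schedule_args(output: str, submission_command: str) -> List[str]:
    """Arguments for: datalad schedule --output=<output> <submission command>."""
    # The submission command (e.g. "sbatch submit_script") is split into
    # separate arguments, as the shell would do when it is typed unquoted.
    return ["datalad", "schedule", f"--output={output}",
            *shlex.split(submission_command)]


def finish_args(commit_hash: Optional[str] = None,
                since: Optional[str] = None) -> List[str]:
    """Arguments for: datalad finish [<commit_hash> | --since=<hash>]."""
    args = ["datalad", "finish"]
    if since is not None:
        # Assumption of this sketch: --since takes precedence over a single hash.
        args.append(f"--since={since}")
    elif commit_hash is not None:
        args.append(commit_hash)
    return args


def reschedule_args(schedule_commit_hash: str) -> List[str]:
    """Arguments for: datalad reschedule <schedule_commit_hash>."""
    return ["datalad", "reschedule", schedule_commit_hash]


if __name__ == "__main__":
    # Print the command lines instead of running them; pass a list to
    # subprocess.run(..., check=True) to execute it inside a dataset.
    print(shlex.join(schedule_args("results/", "sbatch submit_script")))
    print(shlex.join(finish_args(since="4c021f4")))
    print(shlex.join(reschedule_args("4c021f4")))
```

Building argument lists rather than shell strings avoids quoting problems when output paths contain spaces.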

In DataLad terms, the combination ``datalad schedule`` + ``datalad finish`` is similar to ``datalad run``, and ``datalad reschedule`` + ``datalad finish`` is similar to ``datalad rerun``. One important difference is that the ``datalad-slurm`` commands always produce a pair of commits in the git history, whereas ``datalad run`` produces just one commit, and ``datalad rerun`` produces one commit or none, depending on whether the outputs changed.

The git history might look a bit like this after running a few of these commands::

4992a23 [DATALAD FINISH] Processed batch job 9166291: Complete
732264c [DATALAD RESCHEDULE] Submitted batch job 9166291: Pending
0c982f3 [DATALAD FINISH] Processed batch job 9163380: Complete
4c021f4 [DATALAD SCHEDULE] Submitted batch job 9163380: Pending
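
Because every job leaves a pair of commits, the history can be scanned to see which submissions have been finished. The sketch below is illustrative only: it assumes one-line log entries formatted exactly like the sample above (for instance from ``git log --oneline``), and the names ``LOG_LINE``, ``pair_commits`` and ``unfinished_jobs`` are hypothetical, not part of ``datalad-slurm``.

```python
import re
from typing import Dict, List

# Pattern inferred from the sample history above; the real commit message
# format may differ, so treat it as an assumption.
LOG_LINE = re.compile(
    r"^(?P<commit>[0-9a-f]+) \[DATALAD (?P<action>SCHEDULE|RESCHEDULE|FINISH)\] "
    r"(?:Submitted|Processed) batch job (?P<job_id>\d+): (?P<status>\w+)$"
)

SAMPLE_LOG = """\
4992a23 [DATALAD FINISH] Processed batch job 9166291: Complete
732264c [DATALAD RESCHEDULE] Submitted batch job 9166291: Pending
0c982f3 [DATALAD FINISH] Processed batch job 9163380: Complete
4c021f4 [DATALAD SCHEDULE] Submitted batch job 9163380: Pending
"""


def pair_commits(log_text: str) -> Dict[str, Dict[str, str]]:
    """Map each Slurm job ID to its submit commit and, if present, finish commit."""
    jobs: Dict[str, Dict[str, str]] = {}
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip commits that do not come from datalad-slurm
        key = "finish" if match["action"] == "FINISH" else "submit"
        jobs.setdefault(match["job_id"], {})[key] = match["commit"]
    return jobs


def unfinished_jobs(log_text: str) -> List[str]:
    """Job IDs that were submitted but have no FINISH commit yet."""
    return [job for job, commits in pair_commits(log_text).items()
            if "finish" not in commits]


if __name__ == "__main__":
    for job_id, commits in pair_commits(SAMPLE_LOG).items():
        print(job_id, commits)
    print("unfinished:", unfinished_jobs(SAMPLE_LOG))
```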

API
===
4 changes: 3 additions & 1 deletion docs/source/python_reference.rst
@@ -5,4 +5,6 @@ High-level API commands
.. autosummary::
:toctree: generated

hello_cmd
schedule
finish
reschedule
