Preliminary api client and parsl apps to export data and start training task #25

Draft: wants to merge 32 commits into base: develop
16aca6e
Preliminary escriptorium api client
rlskoeser Nov 5, 2024
486e777
Custom result class for API Task information
rlskoeser Nov 5, 2024
c76540e
Add api method to update model file in escriptorium
rlskoeser Nov 5, 2024
6344e63
Add method to monitor export and download resulting file
rlskoeser Nov 5, 2024
684a5cc
Preliminary parsl apps for download and train, preliminary train script
rlskoeser Nov 5, 2024
e3bf0ad
Use current user api endpoint and simplify required parameters
rlskoeser Nov 6, 2024
5190854
Improve error handling / reporting
rlskoeser Nov 6, 2024
d669d68
suggested additional arguments for train.py
cmroughan Nov 11, 2024
2329224
Update src/htr2hpc/train_apps.py
rlskoeser Nov 12, 2024
c9190de
Update src/htr2hpc/api_client.py
rlskoeser Nov 12, 2024
424764e
Update src/htr2hpc/api_client.py
rlskoeser Nov 12, 2024
c35d805
Minor cleanup on api and preliminary training code
rlskoeser Nov 11, 2024
4dd7550
API methods for models, document parts, and util method to download file
rlskoeser Nov 12, 2024
35671d5
Create training data directly from document parts api response
rlskoeser Nov 12, 2024
78fa21d
Make api methods more descriptive & distinct
rlskoeser Nov 12, 2024
334cab2
Improve download filename logic
rlskoeser Nov 14, 2024
516e0a7
Add & use working directory param; document other needed parameters
rlskoeser Nov 14, 2024
01fe426
Clean up with ruff
rlskoeser Nov 14, 2024
d3a7cb9
Restructure parsl app/train code; configure package script for training
rlskoeser Nov 14, 2024
d8abafe
Adjust slurm/srun partition and other configurations
rlskoeser Nov 14, 2024
6d50a8d
Revise slurm provider configuration
rlskoeser Nov 14, 2024
9c66fcf
Update segtrain command with paths for output model and logs
rlskoeser Nov 14, 2024
57f1b85
Configure parsl app executor at the right level
rlskoeser Nov 14, 2024
cce97b0
Adjust paths; use absolute paths for training task
rlskoeser Nov 14, 2024
e248479
Calculate absolute paths before changing working directory
rlskoeser Nov 14, 2024
f03aeee
Add logic for default kraken models if model id is not specified
rlskoeser Nov 14, 2024
8f325a6
Add api method and notes for getting transcription text content
rlskoeser Nov 14, 2024
fd1cc34
Adjust arguments for the two modes of training
rlskoeser Nov 15, 2024
005ff36
Remove log dir option since nothing is getting logged
rlskoeser Nov 15, 2024
32b0ab1
Preliminary instructions for installing and running htr2hpc-train
rlskoeser Nov 15, 2024
af3dc25
Increase walltime in hpc parsl config
rlskoeser Nov 15, 2024
a47a143
Update to use new subparser mode instead of old job arg
rlskoeser Nov 15, 2024
11 changes: 9 additions & 2 deletions pyproject.toml
@@ -25,15 +25,22 @@ classifiers = [
 "Programming Language :: Python :: Implementation :: PyPy",
 ]
 dependencies = [
-"pucas>=0.9",
-"psutil"
+"pucas>=0.9", # TODO: move to an optional group for webapp integration only?
+"psutil",
+"requests",
+"kraken",
+"humanize",
+"parsl",
 ]

 [project.urls]
 Documentation = "https://github.com/Princeton-CDH/htr2hpc#readme"
 Issues = "https://github.com/Princeton-CDH/htr2hpc/issues"
 Source = "https://github.com/Princeton-CDH/htr2hpc"

+[project.scripts]
+htr2hpc-train = "htr2hpc.train.run:main"
+
 [tool.hatch.version]
 path = "src/htr2hpc/__init__.py"
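
The `[project.scripts]` entry in this hunk maps the command name `htr2hpc-train` to the reference `htr2hpc.train.run:main`: the part before the colon is an importable module and the part after it is the callable to run. As a rough illustration of how such a reference resolves, here is a minimal sketch using only the standard library. The helper name `load_entry_point_target` is hypothetical (it is not part of htr2hpc or of any packaging tool), and the demonstration uses `json:dumps` as a stand-in so that it runs without htr2hpc installed.

```python
import importlib


def load_entry_point_target(reference: str):
    """Resolve a 'package.module:attribute' reference to the object it names.

    Hypothetical helper for illustration only; installers generate a small
    wrapper script that performs an equivalent import and call.
    """
    module_path, sep, attr_path = reference.partition(":")
    if not sep or not module_path or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {reference!r}")
    target = importlib.import_module(module_path)
    # The attribute part may itself be dotted (e.g. "Class.method"), so walk it.
    for name in attr_path.split("."):
        target = getattr(target, name)
    return target


if __name__ == "__main__":
    # Stand-in reference from the standard library, used here instead of
    # "htr2hpc.train.run:main" so the sketch runs anywhere.
    dumps = load_entry_point_target("json:dumps")
    print(dumps({"ok": True}))
```

With the package installed, passing `"htr2hpc.train.run:main"` to the same helper should return the function that the `htr2hpc-train` command invokes.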
