Chatbot parser generic output 2 #677

Draft
wants to merge 153 commits into
base: main
153 commits
1ebc363
initial commit
EwDa291 Aug 8, 2024
34df842
Merge branch 'hpcugent:main' into chatbot_parser
EwDa291 Aug 8, 2024
10edb20
some cleanup
EwDa291 Aug 8, 2024
85a93ec
used jinja to replace macros
EwDa291 Aug 9, 2024
dfff5fa
adapt if-mangler to accommodate for nested if-clauses
EwDa291 Aug 9, 2024
649ddec
adapt the parser to take all files as input, not all files get parsed…
EwDa291 Aug 9, 2024
2116d6e
adapt the parser to take all files as input, not all files get parsed…
EwDa291 Aug 9, 2024
159aa62
small update, not important
EwDa291 Aug 9, 2024
75765e5
change to the templates
EwDa291 Aug 9, 2024
57d9cfe
change to accommodate for more nested if-clauses
EwDa291 Aug 9, 2024
75d345b
Delete scripts/HPC chatbot preprocessor/start_checker.py
EwDa291 Aug 9, 2024
ff7a9fc
make sure files with duplicate names between normal files and linux-t…
EwDa291 Aug 12, 2024
47a33b7
Merge branch 'chatbot_parser' of https://github.com/EwDa291/vsc_user_…
EwDa291 Aug 12, 2024
7d279d6
fixed the problem of some files being written in reST instead of mark…
EwDa291 Aug 12, 2024
8047572
some small fixes
EwDa291 Aug 12, 2024
7d1c5ed
remove try-except-structure
EwDa291 Aug 13, 2024
984b0cd
collapse all code into one file
EwDa291 Aug 13, 2024
8f5eeaa
Rename file
EwDa291 Aug 13, 2024
2b97b7a
cleanup repository
EwDa291 Aug 13, 2024
b595301
Rename directory
EwDa291 Aug 13, 2024
90c8ab7
add a main function
EwDa291 Aug 13, 2024
b8ae706
make file paths non os-specific
EwDa291 Aug 13, 2024
b751497
use docstrings to document the functions
EwDa291 Aug 13, 2024
0f8eb5d
rewrite the if-mangler to make it more readable
EwDa291 Aug 13, 2024
9938e92
got rid of most global variables
EwDa291 Aug 13, 2024
508b22c
fixed some issues with if statements
EwDa291 Aug 13, 2024
a25ce2d
fixed some issues with if statements
EwDa291 Aug 13, 2024
80d0535
got rid of all global variables
EwDa291 Aug 13, 2024
9163a75
small changes to make file more readable
EwDa291 Aug 14, 2024
1dcffc1
codeblocks, tips, warnings and info reformatted
EwDa291 Aug 14, 2024
4d7fbdb
small optimisations
EwDa291 Aug 14, 2024
671f7f3
small optimisations
EwDa291 Aug 14, 2024
e5c39bd
initial commit
EwDa291 Aug 14, 2024
c6492fc
added requirements
EwDa291 Aug 14, 2024
aff8198
added requirements and usage info
EwDa291 Aug 14, 2024
a981002
minor changes to the print statements
EwDa291 Aug 14, 2024
1f3b343
reworked function to take care of html structures
EwDa291 Aug 16, 2024
b6388d3
Merge branch 'hpcugent:main' into chatbot_parser
EwDa291 Aug 16, 2024
48cad97
filter out images
EwDa291 Aug 16, 2024
df58f23
get rid of backquotes, asterisks, pluses and underscores used for for…
EwDa291 Aug 16, 2024
c423e07
dump to json files instead of txt files
EwDa291 Aug 16, 2024
2c333fe
cleaned up parser with macros
EwDa291 Aug 16, 2024
ce52352
cleaned up parser with macros
EwDa291 Aug 16, 2024
5db34af
cleaned up parser with macros
EwDa291 Aug 16, 2024
4226d28
Update README.md
EwDa291 Aug 19, 2024
d730a26
Update README.md
EwDa291 Aug 19, 2024
f3182e3
added section about restrictions on input files
EwDa291 Aug 19, 2024
aee54de
Merge branch 'hpcugent:main' into chatbot_parser
EwDa291 Aug 19, 2024
675bec5
adapted section about restrictions on input files
EwDa291 Aug 19, 2024
f1e58ef
adapted section about restrictions on input files
EwDa291 Aug 19, 2024
2bf1075
Merge branch 'chatbot_parser' of https://github.com/EwDa291/vsc_user_…
EwDa291 Aug 19, 2024
a168509
change variables to be lowercase
EwDa291 Aug 19, 2024
09b86c9
take out some copy pasting
EwDa291 Aug 19, 2024
f95b99e
added warning about long filepaths
EwDa291 Aug 19, 2024
06bb7b9
fixing typos
EwDa291 Aug 19, 2024
2f3e5b3
take out copy pasting
EwDa291 Aug 19, 2024
0c4dbe8
first draft version of the restructured script to accommodate for the…
EwDa291 Aug 20, 2024
38c4572
added support to filter out collapsable admonitions
EwDa291 Aug 20, 2024
5cbd653
attempt at fix for problems with jinja include, not working yet
EwDa291 Aug 20, 2024
0e6f8b2
fixed an issue with jinja templates
EwDa291 Aug 21, 2024
cd77837
added docstrings to new functions
EwDa291 Aug 21, 2024
98eb695
only add necessary if-statements in front of non-if-complete sections
EwDa291 Aug 21, 2024
27457e3
fixed some more jinja problems
EwDa291 Aug 21, 2024
bb72287
implemented extra test to make sure generic files dont accidentally g…
EwDa291 Aug 21, 2024
67cb19e
make sure empty os-specific files are not saved
EwDa291 Aug 21, 2024
cf9834a
clean up unused code
EwDa291 Aug 21, 2024
da32459
introduce more macros
EwDa291 Aug 21, 2024
093200b
reintroduce logic to remove unnecessary directories
EwDa291 Aug 21, 2024
5d0ffe9
added functionality to include links or leave them out
EwDa291 Aug 21, 2024
a3e34a9
added functionality to include links or leave them out
EwDa291 Aug 21, 2024
7c6154b
adapt filenames to allow for splitting on something other than subtitles
EwDa291 Aug 21, 2024
8d5b50d
making some changes to prepare to add paragraph level splitting tomorrow
EwDa291 Aug 21, 2024
0c10376
making some changes to prepare to add paragraph level splitting tomorrow
EwDa291 Aug 21, 2024
f8ee860
making some changes to prepare to add paragraph level splitting tomorrow
EwDa291 Aug 21, 2024
6533733
adapted the parsing script to allow for testing in a semi-efficient way
EwDa291 Aug 21, 2024
2e7a00f
added test for make_valid_title
EwDa291 Aug 21, 2024
f5e0579
removed useless lines from testscript
EwDa291 Aug 21, 2024
6757b4f
First attempt at splitting in paragraphs (need for other fixes for ti…
EwDa291 Aug 22, 2024
6d9558d
make two functions for different ways of dividing the text
EwDa291 Aug 22, 2024
2c7025a
added docstrings to new functions
EwDa291 Aug 22, 2024
ae99bb9
update test for valid titles
EwDa291 Aug 22, 2024
084b421
fixed problem with splitting os-specific text (metadata not fixed yet)
EwDa291 Aug 22, 2024
cf7f5f0
fix for metadata of os-specific sections
EwDa291 Aug 22, 2024
b7c10d3
clean up temporary version
EwDa291 Aug 22, 2024
4a441f3
added command line options for custom macros
EwDa291 Aug 22, 2024
662134f
small fix to macros
EwDa291 Aug 22, 2024
05eab4a
clean up test for valid title
EwDa291 Aug 22, 2024
b85a8fb
add a test for write_metadata
EwDa291 Aug 22, 2024
39a3c99
added functionality to split on paragraphs
EwDa291 Aug 23, 2024
af9e6cc
clean up
EwDa291 Aug 23, 2024
f4163a7
clean up
EwDa291 Aug 23, 2024
833f964
further clean up and added shebang
EwDa291 Aug 23, 2024
79b1a56
clean up
EwDa291 Aug 23, 2024
cec154c
added test for if mangler
EwDa291 Aug 23, 2024
2f4a277
clean up
EwDa291 Aug 23, 2024
cd0c8eb
clean up customizable options
EwDa291 Aug 23, 2024
3be262a
further adapt the script to be able to test it
EwDa291 Aug 26, 2024
1d32aab
make changes to usage in command line to be more intuitive
EwDa291 Aug 26, 2024
5902c96
first revised version of the README
EwDa291 Aug 26, 2024
6f97d5f
Merge branch 'hpcugent:main' into chatbot_parser
EwDa291 Aug 26, 2024
6e48800
added docstring to main function
EwDa291 Aug 26, 2024
0bc440b
include chatbot_prepprocessor
EwDa291 Aug 26, 2024
e6e6023
added options for source and destination directories
EwDa291 Aug 26, 2024
a6d99d9
cleanup
EwDa291 Aug 26, 2024
2be834f
cleanup
EwDa291 Aug 26, 2024
532543a
cleanup
EwDa291 Aug 26, 2024
107464e
relocate test files
EwDa291 Aug 26, 2024
dd64381
update arguments of if mangler
EwDa291 Aug 26, 2024
ef3fd58
relocate full test files
EwDa291 Aug 26, 2024
4d7db8f
Revert "update arguments of if mangler"
EwDa291 Aug 26, 2024
df9bac5
Revert "relocate full test files"
EwDa291 Aug 26, 2024
631d9e9
update test to adapt to new arguments in if mangler
EwDa291 Aug 26, 2024
c6e600d
relocated full test files
EwDa291 Aug 26, 2024
d1c6194
Rename test_paragraph_split_1.md to test_paragraph_split_1_input.md
EwDa291 Aug 26, 2024
695ffd6
Rename test_title_split_1.md to test_title_split_1_input.md
EwDa291 Aug 26, 2024
af4832b
smal fix
EwDa291 Aug 26, 2024
8805c8c
test text for paragraph split
EwDa291 Aug 26, 2024
a265ffd
start of a fix for double title problem, not done yet
EwDa291 Aug 26, 2024
6c2a61c
Fix for double title bug when splitting on paragraph
EwDa291 Aug 27, 2024
ed08879
Fix bug for empty linklist in metadata
EwDa291 Aug 27, 2024
176af13
fix bug where too many directories were sometimes created
EwDa291 Aug 27, 2024
d4ceac8
test of full script, test files not ready to be pushed yet
EwDa291 Aug 27, 2024
815a863
updated requirements.txt
EwDa291 Aug 27, 2024
d15469f
updated docstring in main function
EwDa291 Aug 27, 2024
daa6b36
add support for comments for the bot to be included in the source files
EwDa291 Aug 27, 2024
4c19f44
changed the default for min paragraph length
EwDa291 Aug 27, 2024
9a6ff58
added test files for full script test
EwDa291 Aug 27, 2024
56543f0
small fix for double title bug
EwDa291 Aug 27, 2024
52a3861
added examples of output of the script when splitting on paragraphs w…
EwDa291 Aug 27, 2024
692e77b
fix for issue with html links
EwDa291 Aug 27, 2024
7f493a1
fix for issue with html links
EwDa291 Aug 27, 2024
0e34396
fix for issue with relative links to the same document
EwDa291 Aug 27, 2024
fa00044
added test for replace_markdown_markers
EwDa291 Aug 27, 2024
b3952b2
fix to small inconsistency in metadata
EwDa291 Aug 27, 2024
73072bf
added test for insert_links
EwDa291 Aug 27, 2024
3161309
make sure paragraphs only include full lists
EwDa291 Aug 28, 2024
7d4d7f9
Merge branch 'hpcugent:main' into chatbot_parser
EwDa291 Aug 28, 2024
3407be3
adapted to the new source files
EwDa291 Aug 28, 2024
6d04bbc
add source-directory to metadata and verbose mode
EwDa291 Aug 28, 2024
f33cfb3
added verbose mode
EwDa291 Aug 28, 2024
1c389d7
Merge branch 'hpcugent:main' into chatbot_parser
EwDa291 Aug 28, 2024
3227f19
Added limitation on lists
EwDa291 Aug 29, 2024
67aed53
fix for non os-specific if-statement not being recognised
EwDa291 Aug 29, 2024
9e297b1
new test for links
EwDa291 Aug 29, 2024
b6b8610
new test to make sure lists are kept as one section
EwDa291 Aug 29, 2024
57a2139
updated test_file for list test
EwDa291 Aug 29, 2024
170a10c
dropped <> around links and started new function to calculate length …
EwDa291 Aug 30, 2024
04efff6
removed parsed mds
EwDa291 Aug 30, 2024
1ef1f10
Changed paragraphs to decide length based on tokens instead of charac…
EwDa291 Aug 30, 2024
621c0a3
Changed paragraphs to decide length based on tokens instead of charac…
EwDa291 Aug 30, 2024
adf364d
Changed paragraphs to decide length based on tokens instead of charac…
EwDa291 Aug 30, 2024
89f1ab0
Added output of chatbot_parser script part 2
EwDa291 Aug 30, 2024
1545c4b
removing unnecessary files
EwDa291 Aug 30, 2024
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
MATLAB
note
To run a MATLAB program on the HPC-UGent infrastructure you must compile it first,
because the MATLAB license server is not accessible from cluster workernodes
(except for the interactive debug cluster).
Compiling MATLAB programs is only possible on the interactive debug cluster,
not on the HPC-UGent login nodes where resource limits w.r.t. memory and max. number of processes are too strict.
Why is the MATLAB compiler required?
The main reason behind this alternative way of using MATLAB is
licensing: only a limited number of MATLAB sessions can be active at the
same time. However, once the MATLAB program is compiled using the MATLAB
compiler, the resulting stand-alone executable can be run without
needing to contact the license server.
Note that a license is required for the MATLAB Compiler, see
https://nl.mathworks.com/help/compiler/index.html. If the mcc
command is provided by the MATLAB installation you are using, the MATLAB
compiler can be used as explained below.
Only a limited number of MATLAB sessions can be active at the same time
because there are only a limited number of MATLAB research licenses
available on the UGent MATLAB license server. If each job would need a
license, licenses would quickly run out.
How to compile MATLAB code
Compiling MATLAB code can only be done from the login nodes, because
only login nodes can access the MATLAB license server, workernodes on
clusters cannot.
To access the MATLAB compiler, the MATLAB module should be loaded
first. Make sure you are using the same MATLAB version to compile and
to run the compiled MATLAB program.
$ module avail MATLAB/
----------------------/apps/gent/RHEL8/zen2-ib/modules/all----------------------
MATLAB/2021b MATLAB/2022b-r5 (D)
$ module load MATLAB/2021b
After loading the MATLAB module, the mcc command can be used. To get
help on mcc, you can run mcc -?.
To compile a standalone application, the -m flag is used (the -v
flag means verbose output). To show how mcc can be used, we use the
magicsquare example that comes with MATLAB.
First, we copy the magicsquare.m example that comes with MATLAB to
example.m:
cp $EBROOTMATLAB/extern/examples/compiler/magicsquare.m example.m
To compile a MATLAB program, use mcc -mv:
@@ -0,0 +1,15 @@
{
"main_title": "MATLAB",
"subtitle": "How-to-compile-MATLAB-code",
"source_file": "../../mkdocs/docs/HPC/MATLAB.md",
"title_depth": 2,
"directory": "MATLAB",
"links": {
"0": "https://docs.hpc.ugent.be/interactive_debug"
},
"parent_title": "",
"previous_title": null,
"next_title": "MATLAB_paragraph_2",
"OS": "generic",
"reference_link": "https://docs.hpc.ugent.be/MATLAB/#how-to-compile-matlab-code"
}
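The metadata records in this diff suggest that `reference_link` follows from `main_title` and `subtitle`. The sketch below reproduces that pattern as inferred from the records shown here only; the helper name `build_reference_link` and the `BASE_URL` constant are hypothetical and are not taken from the PR's script.

```python
# Assumption: the site root is the one that appears in the metadata links.
BASE_URL = "https://docs.hpc.ugent.be"


def build_reference_link(main_title: str, subtitle: str) -> str:
    """Rebuild a reference link following the pattern seen in the metadata."""
    # In the index metadata shown in this PR, the link is just the site root.
    if main_title == "index":
        return BASE_URL
    # Subtitles are hyphen-separated; the anchor is their lowercase form.
    return f"{BASE_URL}/{main_title}/#{subtitle.lower()}"


print(build_reference_link("MATLAB", "How-to-compile-MATLAB-code"))
```

For the record above this yields the same URL as its `reference_link` field. Titles containing characters other than letters and hyphens may well need extra handling that this sketch does not attempt.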
@@ -0,0 +1,47 @@
mcc -mv example.m
Opening log file: /user/home/gent/vsc400/vsc40000/java.log.34090
Compiler version: 8.3 (R2021b)
Dependency analysis by REQUIREMENTS.
Parsing file "/user/home/gent/vsc400/vsc40000/example.m"
(Referenced from: "Compiler Command Line").
Deleting 0 temporary MEX authorization files.
Generating file "/user/home/gent/vsc400/vsc40000/readme.txt".
Generating file "run_example.sh".
Libraries
To compile a MATLAB program that needs a library, you can use the
-I library_path flag. This will tell the compiler to also look for
files in library_path.
It's also possible to use the -a path flag. That will result in all
files under the path getting added to the final executable.
For example, the command mcc -mv example.m -I examplelib -a datafiles
will compile example.m with the MATLAB files in examplelib, and will
include all files in the datafiles directory in the binary it
produces.
Memory issues during compilation
If you are seeing Java memory issues during the compilation of your
MATLAB program on the login nodes, consider tweaking the default maximum
heap size (128M) of Java using the _JAVA_OPTIONS environment variable
with:
export _JAVA_OPTIONS="-Xmx64M"
The MATLAB compiler spawns multiple Java processes. Because of the
default memory limits that are in effect on the login nodes, this might
lead to a crash of the compiler if it's trying to create too many Java
processes. If we lower the heap size, more Java processes will be able
to fit in memory.
Another possible issue is that the heap size is too small. This could
result in errors like:
Error: Out of memory
A possible solution is to set the maximum heap size to a larger
value:
export _JAVA_OPTIONS="-Xmx512M"
Multithreading
MATLAB can only use the cores in a single workernode (unless the
Distributed Computing toolbox is used, see
https://nl.mathworks.com/products/distriben.html).
The number of workers used by MATLAB for the parallel toolbox can be
controlled via the parpool function: parpool(16) will use 16
workers. It's best to specify the number of workers, because otherwise
you might not harness the full compute power available (if you have too
few workers), or you might negatively impact performance (if you have
too many workers). By default, MATLAB uses a fixed number of workers
(12).
@@ -0,0 +1,12 @@
{
"main_title": "MATLAB",
"subtitle": "Multithreading",
"source_file": "../../mkdocs/docs/HPC/MATLAB.md",
"title_depth": 2,
"directory": "MATLAB",
"parent_title": "",
"previous_title": "MATLAB_paragraph_1",
"next_title": "MATLAB_paragraph_3",
"OS": "generic",
"reference_link": "https://docs.hpc.ugent.be/MATLAB/#multithreading"
}
@@ -0,0 +1,55 @@
You should use a number of workers that is equal to the number of cores
you requested when submitting your job script (the ppn value, see Generic resource requirements).
You can determine the right number of workers to use via the following
code snippet in your MATLAB program:
% specify the right number of workers (as many as there are cores available in the job) when creating the parpool
c = parcluster('local')
pool = parpool(c.NumWorkers)
See also [the parpool
documentation](https://nl.mathworks.com/help/distcomp/parpool.html).
Java output logs
Each time MATLAB is executed, it generates a Java log file in the user's
home directory. The output log directory can be changed using:
MATLAB_LOG_DIR=<OUTPUT_DIR>
where <OUTPUT_DIR> is the name of the desired output directory. To
create and use a temporary directory for these logs:
# create unique temporary directory in $TMPDIR (or /tmp/$USER if $TMPDIR is not defined)
# instruct MATLAB to use this directory for log files by setting $MATLAB_LOG_DIR
$ export MATLAB_LOG_DIR=$(mktemp -d -p ${TMPDIR:-/tmp/$USER})
You should remove the directory at the end of your job script:
rm -rf $MATLAB_LOG_DIR
Cache location
When running, MATLAB will use a cache for performance reasons. The
location and size of this cache can be changed through the
MCR_CACHE_ROOT and MCR_CACHE_SIZE environment variables.
The snippet below would set the maximum cache size to 1024MB and the
location to /tmp/testdirectory.
export MCR_CACHE_ROOT=/tmp/testdirectory
export MCR_CACHE_SIZE=1024M
So when MATLAB is running, it can fill up to 1024MB of cache in
/tmp/testdirectory.
MATLAB job script
All of the tweaks needed to get MATLAB working have been implemented in
an example job script. This job script is also available on the HPC.

#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:0:0
#
# Example (single-core) MATLAB job script
#
# make sure the MATLAB version matches with the one used to compile the MATLAB program!
module load MATLAB/2021b
# use temporary directory (not $HOME) for (mostly useless) MATLAB log files
# subdir in $TMPDIR (if defined, or /tmp otherwise)
export MATLAB_LOG_DIR=$(mktemp -d -p ${TMPDIR:-/tmp})
# configure MATLAB Compiler Runtime cache location & size (1GB)
# use a temporary directory in /dev/shm (i.e. in memory) for performance reasons
export MCR_CACHE_ROOT=$(mktemp -d -p /dev/shm)
export MCR_CACHE_SIZE=1024MB
# change to directory where job script was submitted from
cd $PBS_O_WORKDIR
# run compiled example MATLAB program 'example', provide '5' as input argument to the program
# $EBROOTMATLAB points to MATLAB installation directory
./run_example.sh $EBROOTMATLAB 5
@@ -0,0 +1,15 @@
{
"main_title": "MATLAB",
"subtitle": "MATLAB-job-script",
"source_file": "../../mkdocs/docs/HPC/MATLAB.md",
"title_depth": 2,
"directory": "MATLAB",
"links": {
"0": "https://docs.hpc.ugent.be/running_batch_jobs/#generic-resource-requirements"
},
"parent_title": "",
"previous_title": "MATLAB_paragraph_2",
"next_title": null,
"OS": "generic",
"reference_link": "https://docs.hpc.ugent.be/MATLAB/#matlab-job-script"
}
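The `previous_title` and `next_title` fields of the three MATLAB records form a linked chain of paragraphs. A minimal sketch of how a consumer might recover the reading order is given below. The record keys (`MATLAB_paragraph_1` and so on) are an assumption inferred from the neighbouring records' fields, and `reading_order` is a hypothetical helper, not part of the PR.

```python
# Hypothetical in-memory view of the three MATLAB metadata records,
# keyed by paragraph name and reduced to the fields needed here.
records = {
    "MATLAB_paragraph_1": {"previous_title": None, "next_title": "MATLAB_paragraph_2"},
    "MATLAB_paragraph_2": {"previous_title": "MATLAB_paragraph_1", "next_title": "MATLAB_paragraph_3"},
    "MATLAB_paragraph_3": {"previous_title": "MATLAB_paragraph_2", "next_title": None},
}


def reading_order(records: dict) -> list:
    """Follow next_title links from the record that has no predecessor."""
    starts = [name for name, meta in records.items() if meta["previous_title"] is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one record without a previous_title")
    order, seen = [], set()
    current = starts[0]
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current}")
        seen.add(current)
        order.append(current)
        current = records[current]["next_title"]
    return order


print(reading_order(records))
```

The cycle check is a defensive choice: a malformed chain then fails loudly instead of looping forever.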
@@ -0,0 +1,17 @@
---
hide: "toc"
---
Welcome to the HPC-UGent documentation
---
Use the menu on the left to navigate, or use the search box on the top right.
You are viewing documentation intended for people using *.
Use the OS dropdown in the top bar to switch to a different operating system.
Quick links
- Getting Started | Getting Access
- Recording of HPC-UGent intro
- Linux Tutorial
- Hardware overview
- FAQ | Troubleshooting | Best practices | Known issues

If you find any problems in this documentation, please report them by mail to <[email protected]> or open a pull request.
If you still have any questions, you can contact the HPC-UGent team.
@@ -0,0 +1,25 @@
{
"main_title": "index",
"subtitle": "Welcome-to-the-HPC-UGent-documentation",
"source_file": "../../mkdocs/docs/HPC/index.md",
"title_depth": 1,
"directory": "index",
"links": {
"0": "https://docs.hpc.ugent.be/getting_started",
"1": "https://docs.hpc.ugent.be/account",
"2": "https://www.ugent.be/hpc/en/training/introhpcugent-recording",
"3": "https://docs.hpc.ugent.be/linux-tutorial",
"4": "https://www.ugent.be/hpc/en/infrastructure",
"5": "https://docs.hpc.ugent.be/FAQ",
"6": "https://docs.hpc.ugent.be/troubleshooting",
"7": "https://docs.hpc.ugent.be/best_practices",
"8": "https://docs.hpc.ugent.be/known_issues",
"9": "https://github.com/hpcugent/vsc_user_docs",
"10": "https://www.ugent.be/hpc/en/support"
},
"parent_title": "",
"previous_title": null,
"next_title": null,
"OS": "generic",
"reference_link": "https://docs.hpc.ugent.be"
}
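The `links` object uses string keys ("0" to "10"), so a plain string sort would put "10" before "2". The sketch below, assuming a consumer wants the links in their original order, sorts the keys numerically. The `ordered_links` helper is hypothetical, and the sample is a trimmed copy of the record above.

```python
import json

# Trimmed copy of the index metadata; only three of the links are kept.
metadata_text = """
{
  "links": {
    "0": "https://docs.hpc.ugent.be/getting_started",
    "2": "https://www.ugent.be/hpc/en/training/introhpcugent-recording",
    "10": "https://www.ugent.be/hpc/en/support"
  }
}
"""


def ordered_links(metadata: dict) -> list:
    """Return the link URLs sorted by their numeric key."""
    # Some records in this PR have no "links" field at all, hence the default.
    links = metadata.get("links", {})
    return [links[key] for key in sorted(links, key=int)]


metadata = json.loads(metadata_text)
for url in ordered_links(metadata):
    print(url)
```

Using `metadata.get("links", {})` keeps the helper usable for records such as the Multithreading one, which carries no `links` entry.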
@@ -0,0 +1,44 @@
Interactive and debug cluster
Purpose
The purpose of this cluster is to give the user an environment where
there should be no waiting in the queue to get access to a limited
number of resources. This environment allows a user to immediately
start working, and is the ideal place for interactive work such as
development, debugging and light production workloads (typically
sufficient for training and/or courses).
This environment should be seen as an extension or even replacement of the login nodes,
instead of a dedicated compute resource. The interactive cluster is
overcommitted, which means that more CPU cores can be requested for
jobs than physically exist in the cluster. Obviously, the performance of
this cluster heavily depends on the workloads and the actual overcommit
usage. Be aware that jobs can slow down or speed up during their
execution.
Due to the restrictions and sharing of the CPU resources (see
section Restrictions and overcommit factor) jobs on this cluster
should normally start more or less immediately. The tradeoff is that
performance must not be an issue for the submitted jobs. This means that
typical workloads for this cluster should be limited to:
- Interactive jobs (see
chapter Running interactive jobs)
- Cluster desktop sessions (see
chapter Using the HPC-UGent web portal)
- Jobs requiring few resources
- Debugging programs
- Testing and debugging job scripts
Submitting jobs
To submit jobs to the HPC-UGent interactive and debug cluster nicknamed
donphan, first use:
module swap cluster/donphan
Then use the familiar qsub, qstat, etc. commands (see
chapter Running batch jobs).
Restrictions and overcommit factor
Some limits are in place for this cluster:
- each user may have at most 5 jobs in the queue (both running and
waiting to run);
- at most 3 jobs per user can be running at the same time;
- running jobs may allocate no more than 8 CPU cores and no more than
27200 MiB of memory in total, per user.
In addition, the cluster has an overcommit factor of 6. This means that
6 times more cores can be allocated than physically exist.
Simultaneously, the default memory per core is 6 times less than what
would be available on a non-overcommitted cluster.
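The overcommit arithmetic and the per-user limits described above can be sketched as follows. The factor of 6 and the limits (8 cores, 27200 MiB) come from the text. The node size and memory figures in the usage lines are purely hypothetical, and the function names are illustrative only.

```python
OVERCOMMIT_FACTOR = 6          # from the text above
MAX_CORES_PER_USER = 8         # from the text above
MAX_MEMORY_MIB_PER_USER = 27200  # from the text above


def allocatable_cores(physical_cores: int) -> int:
    """Cores the scheduler can hand out on an overcommitted cluster."""
    return physical_cores * OVERCOMMIT_FACTOR


def default_memory_per_core(non_overcommitted_mib: float) -> float:
    """Default memory per core, given the value without overcommit."""
    return non_overcommitted_mib / OVERCOMMIT_FACTOR


def within_user_limits(total_cores: int, total_memory_mib: int) -> bool:
    """Check a user's running jobs against the per-user totals."""
    return total_cores <= MAX_CORES_PER_USER and total_memory_mib <= MAX_MEMORY_MIB_PER_USER


# Hypothetical example: 32 physical cores, 6000 MiB per core without overcommit.
print(allocatable_cores(32))
print(default_memory_per_core(6000))
print(within_user_limits(total_cores=8, total_memory_mib=16000))
```

The job-count limits (5 queued, 3 running) are left out because they depend on scheduler state rather than simple arithmetic.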
@@ -0,0 +1,18 @@
{
"main_title": "interactive_debug",
"subtitle": "Restrictions-and-overcommit-factor",
"source_file": "../../mkdocs/docs/HPC/interactive_debug.md",
"title_depth": 3,
"directory": "interactive_debug",
"links": {
"0": "https://docs.hpc.ugent.be/interactive_debug/#restrictions-and-overcommit-factor",
"1": "https://docs.hpc.ugent.be/running_interactive_jobs/#running-interactive-jobs",
"2": "https://docs.hpc.ugent.be/web_portal/#using-the-hpc-ugent-web-portal",
"3": "https://docs.hpc.ugent.be/running_batch_jobs/#running-batch-jobs"
},
"parent_title": "",
"previous_title": null,
"next_title": "interactive_debug_paragraph_2",
"OS": "generic",
"reference_link": "https://docs.hpc.ugent.be/interactive_debug/#restrictions-and-overcommit-factor"
}
@@ -0,0 +1,11 @@
Please note that based on the (historical) workload of the interactive
and debug cluster, the above restrictions and the overcommitment ratio
might change without prior notice.
Shared GPUs
Each node in the donphan cluster has a relatively small GPU that is shared between all jobs.
This means that you don't need to reserve it and thus possibly wait for it.
But this also has a downside for performance and security: jobs might be competing for the same GPU resources (cores, memory or encoders) without
any preset fairshare and there is no guarantee one job cannot access another job's memory
(as opposed to having reserved GPUs in the GPU clusters).
All software should behave the same as on the dedicated GPU clusters (e.g. using CUDA or OpenGL acceleration
from a cluster desktop via the webportal).
@@ -0,0 +1,12 @@
{
"main_title": "interactive_debug",
"subtitle": "Shared-GPUs",
"source_file": "../../mkdocs/docs/HPC/interactive_debug.md",
"title_depth": 3,
"directory": "interactive_debug",
"parent_title": "",
"previous_title": "interactive_debug_paragraph_1",
"next_title": null,
"OS": "generic",
"reference_link": "https://docs.hpc.ugent.be/interactive_debug/#shared-gpus"
}
@@ -0,0 +1,42 @@
Introduction to HPC
What is HPC?
"High Performance Computing" (HPC) is computing on a "Supercomputer", a computer at the
frontline of contemporary processing capacity -- particularly speed of
calculation and available memory.
While the supercomputers in the early days (around 1970) used only a few
processors, in the 1990s machines with thousands of processors began to
appear and, by the end of the 20th century, massively parallel
supercomputers with tens of thousands of "off-the-shelf" processors were
the norm. A large number of dedicated processors are placed in close
proximity to each other in a computer cluster.
A computer cluster consists of a set of loosely or tightly connected computers that work
together so that in many respects they can be viewed as a single system.
The components of a cluster are usually connected to each other through
fast local area networks ("LAN") with each node (computer used as a
server) running its own instance of an operating system. Computer
clusters emerged as a result of convergence of a number of computing
trends including the availability of low cost microprocessors,
high-speed networks, and software for high performance distributed
computing.
Compute clusters are usually deployed to improve performance and
availability over that of a single computer, while typically being more
cost-effective than single computers of comparable speed or
availability.
Supercomputers play an important role in the field of computational
science, and are used for a wide range of computationally intensive
tasks in various fields, including quantum mechanics, weather
forecasting, climate research, oil and gas exploration, molecular
modelling (computing the structures and properties of chemical
compounds, biological macromolecules, polymers, and crystals), and
physical simulations (such as simulations of the early moments of the
universe, airplane and spacecraft aerodynamics, the detonation of
nuclear weapons, and nuclear fusion). [^1]
What is the HPC-UGent infrastructure?
The HPC is a collection of computers with AMD and/or Intel CPUs, running a
Linux operating system, shaped like pizza boxes and stored above and
next to each other in racks, interconnected with copper and fiber
cables. Their number crunching power is (presently) measured in hundreds
of billions of floating point operations (gigaflops) and even in
teraflops.
The HPC-UGent infrastructure relies on parallel-processing technology to offer UGent researchers an
extremely fast solution for all their data processing needs.
@@ -0,0 +1,12 @@
{
"main_title": "introduction",
"subtitle": "What-is-the-HPC-UGent-infrastructure",
"source_file": "../../mkdocs/docs/HPC/introduction.md",
"title_depth": 2,
"directory": "introduction",
"parent_title": "",
"previous_title": null,
"next_title": "introduction_paragraph_2",
"OS": "generic",
"reference_link": "https://docs.hpc.ugent.be/introduction/#what-is-the-hpc-ugent-infrastructure"
}