- Exercise 5: Image classification
- Exercise 6: Text categorization
- Exercise 7: Text generation
- Exercise 8: Using multiple GPUs
- Log in to LUMI using either:
  - the web user interface at https://www.lumi.csc.fi/ ("Go to login") and start "Login node shell", or
  - SSH with your username and SSH key to lumi.csc.fi; for more instructions see: https://docs.lumi-supercomputer.eu/firststeps/
- In the login node shell, or SSH session, set up the module environment for using PyTorch:
    module purge
    module use /appl/local/csc/modulefiles/
    module load pytorch
- Go to the exercise directory:
  - if you ran the exercises of day 1 using LUMI's "Jupyter for courses", you should already have the repository cloned in your home directory:
      cd PDL-2024-11/intro-to-dl/day2
  If you don't have it, you can also clone it yourself:
    mkdir PDL-2024-11
    cd PDL-2024-11
    git clone https://github.com/csc-training/intro-to-dl
    cd intro-to-dl/day2
- Edit the Python script, either by:
  - navigating to the file in the LUMI web UI file browser (Files → Home Directory → PDL-2024-11 → intro-to-dl → day2) and selecting "Edit" on that file (under the three dots "⋮" menu), or
  - opening it with your favorite text editor in the terminal, for example:
      nano pytorch_test.py
- Submit the job:
    sbatch run.sh pytorch_test.py
- See the status of your jobs or the queue you are using:
    squeue --me
    squeue -p small-g
- After the job has finished, examine the results (xxxxxxxx is the job ID that sbatch printed when you submitted):
    less slurm-xxxxxxxx.out
- Go back to the editing step and repeat until you are happy with the results.
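The submit, wait, and inspect part of the cycle above can also be scripted. The following is only a minimal sketch and not part of the course materials: the function names are hypothetical, and it assumes that sbatch prints its usual "Submitted batch job <id>" message and that run.sh keeps Slurm's default slurm-<jobid>.out output file name.

```shell
#!/bin/sh
# Hypothetical helpers for the submit-and-inspect loop (not part of the course repository).

# Extract the job ID from sbatch's "Submitted batch job <id>" message
# by keeping only the text after the last space.
job_id_from_sbatch() {
    printf '%s\n' "${1##* }"
}

# Default Slurm output file name for a given job ID.
slurm_log_name() {
    printf 'slurm-%s.out\n' "$1"
}

# Submit a Python script through run.sh, wait until the job has left the
# queue, then print the name of its output file.
submit_and_wait() {
    msg=$(sbatch run.sh "$1") || return 1
    job_id=$(job_id_from_sbatch "$msg")
    # "squeue -h" suppresses the header line, so empty output means the job
    # is no longer pending or running.
    while [ -n "$(squeue -h -j "$job_id" 2>/dev/null)" ]; do
        sleep 60
    done
    slurm_log_name "$job_id"
}
```

With these functions defined in the login shell, `less "$(submit_and_wait pytorch_test.py)"` would submit the script and open its output once the job has finished. The one-minute polling interval is an arbitrary choice intended to keep the load on the scheduler low.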
You can use TensorBoard either via the LUMI web user interface (recommended), or via the terminal using SSH port forwarding. Both approaches are explained below.
- Log in via https://www.lumi.csc.fi/
- Select menu item: Apps → TensorBoard
- In the form:
  - Select course project: project_462000699
  - Specify the "TensorBoard log directory": the directory where you cloned the course repository plus "day2/logs", for example ~/PDL-2024-11/intro-to-dl/day2/logs. You can run pwd in the terminal to find out the full path where you are working.
  - Leave the rest at default settings
- Click "Launch"
- Wait until you see the "Connect to Tensorboard" button, then click that.
- When you're done using TensorBoard, please go to "My Interactive Sessions" in the LUMI web user interface and "Cancel" the session. (It will automatically terminate once the reserved time is up, but it's always better to release the resource as soon as possible so that others can use it.)
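Before filling in the "TensorBoard log directory" field, it can help to confirm the full path and check that the directory already contains logs. The sketch below is an illustration only: `check_logdir` is a hypothetical helper, and it assumes the event files follow TensorBoard's default "events.out.tfevents.*" naming.

```shell
#!/bin/sh
# Hypothetical helper (not part of the course materials).

# Print the absolute path of a log directory followed by the number of
# TensorBoard event files found below it.
check_logdir() {
    [ -d "$1" ] || { echo "no such directory: $1" >&2; return 1; }
    abs=$(cd "$1" && pwd)
    count=$(find "$abs" -type f -name 'events.out.tfevents.*' | wc -l)
    # $((count)) strips any padding that wc adds on some systems.
    printf '%s %s\n' "$abs" "$((count))"
}
```

For example, `check_logdir ~/PDL-2024-11/intro-to-dl/day2/logs` prints the full path to paste into the form; a count of 0 suggests that no job has written logs there yet.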
- Log in to LUMI again in a second terminal window, this time with SSH port forwarding:
    ssh -L PORT:localhost:PORT lumi.csc.fi
  Replace PORT with a freely selectable port number (>1023). By default, TensorBoard uses port 6006, but select a different port to avoid overlaps.
- Set up the module environment and start the TensorBoard server:
    module purge
    module use /appl/local/csc/modulefiles/
    module load tensorflow
    singularity_wrapper exec tensorboard --logdir=PDL-2024-11/intro-to-dl/day2/logs --port=PORT --bind_all
- To access TensorBoard, point your web browser to localhost:PORT.
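Because the same PORT value has to appear in the SSH command, in the TensorBoard command, and in the browser address, a small helper can reduce typos. This is a sketch under assumptions, not part of the course materials: `is_valid_port` and `tunnel_command` are hypothetical names, and 16006 is just an example port.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the course materials).

# Succeed only if $1 is an integer in the non-privileged port range 1024-65535.
is_valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1024 ] && [ "$1" -le 65535 ]
}

# Print the SSH port-forwarding command for the given port.
tunnel_command() {
    is_valid_port "$1" || { echo "invalid port: $1" >&2; return 1; }
    printf 'ssh -L %s:localhost:%s lumi.csc.fi\n' "$1" "$1"
}

# Example port; pick your own to avoid clashing with other users.
tunnel_command 16006
```

The helper only validates the number and prints the command; it does not check whether the port is already in use on the login node, so a "bind" error from ssh or TensorBoard still means another port should be chosen.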