Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: use built-in types #262

Merged
merged 4 commits into from
Nov 18, 2024
Merged

chore: use built-in types #262

merged 4 commits into from
Nov 18, 2024

Conversation

thomaschhh
Copy link
Member

What does this PR do?

This PR uses built-in types instead of using the ones from the typing module.
It also implements the adherence to PEP 604.

General Changes

  • ..

Breaking Changes

  • ..

Checklist before submitting final PR

  • My PR is minimal and addresses one issue in isolation
  • I have merged the latest version of the target branch into this feature branch
  • I have reviewed my own code w.r.t. correct implementation, missing type hints, proper documentation, etc.
  • I have run a sample config for model training
  • I have checked that all tests run through (python tests/tests.py)
  • I have updated the internal changelog (CHANGELOG_DEV.md)

Copy link
Member

@flxst flxst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :) I found 1-2 minor issues and left some comments.

src/modalities/config/component_factory.py Outdated Show resolved Hide resolved
src/modalities/dataloader/dataloader.py Outdated Show resolved Hide resolved
src/modalities/dataloader/dataloader.py Outdated Show resolved Hide resolved
@thomaschhh
Copy link
Member Author

When running the test suite there are a few things that stand out:

Traceback (most recent call last):
  File "/home/operation/modalities_old/tests/tests.py", line 136, in <module>
    main(**args)
  File "/home/operation/modalities_old/tests/tests.py", line 113, in main
    assert isfile(
AssertionError: ERROR! /home/operation/modalities_old/examples/getting_started/run_getting_started_example.sh does not exist.

PR #265 fixes this issue.

However, the getting_started_example is not working

=== RUN GETTING STARTED EXAMPLE ===
cd /home/operation/modalities_old/tutorials/getting_started; bash run_getting_started_example.sh 0 1
run getting_started_examples on CUDA_VISIBLE_DEVICES=0,1
/home/operation/miniconda3/envs/modalities2/lib/python3.10/site-packages/pydantic/_internal/fields.py:161: UserWarning: Field "model_checkpoint_path" has conflict with protected namespace "model".

[...]

[rank1]: Traceback (most recent call last):
[rank1]: File "/home/operation/miniconda3/envs/modalities2/bin/modalities", line 8, in
[rank1]: sys.exit(main())
[rank1]: File "/home/operation/miniconda3/envs/modalities2/lib/python3.10/site-packages/click/core.py", line 1157, in call
[rank1]: return self.main(*args, **kwargs)
[rank1]: File "/home/operation/miniconda3/envs/modalities2/lib/python3.10/site-packages/click/core.py", line 1078, in main
[rank1]: rv = self.invoke(ctx)
[rank1]: File "/home/operation/miniconda3/envs/modalities2/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
[rank1]: return _process_result(sub_ctx.command.invoke(sub_ctx))
[rank1]: File "/home/operation/miniconda3/envs/modalities2/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
[rank1]: return ctx.invoke(self.callback, **ctx.params)
[rank1]: File "/home/operation/miniconda3/envs/modalities2/lib/python3.10/site-packages/click/core.py", line 783, in invoke
[rank1]: return __callback(*args, **kwargs)
[rank1]: File "/home/operation/modalities_old/src/modalities/main.py", line 60, in entry_point_run_modalities
[rank1]: with CudaEnv(process_group_backend=ProcessGroupBackendType.nccl):
[rank1]: File "/home/operation/modalities_old/src/modalities/running_env/cuda_env.py", line 33, in enter
[rank1]: torch.cuda.set_device(self.local_rank)
[rank1]: File "/home/operation/miniconda3/envs/modalities2/lib/python3.10/site-packages/torch/cuda/init.py", line 420, in set_device
[rank1]: torch._C._cuda_setDevice(device)
[rank1]: RuntimeError: CUDA error: invalid device ordinal
[rank1]: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[rank1]: For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[rank1]: Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Traceback (most recent call last):
File "/home/operation/modalities_old/tests/tests.py", line 136, in
main(**args)
File "/home/operation/modalities_old/tests/tests.py", line 123, in main
check_existence_and_clear_getting_started_example_output(run_getting_started_example_directory, date_of_run)
File "/home/operation/modalities_old/tests/tests.py", line 43, in check_existence_and_clear_getting_started_example_output
assert checkpoint_to_delete is not None, f"ERROR! could not find a checkpoint with datetime > {date_of_run}"
AssertionError: ERROR! could not find a checkpoint with datetime > 2024-11-04__11-37-23

@flxst
Copy link
Member

flxst commented Nov 6, 2024

@thomaschhh I wasn't able to reproduce your error. All tests including the getting started example seem to run through using python tests/tests.py --multi-gpu. Could you try again with the latest commit (I merged main into type-annotations)? If the problem persists, feel free to open a new issue as the problem seems to be unrelated to this PR.

@thomaschhh
Copy link
Member Author

thomaschhh commented Nov 18, 2024

I am still getting the same error, however, I cannot test the multi-gpu scenario because I only have access to one and not multiple gpus. I guess that is where the error stems from.

@flxst
Copy link
Member

flxst commented Nov 18, 2024

Yes, that might be what causes the problem. We do not explicitly check that there are 2 GPUs available if you run python tests/tests.py --multi-gpu --devices 0,1, it is assumed.

@thomaschhh
Copy link
Member Author

What do you think, can we merge then? I mean if you have access to multiple GPUs and the tests all pass on your end then we should be good, right?

@flxst
Copy link
Member

flxst commented Nov 18, 2024

Yes, we should perhaps merge the latest main into this branch first, double-check that are no new "typing mistakes", and potentially get a quick second review from e.g. @le1nux?

@thomaschhh
Copy link
Member Author

I just merged main into this feature branch and also saw there that are no occurrences of Dict, List, etc.

Copy link
Member

@le1nux le1nux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)

@le1nux le1nux merged commit c44ec59 into main Nov 18, 2024
3 checks passed
@le1nux le1nux deleted the type-annotations branch November 18, 2024 13:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Annotations: Adhere to PEP 604 / Leverage Built-In 3.10 Functionality
3 participants