
Compute selection: deviceIndex & enforce 1 thread in vacuum #752

Open · wants to merge 7 commits into base: main

Conversation

@IAlibay (Member) commented Mar 4, 2024

Fixes #739 #704

Checklist

  • Added a news entry

Developer's Certificate of Origin

@pep8speaks commented Mar 4, 2024

Hello @IAlibay! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 195:80: E501 line too long (93 > 79 characters)
Line 739:80: E501 line too long (81 > 79 characters)

Line 620:80: E501 line too long (81 > 79 characters)
Line 623:80: E501 line too long (80 > 79 characters)

Line 935:80: E501 line too long (81 > 79 characters)
Line 938:80: E501 line too long (80 > 79 characters)

Line 30:80: E501 line too long (136 > 79 characters)

Line 178:80: E501 line too long (132 > 79 characters)

Comment last updated at 2024-07-04 00:06:38 UTC

@IAlibay IAlibay linked an issue Mar 4, 2024 that may be closed by this pull request

codecov bot commented Mar 4, 2024

Codecov Report

Attention: Patch coverage is 83.33333% with 4 lines in your changes missing coverage. Please review.

Project coverage is 92.15%. Comparing base (ecc8a50) to head (48707e2).
Report is 120 commits behind head on main.

Files with missing lines Patch % Lines
openfe/protocols/openmm_utils/omm_compute.py 55.55% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #752      +/-   ##
==========================================
- Coverage   93.98%   92.15%   -1.84%     
==========================================
  Files         134      134              
  Lines        9790     9800      +10     
==========================================
- Hits         9201     9031     -170     
- Misses        589      769     +180     
Flag Coverage Δ
fast-tests 92.15% <83.33%> (?)
slow-tests ?

Flags with carried forward coverage won't be shown.


@IAlibay IAlibay requested a review from mikemhenry July 4, 2024 00:08
@IAlibay (Member, Author) commented Jul 4, 2024

@mikemhenry when you get a chance please do have a look at this - I suspect it'll make life a bit easier in some cases.

String with the platform name. If None, it will use the fastest
platform supporting mixed precision.
Default ``None``.
gpu_device_index : Optional[list[str]]
@IAlibay (Member, Author) commented:

Actually, we should probably have a chat about how we handle this long term. This is a bit like MPI settings: technically it shouldn't be an immutable protocol setting, but rather something we pick up at run time.

How can we go about handling this properly?

Contributor replied:

Something I don't think we abstracted well is "run-time arguments". We have the split for settings that change the thermodynamics, but we didn't consider a category of non-thermo settings that make the most sense to pick at run time. I haven't looked at the code yet and will update this comment, but I suspect what we should do is:

  1. have some default
  2. read this setting
  3. read an environment variable

If we do things in that order, nothing old breaks: you can make some choices when configuring your system, and when running on HPC you can still override the settings if needed.
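The precedence above could be sketched roughly as follows; the helper name and the `OPENFE_DEVICE_INDEX` environment variable are illustrative only, not part of the openfe API:

```python
import os


def resolve_device_index(setting=None, default=None,
                         env_var="OPENFE_DEVICE_INDEX"):
    """Pick GPU device indices: default < protocol setting < env var.

    Hypothetical helper sketching the proposed lookup order;
    openfe does not ship this function.
    """
    # step 1: some built-in default
    value = default
    # step 2: the stored protocol setting, if given
    if setting is not None:
        value = setting
    # step 3: a run-time override, e.g. set per HPC job
    env = os.environ.get(env_var)
    if env is not None:
        value = env.split(",")  # e.g. "0,1" -> ["0", "1"]
    return value
```

For example, with `OPENFE_DEVICE_INDEX=0,1` exported in a job script, `resolve_device_index(setting=["3"])` would yield `["0", "1"]`, overriding the stored setting without mutating it.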


**Changed:**

* `openfe.protocols.openmm_rfe._rfe_utils.compute` has been moved
Contributor commented:

Good example of a nice changelog entry, but since this was a private API, no semver major bump is needed.

**Changed:**

* `openfe.protocols.openmm_rfe._rfe_utils.compute` has been moved
to `openfe.protocols.openmm_utils.omm_compute`.
Contributor commented:

Do we want a private namespace or is this now in our public API?

Suggested change
to `openfe.protocols.openmm_utils.omm_compute`.
to `openfe.protocols.openmm_utils._omm_compute`.

@mikemhenry (Contributor) left a comment:

This one is good, have a few notes but nothing blocking.

platform = compute.get_openmm_platform(
settings['engine_settings'].compute_platform
# Restrict CPU count if running vacuum simulation
restrict_cpu = settings['forcefield_settings'].nonbonded_method.lower() == 'nocutoff'
Contributor commented:

Another argument that we really should have an explicit way of saying "this is a vacuum simulation". I propose we add a setting somewhere for that (#904).

In the meantime, this seems like a pretty good heuristic.

We could do more logging. It would be nice to do a hackathon on it, but in the meantime I will just suggest things as I see them. It would be good to log what is going on here; it could be more verbose than what I suggest, but this seems like a spot where, if someone asks "why is this running on the CPU and not the GPU?", a log message could help.

Suggested change
restrict_cpu = settings['forcefield_settings'].nonbonded_method.lower() == 'nocutoff'
restrict_cpu = settings['forcefield_settings'].nonbonded_method.lower() == 'nocutoff'
logging.info(f"{restrict_cpu=}")
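The heuristic plus the suggested logging can be sketched as a small standalone function; the function name is illustrative, and the real logic lives inside the PR's `omm_compute.get_openmm_platform`:

```python
import logging


def should_restrict_cpu(nonbonded_method: str) -> bool:
    # The PR's heuristic: a 'NoCutoff' nonbonded method implies a
    # vacuum simulation, which is best pinned to a single CPU thread.
    restrict_cpu = nonbonded_method.lower() == "nocutoff"
    # Suggested logging, so users can see why the CPU path was chosen
    logging.info(f"{restrict_cpu=}")
    return restrict_cpu
```

On the platform side, OpenMM's CPU platform exposes a `Threads` property, so enforcing one thread presumably amounts to something like `platform.setPropertyDefaultValue("Threads", "1")` when this returns True.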


Successfully merging this pull request may close these issues.

  • add cuda DeviceIndex in engine_settings
  • Make get_openmm_platform set threads to 1 if using NoCutoff
3 participants