-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Existing traj integration #45
base: master
Are you sure you want to change the base?
Conversation
adaptivemd/scheduler.py
Outdated
@@ -362,7 +362,7 @@ def replace_prefix(self, path): | |||
path = path.replace('sandbox://', '../..') | |||
|
|||
# the main remote shared FS | |||
path = path.replace('shared://', '../../..') | |||
path = path.replace('shared://', '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is indeed a very bad hack, if not more a bug. I actually doubt that this will work. What it does it that everywhere, where you expect from the working directory to link to NO_BACKUP
you will end up in the working directory instead. Possible that it still works because it is not used yet.
You can use worker://
instead because that is actually what worker does. Just don't alter the path and if you use an absolute path then this works.
worker:///this/is/an/absolute/path/traj.dcd`
note the 3! /
in the beginning, while relative paths in the working dir start with only 2 /
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I got the concept of shared://
wrong. Change undone and updated docs.
Just for the record: The File
prefixes such as shared://
are explained in the File
docs.
adaptivemd/analysis/pyemma/emma.py
Outdated
trajs = list(trajectories) | ||
trajectory_file_name = ty.filename | ||
|
||
|
||
t.call( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, I see what this is necessary for: You want to use loaded and generated trajectories (so multiple engines) but assume that the generated trajectories have different names.
Your way is one way to do it and not change the _remote.py
. I usually try to avoid hacks and use parameters in ways they are not supposed to (even if it works). That works now, but when someone changes the _remote.py
and does not know your hack, it might fail. E.g. a (resonable) check that the outtype is not empty.
I am also not sure if we need to make more restrictions on the output types in a project: Like now you assume, that if I use protein
in the analysis that the engine from all trajs have the same idea of protein
. I think that makes sense, but we could check, if stride
is the same and selection
, too. Filename can be different of course.
Btw. getting list of unique Engine
objects is easiest to get using set
engines = set(traj.engine for traj in trajectories)
What about just changing _remote
to accept directly the full path, like in your multi engine approach, just not to pass the additional traj_name parameter. I think I did this exactly to not create a second list, but why not.
Instead of using |
You are absolutely right. Paths are now directly passed with file names from |
This is the documentation on how to add existing trajectories. It includes a few fixes that were necessary:
shared://
are now taken as absolute paths and have to be entered as such. @jhprinz is this compatible with what you were thinking of when introducingshared://
?