Existing traj integration #45

thempel · 2017-04-07T17:29:23Z

This is the documentation on how to add existing trajectories. It includes a few fixes that were necessary:

For multiple engines in a pyemma analysis, file names have to be adjusted accordingly. This fix is a bit hacky.
Paths with the prefix shared:// are now treated as absolute paths and have to be entered as such. @jhprinz, is this compatible with what you had in mind when introducing shared://?

jhprinz · 2017-04-07T18:07:42Z

adaptivemd/scheduler.py

@@ -362,7 +362,7 @@ def replace_prefix(self, path):
        path = path.replace('sandbox://', '../..')

        # the main remote shared FS
-        path = path.replace('shared://', '../../..')
+        path = path.replace('shared://', '')


This is indeed a very bad hack, if not a bug. I actually doubt that this will work. What it does is that everywhere you expect to link from the working directory to NO_BACKUP, you will end up in the working directory instead. It is possible that it still works because it is not used yet.

You can use worker:// instead, because that is actually what worker:// does: it does not alter the path, so an absolute path works as is.

worker:///this/is/an/absolute/path/traj.dcd

Note the three slashes at the beginning; relative paths in the working dir start with only two.
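To make the slash counting concrete, here is a minimal sketch. It assumes that resolving worker:// amounts to stripping the prefix and leaving the rest of the path untouched, as described above. `resolve_worker_path` is a hypothetical helper for illustration, not a function in adaptivemd.

```python
import posixpath


def resolve_worker_path(path):
    """Strip a leading worker:// and return the rest unchanged (assumed behaviour)."""
    prefix = 'worker://'
    if path.startswith(prefix):
        return path[len(prefix):]
    return path


# Three slashes: the third one is the root of an absolute path.
absolute = resolve_worker_path('worker:///this/is/an/absolute/path/traj.dcd')

# Two slashes: the remainder is relative to the working directory.
relative = resolve_worker_path('worker://traj.dcd')

print(absolute, posixpath.isabs(absolute))
print(relative, posixpath.isabs(relative))
```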

Sorry, I got the concept of shared:// wrong. I have undone the change and updated the docs.

Just for the record: the File prefixes such as shared:// are explained in the File docs.

jhprinz · 2017-04-07T19:43:47Z

adaptivemd/analysis/pyemma/emma.py

+            trajs = list(trajectories)
+            trajectory_file_name = ty.filename
+
+
        t.call(


Hmm, I see what this is necessary for: you want to use loaded and generated trajectories (so multiple engines), but assume that the generated trajectories have different names.

Your way is one way to do it without changing _remote.py. I usually try to avoid hacks that use parameters in ways they are not supposed to be used (even if it works). It works now, but when someone changes _remote.py without knowing about your hack, it might fail, e.g. because of a (reasonable) check that the outtype is not empty.

I am also not sure whether we need more restrictions on the output types in a project. Right now you assume that if I use protein in the analysis, the engines of all trajs have the same idea of protein. I think that makes sense, but we could also check that stride and selection are the same. The filename can be different, of course.

Btw., the easiest way to get the unique Engine objects is a set:

engines = set(traj.engine for traj in trajectories)

What about just changing _remote to accept the full path directly, as in your multi-engine approach, only without passing the additional traj_name parameter? I think I did it this way precisely to avoid creating a second list, but why not.
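The consistency check suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: `OutputType`, `Engine`, `Trajectory` and `check_output_type` are hypothetical stand-ins defined here, not the adaptivemd classes, and the attribute names (`types`, `stride`, `selection`, `filename`) are assumed for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class OutputType:
    # Hypothetical stand-in for an engine's output type description.
    filename: str
    stride: int = 1
    selection: Optional[str] = None


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so engines fit in a set
class Engine:
    name: str
    types: Dict[str, OutputType] = field(default_factory=dict)


@dataclass
class Trajectory:
    location: str
    engine: Engine


def check_output_type(trajectories, outtype):
    """Return the unique engines; raise if they disagree on stride or selection."""
    engines = set(traj.engine for traj in trajectories)
    settings = set()
    for engine in engines:
        if outtype not in engine.types:
            raise ValueError('engine %r has no output type %r' % (engine.name, outtype))
        ty = engine.types[outtype]
        # The filename is deliberately left out: it may differ between engines.
        settings.add((ty.stride, ty.selection))
    if len(settings) > 1:
        raise ValueError('stride/selection of %r differ between engines' % outtype)
    return engines
```

With two engines that share stride and selection but use different file names, the check passes and returns both engines; an engine with a different stride makes it raise `ValueError`.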

jhprinz · 2017-04-07T19:45:31Z

Instead of using worker:// you could also add a new prefix like root:// which does exactly what you wanted. Just note that, in theory, all you can ever use is the shared folder and upwards. All other folders are not accessible from all nodes (at least that is the assumption for shared).

thempel · 2017-04-07T21:03:36Z

You are absolutely right. Paths are now directly passed with file names from PyEMMAAnalysis to _remote. So far, it seems to work fine.
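For readers following along, passing the paths directly might look roughly like this. It is a sketch under assumptions, not the code in this PR: `Engine`, `Trajectory` and `full_paths` are hypothetical names, and `types` is assumed to map an output type name to that engine's file name.

```python
import posixpath
from collections import namedtuple

# Hypothetical minimal stand-ins; `types` maps output type name -> file name.
Engine = namedtuple('Engine', ['types'])
Trajectory = namedtuple('Trajectory', ['location', 'engine'])


def full_paths(trajectories, outtype):
    """Build one full file path per trajectory, using each engine's own file name."""
    return [
        posixpath.join(traj.location, traj.engine.types[outtype])
        for traj in trajectories
    ]


generated = Engine(types={'protein': 'protein.dcd'})
loaded = Engine(types={'protein': 'existing.dcd'})
trajs = [Trajectory('trajs/00000000', generated), Trajectory('trajs/00000001', loaded)]
paths = full_paths(trajs, 'protein')
```

Because every entry already carries its file name, the receiving function no longer needs a separate trajectory name parameter.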

thempel added 3 commits April 7, 2017 12:06

[docs] added misc section on existing traj import

2a2bcf8

[analysis.pyemma] added multiple engine support

b175d1d

[scheduler] changed shared:// paths to absolute

d94d9d2

jhprinz reviewed Apr 7, 2017

[scheduler] undo shared:// modification & update docs

71a1b0c

jhprinz reviewed Apr 7, 2017

thempel added 4 commits April 7, 2017 14:57

[analysis.pyemma] multi engine support, pass paths directly

bfae29a

[analysis.pyemma] multi engines: check strides & selections

1cb9f86

[analysis.pyemma] minor bugfix

ed6365e

[analysis.pyemma] fix file paths

f72f3d6
