Until now, Autosubmit has been developed to work in distributed systems that use a single distributed file system to persist the experiment data. However, we have faced issues when deploying it across multiple environments that do not share a common file system, for example when deploying multiple Docker containers in Kubernetes (k8s).
One issue that arises in that context is how to retrieve information from, or trigger actions on, an experiment that lives on another file system. To do so, Autosubmit would need to constantly synchronize the experiments between the connected nodes so that experiment ids stay unique and the experiment data can be located. In addition, any solution to these issues will require strong security protocols, because these are sensitive procedures.
As a simpler alternative that avoids this overhead, we can integrate multiple nodes, each with its own environment (file system, Autosubmit version, API, GUI), behind a higher-level API (I'll call it "API Beacon") that maps all the nodes and establishes a secure connection with them to execute common Autosubmit tasks. As an idea, this connection can be set up by providing the API Beacon with an API key generated by each node's Autosubmit API.
Notice that in this draft design, for each node/container, there is one instance of Autosubmit, API, and GUI.
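To make the "maps all the nodes" idea concrete, here is a minimal sketch of how the API Beacon could keep track of nodes and their API keys. Everything in it (`Node`, `NodeRegistry`, the in-memory storage) is hypothetical and only illustrates the mapping; it is not an existing Autosubmit interface, and a real deployment would need proper secret storage.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """One Autosubmit deployment known to the beacon (hypothetical shape)."""
    uri: str                          # base URI of the node's Autosubmit API
    api_key: str = field(repr=False)  # key issued by that node's API; hidden from repr


class NodeRegistry:
    """In-memory map from node URI to its connection details (illustration only)."""

    def __init__(self) -> None:
        self._nodes: dict[str, Node] = {}

    def register(self, uri: str, api_key: str) -> Node:
        # Normalize the URI so "https://host/" and "https://host" map to the same node.
        node = Node(uri=uri.rstrip("/"), api_key=api_key)
        self._nodes[node.uri] = node
        return node

    def lookup(self, uri: str) -> Node:
        try:
            return self._nodes[uri.rstrip("/")]
        except KeyError:
            raise LookupError(f"unknown Autosubmit node: {uri}") from None
```

Because experiment ids are only resolved inside the node that owns them, the beacon does not need to synchronize experiments between nodes; it only needs this URI-to-key mapping.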
For example, the sequential steps of the "run experiment" use case using this strategy would be:
1. The user asks the API Beacon to run an experiment, giving the URI of the Autosubmit instance and the `expid`.
2. The API Beacon securely forwards this request to the Autosubmit API, using the given URI and the previously set API key.
3. The Autosubmit API receives the request and executes `autosubmit run <expid>`.
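The forwarding step can be sketched as follows. This is only an illustration under stated assumptions: the route `/experiments/<expid>/run`, the JSON body, and the `Authorization: Bearer` scheme are all hypothetical, since the proposal does not define the API contract yet. The function only builds the request and does not send it.

```python
import json
import urllib.parse
import urllib.request


def build_run_request(node_uri: str, api_key: str, expid: str) -> urllib.request.Request:
    """Build the request the beacon would forward to a node's Autosubmit API."""
    # Hypothetical route; the real Autosubmit API path may differ.
    safe_expid = urllib.parse.quote(expid, safe="")
    url = f"{node_uri.rstrip('/')}/experiments/{safe_expid}/run"
    return urllib.request.Request(
        url,
        data=json.dumps({"expid": expid}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The beacon would then send the request over HTTPS, for instance with `urllib.request.urlopen(req)`, and the receiving Autosubmit API would be the one that actually runs `autosubmit run <expid>` on its own node.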
Notice that "stop experiment" can be handled in the same way (once the future `autosubmit stop` command exists), because the `autosubmit run <expid>` process will be running on the same node.
@mcastril @kinow This aims to be implemented for EDITO to handle the multiple Autosubmit containers from the users, assuming SURF will need a unique interface to interact with Autosubmit experiments.