When running distributed completion service jobs against a data center with Minio S3 emulation in it, I must specify my file endpoints with s3://<host>:<port>/<bucket>/<path>, otherwise nothing works.
When running a similar job locally, I instead need to specify my file endpoints with http://<host>/<bucket>/<path>.
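For concreteness, the file dictionaries I'm passing look roughly like this (hostnames, ports, and bucket/key names are placeholders):

```python
# Placeholder hosts, buckets, and keys, for illustration only.

# Distributed run against the data center's Minio S3 emulation:
# only s3:// refs work here.
files_distributed = {
    "train_data": "s3://minio.dc.example:9000/mybucket/data/train.npz",
}

# Local run with the studioml python library: only http:// refs work.
files_local = {
    "train_data": "http://minio.dc.example/mybucket/data/train.npz",
}
```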
There are really two problems here, both having to do with consistency of file specification between local and distributed worker execution in the studio python lib:
When doing local runs (everything entirely within the studioml python library), I cannot specify s3://-style file paths at all, and thus I am not able to use the same Minio S3 emulation for my local runs. What you get is an exception ending with:
File "/home/danfink/venv/enn-3.6/lib/python3.6/site-packages/studio/util.py", line 333, in _get_active_s3_client
raise NotImplementedError("Artifact store is not set up or has the wrong type")
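For reference, a minimal sketch of the kind of local submission that triggers this; the CompletionService API shown is approximate from memory, and the experiment name and refs are placeholders:

```python
from studio import completion_service

# Approximate API, placeholder experiment name and file refs.
with completion_service.CompletionService("exp_local_s3") as cs:
    # An s3:// ref like this is what blows up in the local worker path:
    cs.submitTaskWithFiles(
        "my_client_code.py",
        {"train_data": "s3://minio.dc.example:9000/mybucket/data/train.npz"},
    )
```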
When doing distributed runs in a data center (using Minio and GoRunner), I cannot specify http://-style file paths at all to point at real S3, since the studio database/storage requires the AWS creds of Minio, even when the http://-style file ref points to something on S3 that requires no credentials to access (e.g. I can wget it just fine, but somehow GoRunner doesn't like it in the context of fetching a file for a job running in a data center).
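To be concrete about "requires no credentials": the same object is fetchable with a plain unauthenticated GET, exactly like wget (the URL below is a placeholder for an open-access S3 object):

```python
import urllib.request

# Placeholder open-access S3 object; no AWS credentials involved,
# equivalent to: wget http://my-open-bucket.s3.amazonaws.com/data/train.npz
url = "http://my-open-bucket.s3.amazonaws.com/data/train.npz"
with urllib.request.urlopen(url) as resp, open("train.npz", "wb") as out:
    out.write(resp.read())
```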
Ideally, what I am really looking for is consistency between the two types of runs. I would love to be able to set my files either s3://-style to access the data center or http://-style to access real S3, set it and forget it, but still be able to switch back and forth between local and distributed workers.
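One way to picture the consistent behavior I'm after is a single fetch path that dispatches on the URL scheme. This is purely my sketch of the idea, not studio's or GoRunner's actual code:

```python
import urllib.request
from urllib.parse import urlparse

import boto3  # assumed available wherever s3:// refs must be resolved


def fetch_artifact(ref, dest, aws_creds=None):
    """Sketch: s3:// goes through the AWS API (against Minio when the ref
    carries host:port), http(s):// is a plain credential-free GET."""
    parsed = urlparse(ref)
    if parsed.scheme == "s3":
        if ":" in parsed.netloc:
            # Minio-style s3://host:port/bucket/key
            endpoint = "http://" + parsed.netloc
            bucket, _, key = parsed.path.lstrip("/").partition("/")
        else:
            # Canonical s3://bucket/key on real S3
            endpoint = None
            bucket, key = parsed.netloc, parsed.path.lstrip("/")
        s3 = boto3.client("s3", endpoint_url=endpoint, **(aws_creds or {}))
        s3.download_file(bucket, key, dest)
    elif parsed.scheme in ("http", "https"):
        # Never needs credentials; behaves like wget.
        with urllib.request.urlopen(ref) as resp, open(dest, "wb") as out:
            out.write(resp.read())
    else:
        raise ValueError("unsupported scheme in file ref: " + ref)
```

If both the local python worker and GoRunner resolved refs through something equivalent to this, the same file dictionary would work in either mode.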
Pure conjecture as to some potential hurdles to this:
the studioml python libs that handle the local workers probably just do not know how to deal with s3://-style file references at all, which is what would allow credentialed access to S3/Minio
GoRunner for distributed execution might need to make a distinction between s3:// and http://, where s3:// is maybe always something that goes through an AWS API and http:// is maybe something that never really requires credentials?
It's possible that the current conventions for GoRunner data centers do not allow firewall holes for regular S3 access, even for plain-old http/wget access to open-access buckets
There is nothing within the completion service file dictionary spec that allows for multiple credentials (separate issue filed for that); a hypothetical illustration of what that might look like follows below
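On that last point, here is the kind of per-file credential selection the current file dictionary spec has no room for. This is entirely made up as illustration; nothing like it is supported today:

```python
# Hypothetical extension, NOT currently supported by the spec:
files = {
    "train_data": {
        "url": "s3://minio.dc.example:9000/mybucket/data/train.npz",
        "credentials": "minio_datacenter",  # named credential set
    },
    "pretrained": {
        "url": "http://my-open-bucket.s3.amazonaws.com/models/base.h5",
        "credentials": None,  # anonymous / wget-style access
    },
}
```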
Feel free to split this up into multiple issues if that is the best way to attack the problems listed here, but again, what I am really looking for is file-specification consistency between the two main modes of running completion service jobs.
It's also possible that #381 addresses part of this, but in my recent testing with studioml==0.0.15, it's not all there yet.