Timeout from globus for large submissions causes submissions to not publish #192

Open
jdhayhurst opened this issue Nov 29, 2021 · 2 comments

@jdhayhurst
Collaborator

In the "publish sumstats" method, Globus is called to do the file transfer. However, a timeout from Globus is likely when the number of files in the Globus directory is in the thousands. This prevents us from getting the file names, and consequently the files are not transferred to the staging directory.
Globus error: 502, 'ExternalError.DirListingFailed.Timeout'
Stack trace from Python:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/celery/app/trace.py", line 648, in __protected_call__
    return self.run(*args, **kwargs)
  File "/sumstats_service/sumstats_service/app.py", line 177, in publish_sumstats
    au.publish_sumstats(resp)
  File "/sumstats_service/sumstats_service/resources/api_utils.py", line 278, in publish_sumstats
    study.move_file_to_staging()
  File "/sumstats_service/sumstats_service/resources/study_service.py", line 225, in move_file_to_staging
    return ssf.move_file_to_staging()
  File "/sumstats_service/sumstats_service/resources/file_handler.py", line 283, in move_file_to_staging
    dest_file = os.path.join(dest_dir, self.staging_file_name + ext)
TypeError: must be str, not NoneType
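
For what it's worth, "must be str, not NoneType" is the message Python 3.6 gives when a str is concatenated with None, so it looks like one of the name parts (probably `ext`) was never set after the listing failed. A minimal sketch of one way to make this fail more clearly, assuming the listing can be wrapped in a callable: retry the listing a few times with backoff, and raise an explicit error instead of letting None reach `os.path.join`. `DirListingTimeout`, `list_with_retry` and `build_staging_path` are hypothetical names for illustration, not existing code in sumstats_service or the Globus SDK, and retrying may not help if the listing is simply too large.

```python
import os
import time


class DirListingTimeout(Exception):
    """Hypothetical stand-in for the error raised when a directory listing times out."""


def list_with_retry(list_dir, path, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call list_dir(path), retrying with exponential backoff on a timeout.

    list_dir is any callable that returns the file names under path
    (an assumption; the real service would wrap its Globus call here).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return list_dir(path)
        except DirListingTimeout:
            if attempt == attempts - 1:
                raise  # out of retries: surface the timeout to the caller
            sleep(base_delay * 2 ** attempt)


def build_staging_path(dest_dir, staging_file_name, ext):
    """Build the staging path, failing loudly if a name part is missing."""
    if staging_file_name is None or ext is None:
        raise ValueError(
            "cannot build staging path: file name or extension is unknown "
            "(the directory listing may have failed)"
        )
    return os.path.join(dest_dir, staging_file_name + ext)
```

With something like this, a listing timeout would show up as a timeout (or a clear ValueError) in the task log rather than as a TypeError several calls later.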
@ljwh2

ljwh2 commented Jan 11, 2022

Is this still an issue?

@jdhayhurst
Collaborator Author

Globus timeouts could still be an issue. One approach would be to allow nested structures (i.e. subfolders) and require that users put no more than X files in a single folder. This is generally good practice anyway.
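
A rough sketch of what the per-folder limit check could look like, assuming the upload directory is reachable as a local path (an assumption; the same idea would apply to listing one subfolder at a time through Globus). `find_overfull_dirs` and `max_files` are hypothetical names, and the actual value of X is still to be decided.

```python
import os


def find_overfull_dirs(root, max_files):
    """Return {folder_path: file_count} for each folder under root
    (including root itself) that holds more than max_files files.

    Only files directly inside a folder count towards its total;
    files in its subfolders are counted against those subfolders.
    """
    overfull = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if len(filenames) > max_files:
            overfull[dirpath] = len(filenames)
    return overfull
```

An empty result would mean the submission respects the limit; otherwise the returned folders could be reported back to the submitter before any transfer is attempted.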
