Add option to list glacierefied folders #30

Open
chuwy opened this issue Jun 8, 2017 · 2 comments


@chuwy
Contributor

chuwy commented Jun 8, 2017

In the list_runids function we explicitly skip run IDs that are archived on AWS Glacier, which means they will never appear in the run manifest. I believe this is the wrong solution, as a customer can restore a particular folder without intending to reprocess it (say, with a PySpark job).

I propose to:

  • Add an option, list_runids(include_archived=False), so that callers can opt in to listing folders archived to AWS Glacier
  • Make list_runids() return objects of a RunId class (rather than plain strings) that can hold information on whether the folder is archived
  • Add a third possible state to the run manifest states from "Add state to run manifests" #29, to mark that processing of a RunId was explicitly cancelled. Something like CancelledAt.

This should let the run manifests feature take full control over data processing.
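
A rough sketch of how the first two points could fit together. It is not the SDK's current code. The `archived` field, the `ARCHIVED_CLASSES` set, the `run=` folder prefix, and the rule that a folder counts as archived if any object in it is in Glacier are all assumptions made for illustration. The input is a list of dicts shaped like the `Contents` entries S3 returns when listing objects (`Key`, `StorageClass`), so the sketch runs without AWS access.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Assumption: only GLACIER counts as "archived" here. Other storage
# classes are still an open question (see the TODO in this issue).
ARCHIVED_CLASSES = {"GLACIER"}


@dataclass(frozen=True)
class RunId:
    """Hypothetical replacement for the plain-string run id."""
    name: str        # run folder name, e.g. "run=2017-06-08-12-00-00"
    archived: bool   # True if any object in the folder is archived


def list_runids(objects: Iterable[Dict[str, str]],
                include_archived: bool = False) -> List[RunId]:
    """Group S3 object entries by run folder and report archive status.

    `objects` are dicts with 'Key' and (optionally) 'StorageClass'.
    """
    archived_by_run: Dict[str, bool] = {}
    for obj in objects:
        # Assumption: run folders are path segments starting with "run=".
        run = next((part for part in obj["Key"].split("/")
                    if part.startswith("run=")), None)
        if run is None:
            continue
        is_archived = obj.get("StorageClass", "STANDARD") in ARCHIVED_CLASSES
        archived_by_run[run] = archived_by_run.get(run, False) or is_archived

    runs = [RunId(name, archived)
            for name, archived in sorted(archived_by_run.items())]
    if include_archived:
        return runs
    # Default keeps today's behaviour: archived folders are skipped.
    return [run for run in runs if not run.archived]
```

With the default, callers see the same run ids as today. With include_archived=True they also get the archived folders, and the `archived` flag lets the run manifest decide whether to mark them as processed, skipped, or cancelled.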

TODO: Think about what to do with other storage classes. Currently we list only folders whose storage class is STANDARD.

@alexanderdean
Member

I think this makes sense. I would probably call the state IgnoredAt, rather than CancelledAt?

@chuwy
Contributor Author

chuwy commented Jun 8, 2017

Agreed on IgnoredAt.
