Suggestions to ingest data automatically #69

Open
jaimergp opened this issue Apr 20, 2020 · 2 comments

@jaimergp

@apayne97, @henriberger and I have been talking about ways to incorporate information from the Thorne Lab in a more automated way. We have come up with this "ideal" pipeline:

Tier 1) Create a script that diffs their PDB IDs against our PDB IDs and reports the set difference, so a human can review which new ones are worth adding.

Tier 2) Create a GitHub Actions pipeline that does this automatically, either with an hourly cron job or, if technically possible, after every push to the Thorne Lab repo.

Tier 3) Add bot features to GHA to submit the PRs needed for each new candidate PDB ID. A human reviews each PR, edits the information as needed, and merges or rejects it. The closed PRs serve as a history of what we have tried, so we don't resubmit the same candidate.
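
A minimal sketch of the Tier 1 diff, assuming both ID lists can be read as plain text or as collections of strings. The function names and the regular expression are illustrative assumptions, not part of either repo; the pattern only covers classic four-character IDs and will also match unrelated tokens such as years, so the report still needs a human pass.

```python
import re

# Assumption: classic PDB IDs are one digit followed by three alphanumerics.
# This pattern also matches tokens like "2020", so results need human review.
PDB_ID_PATTERN = re.compile(r"\b[0-9][A-Za-z0-9]{3}\b")


def extract_pdb_ids(text):
    """Return the set of upper-cased PDB-like IDs found in free text."""
    return {match.upper() for match in PDB_ID_PATTERN.findall(text)}


def _normalize(ids):
    """Strip whitespace and upper-case IDs so comparisons ignore case."""
    return {pdb_id.strip().upper() for pdb_id in ids}


def new_candidates(upstream_ids, local_ids):
    """Return upstream IDs that are missing locally, sorted for stable reports."""
    return sorted(_normalize(upstream_ids) - _normalize(local_ids))


if __name__ == "__main__":
    # Made-up placeholder IDs, not real entries from either repository.
    upstream = extract_pdb_ids("1abc, 2DEF and 3ghi were added")
    local = ["1ABC"]
    for pdb_id in new_candidates(upstream, local):
        print(pdb_id)
```

Sorting the difference keeps the report stable between runs, which should make it easier to spot what actually changed.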

Let us know if you have feedback!

@Lnaden
Collaborator

Lnaden commented Apr 20, 2020

I like this idea. The first one would not be too hard to do. The second could be done relatively easily, but I would want to be careful about pinging everyone watching this repo every time it makes a PR. I have the same concern with the third, but I don't think I see the difference between 2 and 3. Could you elaborate?

@jaimergp
Author

Option (2) only notifies a selected pool of users, say by writing a comment on a specific issue.

Option (3) would create the appropriate PRs (one per PDB ID?), each with an automatically generated file template filled in with the new upstream information.
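
With one PR per PDB ID, the "don't resubmit" check from Tier 3 could be a simple lookup over existing PR titles. This is a rough sketch under two assumptions: the bot uses a fixed title convention (the hypothetical "Add PDB <ID>" below), and the titles of all PRs, open and closed, have already been fetched by some other step.

```python
import re

# Hypothetical title convention for bot PRs, e.g. "Add PDB 1ABC".
TITLE_PATTERN = re.compile(r"^Add PDB ([0-9][A-Za-z0-9]{3})$", re.IGNORECASE)


def pr_title(pdb_id):
    """Build the PR title the bot would use for a candidate ID."""
    return f"Add PDB {pdb_id.upper()}"


def already_submitted(pr_titles):
    """Collect the IDs named in PR titles that follow the convention."""
    submitted = set()
    for title in pr_titles:
        match = TITLE_PATTERN.match(title.strip())
        if match:
            submitted.add(match.group(1).upper())
    return submitted


def pending_candidates(candidates, pr_titles):
    """Return candidates that have no PR yet, whether open, merged, or rejected."""
    seen = already_submitted(pr_titles)
    return sorted({pdb_id.upper() for pdb_id in candidates} - seen)
```

For example, `pending_candidates(["1abc", "2DEF"], ["Add PDB 1ABC", "Fix typo"])` leaves only `2DEF` to submit, because a PR for `1ABC` already exists.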

About the notification noise... I guess we could keep a fork of this repo somewhere else where those branches are created, and then it's up to the human(s) to create the PR or not? I am not really sure I like that, though... I am inclined to say I don't.

I don't know if there are API ways to selectively notify only some people, but if you are subscribed to this repo, you'll get everything anyway.
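
For reference, option (2) could look roughly like the sketch below: format a report that @-mentions the reviewer pool and post it as a comment on one tracking issue through the GitHub REST API endpoint for creating issue comments. The owner, repo, issue number, reviewer names, and token are placeholders, and the request is only built here, not sent. As noted above, this does not silence anything for people watching the whole repo.

```python
import json
import urllib.request


def format_report(new_ids, reviewers):
    """Build a comment body that mentions reviewers and lists new IDs."""
    mentions = " ".join(f"@{name}" for name in reviewers)
    lines = [f"{mentions}: {len(new_ids)} new candidate PDB ID(s) upstream:"]
    lines += [f"- {pdb_id}" for pdb_id in new_ids]
    return "\n".join(lines)


def build_comment_request(owner, repo, issue_number, body, token):
    """Prepare (but do not send) a POST request that adds an issue comment."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}/comments"
    payload = json.dumps({"body": body}).encode("utf-8")
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


if __name__ == "__main__":
    # Placeholder values; nothing is sent over the network here.
    report = format_report(["1ABC", "2DEF"], ["reviewer-a", "reviewer-b"])
    request = build_comment_request("example-org", "example-repo", 1, report, "dummy-token")
    print(request.get_method(), request.full_url)
    print(report)
```

Passing the prepared request to `urllib.request.urlopen` would post the comment; inside a GitHub Actions job the token would normally come from the workflow's secrets.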
