Skip to content

5. How we use github

Juliette Regimbal edited this page Oct 2, 2024 · 1 revision

All code, and most documentation, should be in one of our github repositories. Please remember that our main repos are public!

Important:

  • NEVER check in an API/account key or password
  • NEVER check in a full machine learning model or other large file
  • NEVER check in binaries

Our use is based on github flow. Essentially, this means you will:

  • clone the repository
  • make a new branch
  • implement your changes
  • make sure you've synced from main so your branch is up-to-date
  • test your changes
  • create a pull request (PR)
  • assign the PR to the appropriate reviewer (ask if you are not sure who that is!)
  • if the reviewer assigns it back to you, make modifications, and assign PR back to reviewer
  • reviewer merges your branch to main
  • you delete the branch so it doesn't clutter up the repo

FAQ:

  • Can I just put the reviewer in the "Reviewer" field for my PR? ANSWER: No. Reviewers won't look at items that are not currently assigned to them.
  • Where should I put my big items like ML model files, if not in the git repo? ANSWER: We have a directory on the IMAGE SharePoint that we copy to unicorn and pegasus.
  • Why SharePoint? Why not git annex, git LFS, etc.? ANSWER: We considered some of these, but they either had limitations, or we weren't sure how to set them up effectively for our use. If you're interested in taking this on, please speak up, since the cloud drive solution is not ideal for many reasons.
  • If I have documentation or instructions, can I just keep the information in the work item, then close it when I'm done? ANSWER: No, closed items are hard to find. Any ongoing documentation for a specific component should be in the README.md for the narrowest area possible. For example, research relating to machine learning models used in a preprocessor should be in the README.md for that specific preprocessor.
  • Should I fork any of our repos? ANSWER: No, you should clone.