Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Branch-Protection check should include Repository Rules #3326

Closed
thepwagner opened this issue Jul 27, 2023 · 12 comments · Fixed by #3354
Closed

Feature: Branch-Protection check should include Repository Rules #3326

thepwagner opened this issue Jul 27, 2023 · 12 comments · Fixed by #3354
Labels

Comments

@thepwagner
Copy link
Contributor

Is your feature request related to a problem? Please describe.

GitHub has recently released Repository Rules as an alternative to Branch Protection rules.
These rules can be queried without administrator access, so Scorecard can determine if a third party repository is branch-protected, and ossf/scorecard-action can produce better results with default tokens.

Describe the solution you'd like

The Branch Protection checks should be modified to fetch and score protections provided through Repository Rules in addition to Branch Protection rules. Support was added to go-github in google/go-github#2789 .

Describe alternatives you've considered

  • Wait: the feature is very new, and so unlikely to be well adopted.
  • Continue using only branch protection rules: this will produce false negatives for repositories using Repository Rules.

Additional context

@thepwagner thepwagner added the kind/enhancement New feature or request label Jul 27, 2023
@thepwagner thepwagner changed the title Feature Branch-Protection consider Repository Rules Feature: Branch-Protection check should include Repository Rules Jul 27, 2023
@spencerschrock
Copy link
Member

This sounds like a great enhancement to me, especially given the token discussions we've had in ossf/scorecard-webapp#444. We can augment the existing check to reward branch protection or repository rules.

Are you interested in contributing? If it's just a feature request, I can see this being beneficial enough for a maintainer to pick up as well.

@spencerschrock
Copy link
Member

Initial thoughts from the documentation:

  • Organizations can have rules, which may be something we need to check for.

Quickly set up rulesets at the organization level to target multiple repositories in your organization"

  • We need to watch for active rules

"Rulesets have statuses, so you can easily manage which rulesets are active"

  • Rules can support layering. And work with branch protection. So we'll need to read into layering, but it sounds like it will be the most restrictive setting, which is probably easier for us.

If the same rule is defined in different ways across the aggregated rulesets, the most restrictive version of the rule applies

  • There's more bypass settings

"users with a certain role, such as repository administrator, or it can be specific teams or GitHub Apps"

The applicable rules themselves seem pretty 1:1 with branch protection settings, so hopefully this is a straightforward graphQL query change. But there is at least one new rules:

@thepwagner
Copy link
Contributor Author

Organizations can have rules, which may be something we need to check for.

From poking the API, I think Organization-level rules are included in the results when querying a repository. In the GraphQL API the field is RepositoryRule.source.

Also worth a shout: this feature is pay-walled: Organization Rulesets are only available with GitHub Enterprise. Upgrade your account to activate this ruleset.

hopefully this is a straightforward graphQL query change

I also had this hope and started down this path, something like:

{
  repository(owner: "laurentsimon", name: "test3") {
    ...
    rulesets(first: 100) {
      edges {
        node {
          name
          enforcement
          pullRequestRules: rules(first: 100, type: PULL_REQUEST) {
            edges {
              node {
                parameters {
                  ... on PullRequestParameters {
                    dismissStaleReviewsOnPush
                    requireCodeOwnerReview
                    requireLastPushApproval
                    requiredApprovingReviewCount
                    requiredReviewThreadResolution
                  }
                }
              }
            }
          }
        }
      }
    }
}

My code has a condition {} element as a sibling to enforcement, but I didn't backport it to the sample query. I didn't see a way to have GitHub perform branch matching with the V4/GraphQL API, like is possible from the V3 API.

I saw a couple of paths:

I have some non-functional code attempting the latter. I spent the most time stuck wondering if the rulesets should be fetched as part of branchData AND defaultBranchData, or as a new separate+cached query (especially because I intended to fetch every rule every time).

Are you interested in contributing? If it's just a feature request, I can see this being beneficial enough for a maintainer to pick up as well.

Yep! Per the above I'd appreciate some feedback on the approach, but I'm happy to turn this into a PR if it's as simple as we currently think it may be 🤞😆 . I'm also not offended if anybody else wants to!
For context: my previous flurry of activity was part of a sponsored "hack days" event that has concluded, progress will be a little slower.

@spencerschrock
Copy link
Member

Organizations can have rules, which may be something we need to check for.

From poking the API, I think Organization-level rules are included in the results when querying a repository. In the GraphQL API the field is RepositoryRule.source.

Yes, I beleive that's true, from looking at the graphql object docs:

Include rulesets configured at higher levels that apply to this repository.

The default value is true.

hopefully this is a straightforward graphQL query change

I also had this hope and started down this path, something like:
[removed for length]

I think something like that works (I hadn't seen how to do parameters before in graphQL like that, neat). We may care about some of the other parameters too, such as bypassActors.

My code has a condition {} element as a sibling to enforcement, but I didn't backport it to the sample query. I didn't see a way to have GitHub perform branch matching with the V4/GraphQL API, like is possible from the V3 API.

I saw a couple of paths:

  • Call the V3 client for the O(n) branches returned from the GraphQL queries (hence, 🌱 Bump github.com/google/go-github from v38.1.0 to v53.2.0 #3327)
  • Fetch every repository rule in the GraphQL queries, and perform branch matching locally
    I have some non-functional code attempting the latter. I spent the most time stuck wondering if the rulesets should be fetched as part of branchData AND defaultBranchData, or as a new separate+cached query (especially because I intended to fetch every rule every time).

A couple thoughts:

  • The rest API certainly makes it easier. I'd like to avoid it if possible to keep API usage down, but if it's necessary we may have to.
  • How are you doing the matching? My GraphQL queries dont seem to be returning the patterns used to target branches.
  • re which query: i think the answer is not branchData. One problem we've observed in the past is when some data in the query is restricted (e.g. "Resource not accessible by integration"), then the whole query returns unsuccessful. So if we group it with branchData, we don't benefit from the fact that we don't need to be able to read branch protections.

Are you interested in contributing? If it's just a feature request, I can see this being beneficial enough for a maintainer to pick up as well.

Yep! Per the above I'd appreciate some feedback on the approach, but I'm happy to turn this into a PR if it's as simple as we currently think it may be 🤞😆 . I'm also not offended if anybody else wants to! For context: my previous flurry of activity was part of a sponsored "hack days" event that has concluded, progress will be a little slower.

Understood.

@spencerschrock
Copy link
Member

spencerschrock commented Aug 1, 2023

  • How are you doing the matching? My GraphQL queries dont seem to be returning the patterns used to target branches.

Ah I see, this is the conditions bit you were talking about:

conditions{
            refName{
              include
              exclude
            }
}

It would be nice if people just used the conventions of All and Default, which seem to return values of:

                  "include": [
                    "~ALL",
                    "~DEFAULT_BRANCH"
                  ]

But we may need to do some parsing for the releases/**/*, or manually repos that manually specify the branch name instead of "~DEFAULT_BRANCH"

@thepwagner
Copy link
Contributor Author

The rest API certainly makes it easier. I'd like to avoid it if possible to keep API usage down, but if it's necessary we may have to.

Agreed! I tried in main...thepwagner:scorecard:rulesets-via-v3-api and got to the point where I was using the rule sets to modify the *clients.BranchRef before returning. I can open as draft if you'd like to discuss details, but per the "prefer GraphQL" feedback maybe that branch should be abandoned (or more likely: scavenged for parts).

How are you doing the matching? Ah I see, this is the conditions bit you were talking about:

Sorry for the initial ambiguity, in main...thepwagner:scorecard:rulesets-via-v4-api I've synchronized the mock query in the comment with my GraphQL query to fetch the rules. That might be worth opening as a draft and refining?

The approach is to fetch every ruleSet of the repository during setup(), then use rulesMatchingBranch() to filter down to applicable rules when asked about a specific branch.

I got to the point of implementing rulesMatchingBranch() and handling things like "~DEFAULT_BRANCH", but the best reference I could find was https://docs.github.com/en/enterprise-cloud@latest/repositories/configuring-branches-and-merges-in-your-repository/managing-rulesets/managing-rulesets-for-a-repository#using-fnmatch-syntax - it explains the globbing but not magic values like ~DEFAULT_BRANCH. I didn't like trying to reverse engineer a blackbox (e.g. I just learned "~ALL" exists 😆 ), so I pivoted to the v3 approach. From the triple-nested .conditions.repository_name_and_ref_name .ref_name.include description on https://docs.github.com/en/rest/orgs/rules?apiVersion=2022-11-28#create-an-organization-repository-ruleset , maybe "~ALL" and "~DEFAULT_BRANCH" are the only magic values to handle.

@spencerschrock
Copy link
Member

The approach is to fetch every ruleSet of the repository during setup(), then use rulesMatchingBranch() to filter down to applicable rules when asked about a specific branch.

I got to the point of implementing rulesMatchingBranch() and handling things like "~DEFAULT_BRANCH", but the best reference I could find was https://docs.github.com/en/enterprise-cloud@latest/repositories/configuring-branches-and-merges-in-your-repository/managing-rulesets/managing-rulesets-for-a-repository#using-fnmatch-syntax - it explains the globbing but not magic values like ~DEFAULT_BRANCH. I didn't like trying to reverse engineer a blackbox (e.g. I just learned "~ALL" exists 😆 ), so I pivoted to the v3 approach. From the triple-nested .conditions.repository_name_and_ref_name .ref_name.include description on https://docs.github.com/en/rest/orgs/rules?apiVersion=2022-11-28#create-an-organization-repository-ruleset , maybe "~ALL" and "~DEFAULT_BRANCH" are the only magic values to handle.

Specifically, "~ALL" and "~DEFAULT_BRANCH", seem to only be applicable to include not to exclude

The approach is to fetch every ruleSet of the repository during setup(), then use rulesMatchingBranch() to filter down to applicable rules when asked about a specific branch.

I think this approach sounds good to me, there are some specific critiques I have that are better suited for a draft PR review. I think the approach of v3 vs v4 is the pain of doing fnmatch ourselves. I see a lot of "this is equivalent to X in regexp", so I wonder if we can do a quick and dirty conversion to come up with a regexp equivalent.

@spencerschrock
Copy link
Member

spencerschrock commented Aug 4, 2023

I see a lot of "this is equivalent to X in regexp", so I wonder if we can do a quick and dirty conversion to come up with a regexp equivalent.

I took a stab at this and threw it in a branch.

main...spencerschrock:scorecard:exp/fnmatch, or the commit permalink, 086a66e

There is one corner case that has to do with FNM_PATHNAME, but I think what I have is good enough for expected usecases:

If this flag is set, match a slash in string only with a
slash in pattern and not by an asterisk (*) or a question
mark (?) metacharacter, nor by a bracket expression ([])
containing a slash.
So in Ruby, this would be false, but my attempt would return true.

File.fnmatch('[/]', '/', File::FNM_PATHNAME) #=> false

@thepwagner
Copy link
Contributor Author

I took a stab at this and threw it in a branch.

Thanks, I cherry-picked that into my v4-client branch and hijacked the lessons learned from the v3-client branch (e.g. recovering from the GraphQL permissions error querying for branch protection rules).

I've opened a draft PR here #3354 , and noted my biggest concern in the description: things get weird when repository rules are layered on top of branch protection rules, or on top of other repository rules.

@ianlewis
Copy link

ianlewis commented Sep 3, 2023

FWIW Go's stdlib filepath package has a basic fnmatch implementation in filepath.Match so you may not need to roll your own.

@spencerschrock

@spencerschrock
Copy link
Member

FWIW Go's stdlib filepath package has a basic fnmatch implementation in filepath.Match so you may not need to roll your own.

@spencerschrock

Sadly this doesn't match a common pattern, one of GitHub's suggested ones (playground):

pattern := "releases/**/*"
branch := "releases/v2"
match, err := filepath.Match(pattern, branch)
fmt.Println(match, err)
false <nil>

@ianlewis
Copy link

ianlewis commented Sep 6, 2023

Right. hmm. Fair enough. Seems like that it's not actually an fnmatch compatible pattern match but a different glob-like pattern since many(most?) fnmatch implementations don't support **

Seems like some folks have run into this before as well.
https://github.com/bmatcuk/doublestar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants