feat: Handling pagination of repository vulnerabilities #85
Conversation
Codecov Report
```diff
@@ Coverage Diff @@
##             main      #85      +/-   ##
==========================================
- Coverage   75.41%   74.07%   -1.35%
==========================================
  Files          15       15
  Lines         716      729      +13
==========================================
  Hits          540      540
- Misses        168      180      +12
- Partials        8        9       +1
```
Hmmm, there's a problem with this approach. It will drastically increase the number of calls made to the GraphQL API, which will both slow us down and potentially cause rate limiting. Previously, given `n` = number of repositories, we would perform `n / 100` API calls. Now we perform `n + 1` calls. Given a sufficiently large org, such as one containing 500 repositories, we just went from 5 calls to 501. And that's assuming we never actually paginate the vulnerabilities, which is the whole point here!
I think a better way to accomplish this is a combination of the two approaches: keep the bulk query, and only perform the repo-specific queries when necessary.
Taking another look at this, I believe this could actually be somewhat easy using mostly just the pre-existing data structures. The current flow would work as-is, but then when `HasNextPage` is `true` for a repository's vulnerabilities, you would send a new query, utilizing the `EndCursor`, which would look something like this:

```go
type repoVulnerabilityQuery struct {
	Repository orgRepo `graphql:"repository(name: $repoName, owner: $orgName)"`
}
```

I think that might be the only new data structure necessary, and then we would continue to have our `n / 100` queries, except in cases where vulnerabilities are paginated.
I feel like I may have rambled a bit in there, so I hope it all makes sense. Please feel free to ask if you have any questions!
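To make the suggestion concrete, here is a minimal, self-contained sketch of that follow-up query shape. The `orgRepo` struct is stubbed down to just the pagination fields discussed here, and the variable names (`orgName`, `repoName`, `alertCursor`) are assumptions for illustration, not necessarily the project's actual identifiers:

```go
package main

import "fmt"

// vulnerabilityAlerts stubs the connection shape we paginate over;
// only the PageInfo fields needed for cursor pagination are included.
type vulnerabilityAlerts struct {
	PageInfo struct {
		HasNextPage bool
		EndCursor   string
	}
}

// orgRepo is a stub of the project's pre-existing repository struct.
type orgRepo struct {
	Name                string
	VulnerabilityAlerts vulnerabilityAlerts `graphql:"vulnerabilityAlerts(first: 100, after: $alertCursor)"`
}

// repoVulnerabilityQuery targets a single repository, as suggested above.
type repoVulnerabilityQuery struct {
	Repository orgRepo `graphql:"repository(name: $repoName, owner: $orgName)"`
}

// repoQueryVars builds the variables map for the follow-up query.
// A real githubv4 client would want githubv4.String values here;
// plain strings keep this sketch dependency-free.
func repoQueryVars(org, repo, cursor string) map[string]any {
	return map[string]any{
		"orgName":     org,
		"repoName":    repo,
		"alertCursor": cursor,
	}
}

func main() {
	vars := repoQueryVars("acme", "widgets", "abc123")
	fmt.Println(vars["repoName"], vars["alertCursor"]) // widgets abc123
}
```

The point is that the per-repository query only ever fires for repositories whose first page reported `HasNextPage: true`, so the common case stays at `n / 100` calls.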
Alright Joey, thanks for the feedback. I have addressed the comment, lmk what you think. It was smart thinking about the rate limiting. Will keep this in mind for the future!
querying/github.go (outdated)

```go
nextCursor := repositoryQuery.Repository.VulnerabilityAlerts.PageInfo.EndCursor
log.Info().Any("alertCursor", queryVars["alertCursor"]).Msg("Querying for more vulnerabilities for a repository.")
return gh.processRepoFindings(projects, repositoryQuery.Repository, nextCursor)
```
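The snippet above re-enters `processRepoFindings` with the new cursor, following `EndCursor` until `HasNextPage` is false. The same control flow can be sketched as a standalone loop with a stubbed page fetcher; `collectAlerts`, `fetch`, and `alertPage` are hypothetical names standing in for the real GraphQL call, not the project's actual API:

```go
package main

import "fmt"

// pageInfo mirrors the GraphQL PageInfo fields used for pagination.
type pageInfo struct {
	HasNextPage bool
	EndCursor   string
}

// alertPage is one page of vulnerability alerts for a repository.
type alertPage struct {
	Alerts   []string
	PageInfo pageInfo
}

// collectAlerts follows EndCursor until HasNextPage is false.
// fetch stands in for the real GraphQL repository query.
func collectAlerts(fetch func(cursor string) alertPage) []string {
	var all []string
	cursor := ""
	for {
		page := fetch(cursor)
		all = append(all, page.Alerts...)
		if !page.PageInfo.HasNextPage {
			return all
		}
		cursor = page.PageInfo.EndCursor
	}
}

func main() {
	// Stubbed two-page response standing in for the GitHub API.
	pages := map[string]alertPage{
		"":   {Alerts: []string{"CVE-1", "CVE-2"}, PageInfo: pageInfo{HasNextPage: true, EndCursor: "c1"}},
		"c1": {Alerts: []string{"CVE-3"}, PageInfo: pageInfo{HasNextPage: false}},
	}
	got := collectAlerts(func(c string) alertPage { return pages[c] })
	fmt.Println(got) // [CVE-1 CVE-2 CVE-3]
}
```

Whether written recursively (as in the PR) or as a loop, the termination condition is the same: stop once a page reports `HasNextPage: false`.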
I have verified this locally; will add some unit tests for this in a follow-up PR.
This is great! I love the recursive approach. And this is so much easier than I had feared in my head!
I just wanted to quickly address these last couple of comments, but aside from that this looks all ready!
Amazing! Wonderful! Love it! Thanks so much for contributing this @JoseAngel1196
Fixes #7
Description
In this PR, we're addressing pagination for repository vulnerabilities. To put it simply, if a repository has a substantial number of vulnerabilities (which may not be ideal in itself), we need to ensure we handle all of them appropriately. Silently skipping these vulnerabilities, as we do today, is also not acceptable.
I have ensured that these new changes do not break the reporting functionality. I have verified this locally using a test account.