
Use GitHub throttling plugin for @octokit/rest #3983

Open
andrewdibiasio6 opened this issue Jul 11, 2024 · 7 comments
Labels
enhancement, question

Comments


andrewdibiasio6 commented Jul 11, 2024

GitHub limits the number of REST API requests that you can make within a specific amount of time.

We authorize a GitHub App or OAuth app, which can then make API requests on our behalf. All of these requests count towards a personal rate limit of 5,000 requests per hour.

In addition to primary rate limits, GitHub enforces secondary rate limits in order to prevent abuse and keep the API available for all users.

We may encounter a secondary rate limit if we:

- Make too many concurrent requests. No more than 100 concurrent requests are allowed.
- Make too many requests to a single endpoint per minute. No more than 900 [points](https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28#calculating-points-for-the-secondary-rate-limit) per minute are allowed for REST API endpoints.
- Use too much compute. No more than 90 seconds of CPU time per 60 seconds of real time is allowed.
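
For reference, the remaining primary quota can be inspected at any time via the rate limit endpoint (which does not itself count against the limit). A minimal sketch, assuming @octokit/rest and a token in an environment variable; the token source is illustrative, not how the lambdas actually authenticate:

```ts
import { Octokit } from "@octokit/rest";

// Illustrative token source; the lambdas resolve their own app/installation tokens.
const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

async function logRateLimit(): Promise<void> {
  // GET /rate_limit does not count against the primary rate limit.
  const { data } = await octokit.rest.rateLimit.get();
  const core = data.resources.core;
  const resetsAt = new Date(core.reset * 1000).toISOString();
  console.log(`core: ${core.remaining}/${core.limit} remaining, resets at ${resetsAt}`);
}

logRateLimit().catch(console.error);
```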

We are seeing many errors like:

{"level":"ERROR","message":"Request failed with status code 403","service":"runners-pool","timestamp":"2024-07-11T14:10:21.576Z",
{"level":"WARN","message":"Ignoring error: Request failed with status code 503","service":"runners-scale-up","timestamp":"2024-07-22T18:32:03.979Z","xray_trace_id":"1-669ea597-fcd409efe9e843d7a70dc3d6" ... }

I suggest we add the throttling plugin recommended by the Octokit docs to help with this issue, or adopt some other suggestion here.
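
A minimal sketch of what wiring up [@octokit/plugin-throttling](https://github.com/octokit/plugin-throttling.js) could look like; the handler names come from the plugin's documentation, while the token source and the retry-once policy are just assumptions for illustration:

```ts
import { Octokit } from "@octokit/rest";
import { throttling } from "@octokit/plugin-throttling";

const ThrottledOctokit = Octokit.plugin(throttling);

const octokit = new ThrottledOctokit({
  auth: process.env.GITHUB_TOKEN, // illustrative; not how the lambdas actually authenticate
  throttle: {
    onRateLimit: (retryAfter, options, client, retryCount) => {
      client.log.warn(`Primary rate limit hit for ${options.method} ${options.url}`);
      // Retry once after the delay suggested by GitHub, then give up.
      return retryCount < 1;
    },
    onSecondaryRateLimit: (retryAfter, options, client) => {
      client.log.warn(
        `Secondary rate limit hit for ${options.method} ${options.url}, backing off ${retryAfter}s`
      );
      // Returning true makes the plugin wait `retryAfter` seconds and retry.
      return true;
    },
  },
});
```

With something like this in place, the 403s above would be retried after the server-suggested delay instead of failing immediately.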

npalm (Member) commented Jul 12, 2024

Would be a good addition, but it will not solve the rate limit problem. May I ask what size of org / deployments you have?

andrewdibiasio6 (Author) commented Jul 23, 2024

> But it will not solve the rate limit problem.

@npalm I was able to solve most of the rate limiting problems, which were occurring almost every hour, by varying the scheduled lambda event. The pool docs suggest the following: schedule_expression = "cron(* * * * ? *)". In my opinion, this is not a great suggestion: it will almost certainly result in rate limiting if you have more than a couple of runner configurations, because GitHub will throttle you due to concurrent requests. To resolve this, I first staggered the schedule expressions across runners like so:

schedule_expression: cron(0/2 * * * ? *)
schedule_expression: cron(1/2 * * * ? *)

This reduces the overall number of concurrent requests to GitHub and resolves most of the throttling issues. I think this should be added to the docs.

> May I ask what size of org / deployments you have?

We have one deployment with 21 runners.

Because of the resolution above, we decided to remove pools altogether as they are expensive. Once we removed pools, we noticed that every so often a job is never allocated a runner. When looking into the logs, I see an error around the same time:

{"level":"WARN","message":"Ignoring error: Request failed with status code 503","service":"runners-scale-up","timestamp":"2024-07-22T18:32:03.979Z","xray_trace_id":"1-669ea597-fcd409efe9e843d7a70dc3d6" ... }

The job then hangs forever. I believe this happens because of the size of some of our workflows' matrix jobs; the workflow launches around 25 jobs in parallel. So far, we have only noticed this error for this specific workflow.

The workaround for us is to ensure there is at least one runner available at all times, so we have to add a pool of size 1 to all our runners. Obviously this isn't ideal. I am not sure I have any more control over how philips will process my requests; turning down scale_up_reserved_concurrent_executions: 5 is very slow. In my opinion, the GitHub client should wait and retry these errors a few times before giving up. Thoughts?
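
For what it's worth, Octokit also ships [@octokit/plugin-retry](https://github.com/octokit/plugin-retry.js), which does roughly this for transient failures like the 503 above. A minimal sketch, with the token source again just an assumption:

```ts
import { Octokit } from "@octokit/rest";
import { retry } from "@octokit/plugin-retry";

const RetryingOctokit = Octokit.plugin(retry);

// With the defaults, failed requests (e.g. a 503 from scale-up) are retried a few
// times with a delay, while 4xx statuses that are not worth retrying are skipped.
const octokit = new RetryingOctokit({
  auth: process.env.GITHUB_TOKEN, // illustrative token source
});
```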

Also, see the updated overview of this issue. You can see the original 403 error was from the pool lambda, since resolved, but the new 503 is from the scale-up lambda, which makes sense since that is the one getting all the parallel requests from my job matrix.

stuartp44 added the enhancement and question labels Aug 6, 2024
andrewdibiasio6 (Author) commented

@npalm According to the GitHub rate limiting docs linked in the error message, if we keep retrying requests we will be limited further, or our app will be banned. We are still seeing this issue, and when it happens we are limited for multiple hours and can't request any runners.

kgoralski commented Sep 20, 2024

Can maxReceiveCount: 100 contribute to a higher number of requests to GitHub?
If yes, then having many fleet types and a high maxReceiveCount could probably exceed the limit easily?

npalm (Member) commented Oct 3, 2024

@andrewdibiasio6 the module now supports a job retry mechanism, which will solve the problem for some hanging jobs.

andrewdibiasio6 (Author) commented

@npalm Yes, this would solve the issue for some hanging jobs, but the 900-second upper bound for retries isn't going to help. When throttled by GitHub, you're usually throttled for 1 hour, which means no amount of retries will help. If anything, retrying more will likely get you throttled more, as GitHub's suggestion is to back off for a suggested amount of time before retrying, hence the suggestion to use the Octokit client.
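
To illustrate the 'back off for the suggested amount of time' part: the rate limit responses carry the hint in their headers, which is what the throttling plugin reads for you. A hand-rolled sketch of the same idea, assuming the failure surfaces as an Octokit RequestError:

```ts
import { RequestError } from "@octokit/request-error";

// How many seconds GitHub suggests waiting before retrying a rate-limited request,
// derived from the response headers; undefined if the error is not a rate limit.
function suggestedBackoffSeconds(error: unknown): number | undefined {
  if (!(error instanceof RequestError) || !error.response) return undefined;
  const headers = error.response.headers;

  // Secondary rate limits usually send retry-after (in seconds).
  if (headers["retry-after"] !== undefined) return Number(headers["retry-after"]);

  // Primary rate limits expose the reset time as a unix timestamp.
  if (String(headers["x-ratelimit-remaining"]) === "0" && headers["x-ratelimit-reset"]) {
    return Math.max(0, Number(headers["x-ratelimit-reset"]) - Math.floor(Date.now() / 1000));
  }
  return undefined;
}
```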

npalm (Member) commented Oct 9, 2024

The intent of the retry is mostly to cover messages that are missed or crossed and runners not scaling properly. Indeed, 900 is the max for SQS. Ideas or help are very welcome to make the runners more resilient. But the tough part is that querying GitHub to find jobs will only add to the rate limit. Also, GitHub does not have an API to ask for the depth of the queues.
