Topic solved
This topic has been marked as solved and requires no further attention.

Any gitea api rate limits?

Hi all!

As an ARMtix developer, I have to periodically synchronize my local copies of the PKGBUILD repos with the upstream Gitea ones. However, since the old monorepos are now gone, I have to pull each repo individually, and it seems my build system sometimes fails to pull some of the repos. The logic behind this automation only pulls/clones those repos whose updated_at field is later than the timestamp of the last sync. As such, if a repo fails to pull, it can only be pulled again once it is updated.
At the moment, I sometimes manually run a script which gets the updated_at field of every repo and compares it with the timestamp of the latest local commit. After that I pull the repos which this script marked as outdated. I do it in parallel, with 10 repos per iteration, but at some iteration the pull fails with a message like:
Code:
fatal: unable to access 'https://gitea.artixlinux.org/packages/kwallet.git/': Failed to connect to gitea.artixlinux.org port 443 after 130269 ms: Couldn't connect to server
What is the current rate limit for git pull? Also, is there a better way to do such a sync?

Additionally, it seems that updated_at is also changed by some actions other than commits, since I have a lot of repos marked as outdated for which git pull reports that they are up to date.
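A minimal Python sketch of the selection logic described in this post, with one tweak: repos whose pull failed last time are remembered and retried even if updated_at has not changed. The input shapes are assumptions for illustration (a list of dicts with "name" and "updated_at" keys, a dict of local commit times, and a set of previously failed names); it is not the actual ARMtix script.

```python
from datetime import datetime, timezone

def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as '2024-05-01T12:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def select_outdated(remote, local_commit_times, failed):
    """Return the names of repos that should be pulled or cloned.

    remote: list of dicts with 'name' and 'updated_at' (assumed shape).
    local_commit_times: name -> timezone-aware datetime of the latest local commit.
    failed: set of names whose previous pull failed (hypothetical bookkeeping).
    """
    todo = []
    for repo in remote:
        name = repo["name"]
        local = local_commit_times.get(name)
        if (
            name in failed                              # retry earlier failures
            or local is None                            # not cloned yet
            or parse_ts(repo["updated_at"]) > local     # remote looks newer
        ):
            todo.append(name)
    return todo

def batches(items, size=10):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Persisting the failed set between runs (for example in a small text file) would avoid the situation where a repo that failed to pull is only retried after its next upstream update.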
ARMtix

Re: Any gitea api rate limits?

Reply #1
I am resurrecting this topic because since yesterday I mostly get timeouts from Gitea over HTTP/HTTPS (though SSH works). Is there a firewall on the Gitea side which could block me if I make too many requests?
ARMtix

Re: Any gitea api rate limits?

Reply #2
Yeah, fail2ban is running rampant right now. nous is dialing it back little by little.


Re: Any gitea api rate limits?

Reply #4
Yeah, fail2ban is running rampant right now. nous is dialing it back little by little.
Could someone put a note here once it's done? The ARMtix build is stuck since I rely on the HTTP API to query repos + HTTP git clone.
If you're on a static (or not-frequently-changing) IP, please send it in a PM or, better, to my email. I need to check the logs and see how often it triggers the filter.

Re: Any gitea api rate limits?

Reply #5
Before I start, I'll just say I have no right to anything, and maybe my use case is part of what you want to block?

For a long time now (over a year) I've been maintaining a GitHub repository of all the stable Artix package build files and another for the stable Arch package build files:
https://github.com/gripped/artixpkgbuilds
https://github.com/gripped/archpkgbuilds

The purpose of this is that I have another script which compares all the Artix PKGBUILDs with the Arch versions and creates a local repo based on the results. This is my idea of fun! (I should get out more)

It seems that for the time being, or maybe for good, this is unfeasible.
I was already rate limiting my script to one package processed every second (every 3 seconds for Arch because of their rate limiting) but have extended this to 5 seconds, which at a rough calculation would take 27 hours to process all the packages.
Each package processed involves a git pull, or a git clone for new packages.
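As a rough illustration of the pacing described above, here is a Python sketch that spaces jobs so that consecutive ones start a fixed interval apart, plus the arithmetic behind the run-time estimate. The function names are made up for this example, and the package count of 19,440 is simply back-calculated from "27 hours at 5 seconds each"; it is not a figure stated in the thread.

```python
import time

def estimated_hours(package_count: int, seconds_per_package: float) -> float:
    """Total run time in hours at a fixed pace."""
    return package_count * seconds_per_package / 3600

def next_delay(interval: float, started: float, finished: float) -> float:
    """Seconds to sleep so that consecutive jobs start `interval` seconds apart."""
    return max(0.0, interval - (finished - started))

def run_paced(jobs, interval, clock=time.monotonic, sleep=time.sleep):
    """Run each job, then sleep for whatever is left of the interval."""
    for job in jobs:
        started = clock()
        job()  # e.g. a function that runs one git pull or git clone
        sleep(next_delay(interval, started, clock()))

# 19,440 packages (back-calculated, see above) at one every 5 seconds:
print(estimated_hours(19440, 5))  # 27.0
```

Subtracting the time the job itself took keeps the overall rate at one job per interval instead of one job per interval plus the duration of each pull.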

Even with the 5 second delay I get blocked fairly soon.

When I get blocked, my whole ISP gets blocked. I can get a new IP fairly easily by disabling and then re-enabling WAN on the router.
The ISP is huge. Even when the first octet is new I remain blocked for some time, e.g. the last IP address was 86.x.x.x and the current one is 109.x.x.x.
But the block remains. So that's why it appears the entire ISP is blocked somehow?

Again, if I can't do what I've been doing any more, so be it. But I'm just letting you know in the hope that this is a case of unforeseen consequences?

As an aside, one way to make all the build files available to scripts with no load on your server would be to mirror the lot on GitHub.
A la https://github.com/archlinux/aur (each individual AUR package repo is a branch of the actual repo).
As another aside, the above repo is a way you can still get the build files for packages which have been deleted from the AUR.


Re: Any gitea api rate limits?

Reply #7
No joy I'm sorry to say.

I just started the script again for the first time since my last post.
It was worse.
Two packages processed and then blocked.

Log
Spoiler (click to show/hide)
The number is the delay time to keep it at 1 request every 5 seconds.

On the bright side, it is now just an IP-based block. Or I got lucky when I changed my IP address?

No worries on my part.



 

Re: Any gitea api rate limits?

Reply #9
Still not working.

Blocked after three packages.

At the least, it's git HTTP access that's affected.

I tried a couple of things.

1. Altered the script to iterate the requests through a couple of my remote servers, and my home IP, with proxychains.
All blocked in short order.

2. Adapted a function from one of my other Python scripts to hammer a few of the Gitea package pages with curl.
I didn't run it for that long but it was clear that the server would happily process many requests per second via curl.
But it blocks very quickly when doing just one git pull every 5 seconds.

I have never used fail2ban so I don't know its intricacies.

Just feedback again. It doesn't matter much to me but it might matter more to @phoenix_king_rus?



Re: Any gitea api rate limits?

Reply #11
Two filter files, one for GET reqs and one for POST:
Code:
# fail2ban filter configuration for nginx GET denied accesses
[INCLUDES]
before = nginx-error-common.conf
[Definition]
failregex = ^.*\[crit\].* Permission denied\), client: <HOST>.*request: "GET
ignoreregex = .*/public/api.*
              .*/artix/.*
              .*/packages/.*
Code:
# fail2ban filter configuration for nginx POST denied accesses
[INCLUDES]
before = nginx-error-common.conf
[Definition]
failregex = ^.*\[crit\].* Permission denied\), client: <HOST>.*request: "POST
ignoreregex = .*/artix/.*
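To see which log lines the GET filter above would count, the patterns can be approximated in plain Python. This is only a rough sketch: fail2ban expands `<HOST>` itself and also applies the included nginx-error-common.conf, neither of which is modeled here. The simple IPv4 group, the function name, and the sample log lines (using a documentation IP address) are all assumptions made for illustration.

```python
import re

# Assumption: <HOST> is replaced by a plain IPv4 pattern; fail2ban's real
# expansion is broader (hostnames, IPv6).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

FAIL_GET = re.compile(
    r'^.*\[crit\].* Permission denied\), client: ' + HOST + r'.*request: "GET'
)
# One compiled pattern per ignoreregex line of the GET filter.
IGNORE_GET = [
    re.compile(p) for p in (r".*/public/api.*", r".*/artix/.*", r".*/packages/.*")
]

def counted_host(line: str):
    """Return the client IP if the line would count as a failed GET, else None."""
    match = FAIL_GET.search(line)
    if match is None:
        return None
    if any(p.search(line) for p in IGNORE_GET):
        return None
    return match.group("host")

# Invented sample lines in the usual nginx error-log layout.
denied = ('2024/01/01 12:00:00 [crit] 1234#1234: *1 open() "/srv/www/secret" failed '
          '(13: Permission denied), client: 203.0.113.7, server: example.org, '
          'request: "GET /secret HTTP/1.1", host: "example.org"')
ignored = denied.replace("/secret", "/packages/kwallet.git/info/refs")

print(counted_host(denied))   # 203.0.113.7
print(counted_host(ignored))  # None
```

Under these assumptions, a denied GET whose log line contains /packages/ is skipped by the ignoreregex and does not count toward a ban from this particular filter.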

Limits set in /etc/fail2ban/jail.local:
Code:
[nginx-denied-get]
findtime  = 1m
maxretry = 120
port     = http,https
logpath  = %(nginx_error_log)s
enabled = true

[nginx-denied-post]
findtime  = 1m
maxretry = 90
port     = http,https
logpath  = %(nginx_error_log)s
enabled = true

120 failed GETs/min is a lot; it befuddles me how you still earn bans.
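For a feel of what findtime = 1m and maxretry = 120 allow, here is a simplified sliding-window model in Python. It assumes a ban is triggered once the number of matched failures within the last findtime seconds reaches maxretry; it is a sketch of that rule only, not fail2ban's actual implementation, and the function name is invented.

```python
from collections import deque

def first_ban_time(failure_times, findtime=60.0, maxretry=120):
    """Return the timestamp at which a ban would trigger, or None.

    failure_times: timestamps (in seconds) of log lines matched by the filter.
    """
    window = deque()
    for t in sorted(failure_times):
        window.append(t)
        # Drop failures that fell out of the findtime window.
        while window and window[0] <= t - findtime:
            window.popleft()
        if len(window) >= maxretry:
            return t
    return None

# One matched failure every 5 seconds for an hour never reaches the limit.
print(first_ban_time([5 * i for i in range(720)]))   # None
# Three matched failures per second reach it on the 120th failure.
print(first_ban_time([i / 3 for i in range(600)]))
```

In this model, even if every request of a git pull were counted as a failure, one pull every 5 seconds would need dozens of denied requests per pull to hit the limit, which fits the puzzlement expressed above.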

Re: Any gitea api rate limits?

Reply #12
Oh, grepping for Ban, I found a forgotten nginx-denied filter in the default rules file (jail.conf) which banned lots of IPs. Let's see how it goes without it.

Re: Any gitea api rate limits?

Reply #13
So far so good.

I was probably overly hopeful at first. I changed the script back to 1 git pull per second. (I have no idea how many actual HTTPS requests one git pull results in.)

It got further. I'd say about 20-30 packages then blocked :(  (I meant to count but was trigger happy closing the terminal)

If it's any help to you 86.147.37.212 was the IP address on that attempt.

I then changed it to 5 seconds per git pull again and all of 'system' is processed.
I'll be back in about 26 hours to let you know if it finished. Fairly confident.

If it does I'll experiment with lowering the delay to get an idea of a reliable rate limit from the user end.

Thanks.