Advice on how to deal with AI bots/scrapers?

zoey@lemmy.librebun.com · 19 days ago

CronyAkatsuki@lemmy.cronyakatsuki.xyz · edit-2 18 days ago

And the community blocklists are updated when more than a couple of CrowdSec instances (I think the number is something like 10-50) block an IP within a short timeframe.

The AI blocklist adds an IP as soon as even one instance catches an AI scraper by its user agent.

So even if the community blocklist has fewer AI IPs, it does eventually include them.
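
To make the difference between the two lists concrete, here is a rough sketch of the logic as described above. It is not CrowdSec's actual implementation: the threshold of 10, the one-hour window, the user-agent substrings, and both function names are assumptions picked for illustration.

```python
from collections import defaultdict

# Assumed values: the comment above only guesses "something like 10-50"
# reporting instances and does not say how long the timeframe is.
COMMUNITY_THRESHOLD = 10
WINDOW_SECONDS = 3600
# Illustrative user-agent substrings only, not the list CrowdSec uses.
AI_AGENT_MARKERS = ("gptbot", "claudebot", "ccbot")


def community_blocklist(reports, now):
    """Return IPs reported by enough distinct instances inside the window.

    reports: iterable of (ip, instance_id, timestamp_in_seconds) tuples.
    """
    reporters = defaultdict(set)
    for ip, instance_id, timestamp in reports:
        if now - timestamp <= WINDOW_SECONDS:
            reporters[ip].add(instance_id)
    return {
        ip for ip, instances in reporters.items()
        if len(instances) >= COMMUNITY_THRESHOLD
    }


def ai_blocklist(requests):
    """Return IPs where a single sighting of an AI user agent is enough.

    requests: iterable of (ip, user_agent) tuples.
    """
    return {
        ip for ip, user_agent in requests
        if any(marker in user_agent.lower() for marker in AI_AGENT_MARKERS)
    }
```

With these assumed numbers, an IP seen by one instance with an AI user agent lands on the AI list right away, but it only reaches the community list once ten different instances have blocked it within the same hour.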

Starfarer@lemmy.today · 17 days ago

Which CrowdSec blocklists are you using?

CronyAkatsuki@lemmy.cronyakatsuki.xyz · 17 days ago

I’m using the default list alongside Firehol BotScout list and Firehol cybercrime tracker list set to ban.

Also using the Firehol cruzit.com list set to do captcha, just in case it’s not actually a bot.

I’m also using the cs-firewall-bouncer and a custom bouncer, shown in CrowdSec’s tutorials, that detects privilege escalation in case anybody actually manages to get inside.

Alongside that I’m using a lot of scenario collections for the specific software I run, like Nextcloud, Grafana, SSH, …, which helps a lot with attacks aimed directly at a service and not just general scraping or bot path traversal.

It’s all free and I have been using it for a year. My only complaint is that I had to set up a cron job to restart the crowdsec service every day, because it would stop working after a couple of days due to the number of requests it has to process.