That’s pretty sweet, but be aware that a lot of bots are bad actors and don’t advertise a proper user agent, so you also have to block by IP. Blocking all Alibaba server IPs is a good start.
This is an nginx reverse proxy configuration. It’s not passive like robots.txt, but they probably named it that way in solidarity with the intent of robots.txt. You’re on point about Alibaba, though; I’m sure that could be added fairly easily to this nginx blocking strategy. Anubis is still probably a better solution, since it doesn’t depend on LLM bots sending an honest user agent.
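For what it’s worth, here’s a rough sketch of how IP blocking could sit next to user-agent blocking in nginx. It uses the stock `geo` and `map` directives. Everything specific in it is an assumption on my part: the CIDR ranges are reserved documentation placeholders (not real Alibaba ranges, which you’d have to pull from a published list), the user-agent names are just examples, and the upstream address is hypothetical. It is not the config from the project being discussed.

```nginx
# Goes in the http {} context.

# Flag requests by source address. The ranges below are documentation
# placeholders (RFC 5737), NOT real Alibaba ranges; substitute your own list.
geo $blocked_ip {
    default          0;
    203.0.113.0/24   1;
    198.51.100.0/24  1;
}

# Flag requests by user agent (case-insensitive regex).
# The bot names are illustrative examples only.
map $http_user_agent $blocked_ua {
    default                            0;
    ~*(GPTBot|ClaudeBot|Bytespider)    1;
}

server {
    listen 80;

    location / {
        # "if" treats an empty string or "0" as false.
        if ($blocked_ip) { return 403; }
        if ($blocked_ua) { return 403; }

        # Hypothetical upstream application.
        proxy_pass http://127.0.0.1:8080;
    }
}
```

The IP check catches bots that lie about their user agent, but only for ranges you already know about, which is why something like Anubis is still attractive.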