Author: onefang
Date:  
To: dng
CC: devuan-mirrors
Old-Topics: [DNG] Artificial Idiocy attacking mirrors.
Subject: Re: [devuan-mirrors] [DNG] Artificial Idiocy attacking mirrors.
On 2025-11-27 15:02:35, onefang wrote:
> ClaudeBot is currently doing something really stupid. It somehow has
> come to the conclusion that my Devuan mirror mirrors EVERYTHING!
>
> So it's probing my mirror looking for things like rpms, jars, npm, cpan
> packages, etc, and specifically looking in directories named after other
> operating systems or package collections. Even stuff I have never heard
> of. I certainly don't mirror any of it, only Devuan.
>
> And it's doing it stupidly. It obviously knows what to expect, since it
> keeps asking me for the actual packages and metadata files that would be
> there if I was mirroring that particular thing. For example failing to
> download several rpm files from a directory that is named after some rpm
> based distro. After the first failure, it should figure out I'm not
> mirroring that, give up. On top of that, these are mirrors it is
> probing, the word "mirror" is even in some of the paths, they should all
> be identical. If it already knows what to expect, it doesn't need to
> download it yet again.
>
> This may be going on with other mirrors, though some of them do mirror
> other things as well. All it's doing is wasting bandwidth and space for
> log files. Slowing everything down. And clogging up the AI with lots of
> duplicates of binary files.


I have a theory.

I'm only seeing this nonsense for deb.devuan.org, not sledjhamr.org.

AI bots are scanning deb.devuan.org and its country-code versions; I have
no idea how they handle a DNS round-robin (DNS-RR). One or more of our
package mirrors also hosts mirrors of Fedora, SUSE, and the dozen or so
other things the AI bots are scanning for that my server doesn't host.
I guess those other Devuan mirrors make the non-Devuan things available
under deb.devuan.org as well, which is how the bots figured that all that
non-Devuan stuff is at deb.devuan.org.

So the AIs just think that all of us Devuan package mirrors have all of
the things that the other package mirrors have, and they end up scanning
my server for nonexistent directories thousands of times per day, looking
for identical files. That's how the bots know what to scan for: they
found it at deb.devuan.org before.
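
A rough way for a mirror admin to test this theory against their own logs
is to tally the bot's 404s by top-level directory. The sketch below is only
an illustration: it assumes the access log uses the common "combined"
format and that the bot's user agent contains "ClaudeBot". The function
name count_bot_404s and the log path in the usage note are hypothetical.

```python
import re
from collections import Counter

# Assumed "combined" log format:
# ip ident user [time] "METHOD path proto" status size "referer" "agent"
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_bot_404s(lines, agent_substring="ClaudeBot"):
    """Count 404 responses per top-level directory for one user agent."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like combined format
        if match.group("status") != "404":
            continue
        if agent_substring.lower() not in match.group("agent").lower():
            continue
        # Drop any query string, then take the first path segment.
        path = match.group("path").split("?", 1)[0]
        segments = [part for part in path.split("/") if part]
        counts[segments[0] if segments else "/"] += 1
    return counts
```

Feeding it an open log file, for example
count_bot_404s(open("/var/log/nginx/access.log")).most_common(20), would
list the directories the bot asks for most often that do not exist on that
mirror. If the theory holds, those names should match what some other
mirror behind deb.devuan.org actually hosts.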

I wonder if other Devuan mirrors are seeing this? We should block access
to Fedora / SUSE / whatever mirrors and other non-Devuan things via
deb.devuan.org. It might still take ages for the AI bots to stop looking
for these things.
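
The blocking idea boils down to an allowlist on the first path segment of
requests that arrive under the deb.devuan.org name. In practice that would
live in each mirror's web server config; the sketch below only illustrates
the logic. The set of allowed directories ("devuan" and "merged") is an
assumption here, and the name is_allowed is hypothetical.

```python
import posixpath

# Assumption: the top-level directories that should stay reachable
# under the deb.devuan.org name. Adjust to the real layout.
ALLOWED_TOP_LEVEL = frozenset({"devuan", "merged"})


def is_allowed(request_target, allowed=ALLOWED_TOP_LEVEL):
    """Return True if the request path stays inside an allowed directory."""
    # Drop any query string.
    path = request_target.split("?", 1)[0]
    # Collapse "." and ".." so "/devuan/../fedora" is judged as "/fedora".
    normalized = posixpath.normpath("/" + path)
    segments = [part for part in normalized.split("/") if part]
    if not segments:
        return True  # the root index itself
    return segments[0] in allowed
```

A request for /fedora/... would be refused under this rule even on a mirror
that hosts Fedora under its own hostname, while anything under the assumed
Devuan directories would still be served.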