Blocking an ASN (or similar) from my sites

Blocking an ASN (or similar) from my sites - Matthew Somerville

24th May 2026

I run a number of websites, some of which could even be said to be popular. I want humans to visit these websites, enjoy these websites, make their change of trains at New Street in good time, investigate miscarriages of justice, or find out the play they watched in London when hitch-hiking round Europe in the ’80s.

But I find it harder to do this when my server is swamped with artificial traffic from bots, AI clankers, and whatever else nonsense there is nowadays (Weird Gloop has a good summary; I don’t have wikis but for other reasons am basically in the same boat). This is especially egregious when they are incredibly poorly written and constantly fetch basically identical pages which may cause issues not just on my own server but with upstream sources. Clearly the people behind all this simply do not think or care (and as an aside, this is why I cannot extricate positives from this technology without the negatives alongside, and how it has been/is being introduced).

Some you can block by individual IP address, some by user agent, some by location, and sometimes (the purpose of this post) you just feel like every hit you get from an entire company is artificial traffic ultimately derived from selfish individuals; reporting the abuse won’t do anything, so you just want to block any traffic from that company. (This doesn’t help with residential proxies, of course, but every little helps.)

So, how do I block any Amazon, or Tencent, or DigitalOcean, or ..., IP address from accessing my site?

Getting a list

Cloud providers

Google and Amazon publish JSON of their current cloud IP ranges. So for getting a list of ranges from Google you can use:

curl -s -O https://www.gstatic.com/ipranges/cloud.json
jq -r '.prefixes | .[] | (.ipv4Prefix // .ipv6Prefix)' cloud.json

Or for Amazon:

curl -s -O https://ip-ranges.amazonaws.com/ip-ranges.json
jq -r '.prefixes | .[].ip_prefix' ip-ranges.json
jq -r '.ipv6_prefixes | .[].ipv6_prefix' ip-ranges.json
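If you want both providers in a single deduplicated list, something like the following might do. It is only a sketch: the function name `cloud_ranges` is made up for illustration, and it assumes both JSON files have already been downloaded with the `curl` commands above and that `jq` is installed.

```shell
# Print one sorted, deduplicated list of ranges from both providers.
# $1 = Google's cloud.json, $2 = Amazon's ip-ranges.json (already downloaded).
# cloud_ranges is a hypothetical helper name, not part of any tool.
cloud_ranges() {
  {
    # Google: each entry carries either an ipv4Prefix or an ipv6Prefix.
    jq -r '.prefixes[] | (.ipv4Prefix // .ipv6Prefix)' "$1"
    # Amazon: IPv4 and IPv6 ranges live in two separate arrays.
    jq -r '.prefixes[].ip_prefix, .ipv6_prefixes[].ipv6_prefix' "$2"
  } | sort -u
}
```

For example, `cloud_ranges cloud.json ip-ranges.json > cloud-ranges.txt` would leave one range per line in a file of your choosing.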

ASN lists

Other providers don’t publish such lists, but they do have to tell the internet which IP addresses they are responsible for and that they provide routing for them. This is done using Autonomous System Numbers, which are used in BGP (Border Gateway Protocol) routing.

A routing registry, such as RADb, lets you look up all the routes given an ASN. So once you have discovered that, I dunno, AS136907 is Huawei, you can ask RADb for all the ranges:

whois -h whois.radb.net -- "-i origin AS136907" | grep 'route:' | cut -d' ' -f 11
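When there are several ASNs to look up, the same query can be wrapped in a loop. The sketch below is one possible shape, with made-up function names (`routes_from_whois`, `asn_ranges`). It differs from the one-liner above in two ways: it uses `awk` to take the second whitespace-separated field rather than counting spaces, and it also picks up `route6:` lines, so IPv6 ranges are included.

```shell
# Print every route/route6 prefix found in RADb-style whois output on stdin.
# (Hypothetical helper name.)
routes_from_whois() {
  awk '/^route6?:/ { print $2 }'
}

# Look up each ASN given as an argument and print a deduplicated list
# of all its prefixes. (Hypothetical helper name.)
asn_ranges() {
  for asn in "$@"; do
    whois -h whois.radb.net -- "-i origin $asn" | routes_from_whois
  done | sort -u
}
```

Usage would be along the lines of `asn_ranges AS136907 > asn-ranges.txt`, adding further ASNs as extra arguments.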

Checking it twice

Now that you have some lists, you can add these to your firewall however you wish. In my case, I use iptables, and stick most in a total block list, and some in an incoming drop list (so I can still make outgoing connections).
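As an illustration only (this is not necessarily how the rules are loaded here), one way to get a list of ranges into iptables is to generate an `iptables-restore` fragment and load it in one go. `BLOCKLIST` is a hypothetical chain name and `blocklist_rules` a made-up function; the sketch handles IPv4 only, since IPv6 ranges would need `ip6tables-restore`.

```shell
# Read CIDR ranges on stdin (one per line) and print an iptables-restore
# fragment that drops traffic from each IPv4 source range.
# $1 = chain name; BLOCKLIST is a hypothetical default.
blocklist_rules() {
  chain=${1:-BLOCKLIST}
  printf '*filter\n:%s - [0:0]\n' "$chain"
  # IPv6 ranges contain a colon; skip them, they belong in ip6tables.
  grep -v ':' |
    awk -v chain="$chain" 'NF { printf "-A %s -s %s -j DROP\n", chain, $1 }'
  printf 'COMMIT\n'
}
```

The output could then be fed to `iptables-restore --noflush`, with a one-off `iptables -I INPUT -j BLOCKLIST` to send incoming traffic through the chain.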

Having done the above, though, reloading my firewall was now pretty slow; fine, but annoying if I wanted to quickly block something else. A couple of ASNs had an awful lot of IP ranges in them, and I wondered if I could consolidate these at all.

Some searching found me two consolidators, one in Python and one in Rust. Both cut down my list of ranges to block substantially, which was great; the Python one was very slow and heavy on memory, and the Rust one was very quick but I didn’t really want to install Rust etc. on my server.
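To give a feel for where some of the savings come from, here is a much simpler technique than either consolidator: dropping IPv4 ranges that are already wholly covered by another range in the list. It is a sketch with a made-up function name (`drop_covered_v4`), it does not merge adjacent ranges into a larger one the way a real aggregator does, and it silently ignores anything that is not in `a.b.c.d/len` form (including IPv6).

```shell
# Read IPv4 CIDR ranges on stdin and print only those not already covered
# by another range in the list. (Hypothetical helper; IPv4 only.)
drop_covered_v4() {
  awk -F'[./]' '
    NF == 5 {
      # Turn a.b.c.d/len into numeric first and last addresses.
      start = (($1 * 256 + $2) * 256 + $3) * 256 + $4
      size  = 2 ^ (32 - $5)
      start = start - (start % size)      # normalise to the network address
      printf "%.0f %.0f %s\n", start, start + size - 1, $0
    }' |
    # Lowest start first; for equal starts, the widest range first.
    sort -k1,1n -k2,2nr |
    # CIDR blocks either nest or are disjoint, so a range is covered
    # exactly when it ends no later than something already printed.
    awk 'BEGIN { max = -1 } $2 > max { print $3; max = $2 }'
}
```

Comparing `wc -l` on the input and the output gives a quick idea of how much overlap a list contains.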

Yak shaving into cross-compilation

I had heard Rust could cross-compile binaries on one platform to run on another, which seemed ideal; some more searching found me these helpful instructions, which worked perfectly for me (and boil down to 1. add x86_64-unknown-linux-gnu target; 2. install provided linker; 3. build).
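Steps 1 and 3 can be sketched roughly as below. Step 2 depends on which cross toolchain you install, so the linker name `x86_64-unknown-linux-gnu-gcc` is an assumption (override it with `LINKER=...`), and `cross_build_linux` is a made-up function name. The linker is passed via Cargo's per-target environment variable rather than a config file.

```shell
# Cross-compile the Cargo project in the current directory for 64-bit Linux.
# Assumes rustup and cargo are installed, and that a cross linker has
# already been installed (step 2); its name below is an assumption.
cross_build_linux() {
  target=x86_64-unknown-linux-gnu
  linker=${LINKER:-x86_64-unknown-linux-gnu-gcc}
  # Step 1: add the target's standard library.
  rustup target add "$target" &&
    # Step 3: build, telling Cargo which linker to use for this target.
    CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER="$linker" \
      cargo build --release --target "$target"
}
```

The resulting binary ends up under `target/x86_64-unknown-linux-gnu/release/`, ready to copy to the server.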

This gave me a cidr-aggregator binary that I could pipe my IP ranges to, before adding to my firewall. 30,000 ranges reduced to 730-odd in the end.

Finding out who&rsquo;s naughty or nice

Having such wide blocks isn’t without its own issues, even if it has cut quite a bit of bot traffic. The two main problems I have had since are:

Let’s Encrypt, which provides the SSL certificates for my domains, uses random IP addresses at multiple providers (which it won’t reveal) to perform HTTP validation of domains. I have at least once blocked my own renewals due to this; I’ll try and switch to DNS validation at some point, but in the meantime I can temporarily drop the blocks while the renewals take place;

Bluesky similarly uses Amazon servers for checking custom domain handles, so those occasionally break; the solution is similar, or switching to DNS verification at some point in future.
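Temporarily dropping the blocks around a renewal could be wrapped up along these lines. Everything here is hypothetical: the `block` and `unblock` bodies assume a `BLOCKLIST` chain hooked into `INPUT`, and should be replaced with however your own firewall is managed.

```shell
# Hypothetical hooks: adjust to match your own firewall setup.
unblock() { iptables -D INPUT -j BLOCKLIST; }
block()   { iptables -I INPUT -j BLOCKLIST; }

# Run the given command with the blocks lifted, then put them back
# whether or not the command succeeded; returns the command's exit status.
with_blocks_lifted() {
  unblock || return 1
  "$@"
  rc=$?
  block
  return "$rc"
}
```

If the certificates were renewed with certbot (an assumption), that would be `with_blocks_lifted certbot renew`. Note that this does not restore the blocks if the script is interrupted partway through.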

Reading

If you’ve made it this far, I have just finished reading The God Engines by John Scalzi, am currently reading The Tiger and the Wolf by Adrian Tchaikovsky, and have recently bought Soviet Scientific Institutes by Eric Lusito, and a book of time travel romance stories called...
