No query strings here either from ~timo

A couple of weeks ago, Chris Morgan published "I've banned query strings". I read it, liked it and then did roughly the same thing on my own site - with two deliberate differences.

Chris's opening sums up the motivation better than I could:

I don't like people adding tracking stuff to URLs. Still less do I like people adding tracking stuff to my URLs.

[...] UTM parameters are for me to use, not you. Leave my URLs alone.

From chrismorgan.info/no-query-strings

The premise is the same here. A ?utm_source=... or ?ref=... tacked onto one of my URLs by some intermediary is, at best, noise I never asked for and, at worst, a tracker the referrer is using to nudge my visitor's behaviour into a funnel. I'd rather refuse to serve those requests than pretend they're a legitimate way to reach a page on my site. I'm also a fan of having one true canonical URL for each of my pages.

Where I differ from Chris

1. cache-busters like ?v= are allowed

Chris went for a true blanket ban - including breaking old cache-busting URLs like ?t=... and ?h=... that his site used to serve. That's likely the right call when none of those URLs are still in circulation. In any case, those were probably only ever used for static assets, which people don't tend to bookmark.

My situation is slightly different: I actively use ?v= as a cache buster on assets I serve today. The very HTML you're reading links to main.css?v=1. I use it so that I can set a very high Cache-Control: max-age=... on static assets. If I matched Chris's strictness I'd have to either give up on query-string cache busting (and switch to fingerprinted filenames or Cache-Control juggling), or break my own page load on every bump.

So my rule is a narrow allowlist: everything is blocked, except ?v=. The matcher is intentionally strict - the whole query string must be exactly v= followed by digits, nothing else, no extra parameters smuggled in alongside.

(no_query_strings) {
    @bad_query `{http.request.orig_uri}.contains("?") && !{http.request.uri.query}.matches("^v=[0-9]+$")`
    error @bad_query 403
}

The first clause uses orig_uri so a bare trailing ? still trips the ban - Caddy's {query} placeholder can't distinguish "absent" from "empty", and a lone ? deserves the same treatment as a parameter list. The second clause uses the canonical {uri.query} because Caddy doesn't expose .query as a sub-key on orig_uri - the rewrite never touches the query so the two are equivalent here.
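For illustration, the two clauses can be mirrored in a few lines of Python. This is only a sketch of the matcher's logic, not something the site runs: the function name `is_blocked` is made up here, and it assumes the argument is the raw request URI exactly as the client sent it.

```python
import re

# Same pattern as in the Caddyfile matcher: "v=" followed by digits only.
# fullmatch() plays the role of the ^...$ anchors.
ALLOWED_QUERY = re.compile(r"v=[0-9]+")


def is_blocked(orig_uri: str) -> bool:
    """Return True when a request URI would end up on the 403 branch."""
    # Clause 1: no "?" at all means there is nothing to object to.
    if "?" not in orig_uri:
        return False
    # Clause 2: everything after the first "?" must be exactly v=<digits>.
    # A bare trailing "?" yields an empty query, which does not match.
    query = orig_uri.split("?", 1)[1]
    return ALLOWED_QUERY.fullmatch(query) is None


if __name__ == "__main__":
    for uri in ["/~timo/", "/main.css?v=1", "/~timo/?", "/~timo/?utm_source=x"]:
        print(uri, "->", "403" if is_blocked(uri) else "served")
```

The interesting cases are the bare `?` and the mixed query such as `?v=1&utm_source=x`: both are rejected, because the allowlist applies to the query string as a whole rather than to individual parameters.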

2. 403 Forbidden, not 414 URI Too Long

Chris picked 414 URI Too Long, and is upfront about it:

You could argue that I'm abusing 414 URI Too Long. I respond that it's funnier this way.

From Chris's ban page

It's indeed nice, but I wanted to pick a status code that I can defend on RFC grounds rather than vibes. Here's how I read RFC 9110 and RFC 7725 for this case:

400 Bad Request

The server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax). The request isn't malformed; ?utm_source=x is perfectly well-formed. Too generic.

403 Forbidden

The server understood the request but refuses to authorize it. That is exactly what's happening: I understood the request, the URL would otherwise resolve, and I am refusing on policy grounds. The spec also explicitly notes that a server can describe that reason in the response content, which is what the body of the 403 page does.

404 Not Found

Misleading. The resource exists; I just won't serve it via this URL. Also has unpleasant SEO and caching side effects.

414 URI Too Long

A refusal to service the request because the request-target is longer than the server is willing to interpret. The objection is about length, not policy or content. That said, I agree one could argue that everything after the canonical URL is too long.

451 Unavailable For Legal Reasons

Not legal reasons. Just personal taste.

403 is the cleanest semantic match. One caveat: "refuses to authorize it" is the older RFC 7231 wording; RFC 9110 now says the server "refuses to fulfill it". Either way, I'm not 100% certain, but I'd read "authorize" in the more general sense of "permit" rather than only HTTP authentication or authorization.

(Okay, Chris is right that 414 is funnier, though. I'll concede that one.)

What it looks like

When a request comes in with any query string other than ?v= followed by digits, Caddy short-circuits with a 403 and serves a small explainer page. You can try it yourself: furrer.life/~timo/?utm_source=this-post. The page tells you what happened, why, and offers the same URL without the query string.
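How the explainer page builds its "same URL without the query string" link isn't shown here. As a minimal sketch of the idea, assuming the full URL is available as a string, Python's standard library can drop the query component like this (`without_query` is a hypothetical helper name):

```python
from urllib.parse import urlsplit, urlunsplit


def without_query(url: str) -> str:
    """Return the URL with its query string removed and everything else kept."""
    parts = urlsplit(url)
    # Replace only the query component; scheme, host, path and fragment survive.
    return urlunsplit(parts._replace(query=""))


if __name__ == "__main__":
    print(without_query("https://furrer.life/~timo/?utm_source=this-post"))
```

A bare trailing `?` is dropped as well, since `urlsplit` reports it as an empty query.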

If you want to follow Chris down this path, his post links the relevant Caddyfile snippet on his site. Alternatively, use mine above, which is a small variant of his with the extra allowlist clause.
