Show HN: Sanitising Email

mike-cardwell1 pts0 comments

Sanitising Email - grepular.com


For the past 15 years or so, I've been using a simple Perl script that I wrote called gpgit to encrypt email stored on my mail server, both incoming and outgoing. It just takes a raw email on stdin and writes the modified email to stdout. It has always been in the back of my mind that I could do a lot more than just encrypting an email, from a privacy and security perspective, but I never felt I had the time to do the project justice. That is, until LLMs came onto the scene.

I have just made available an open source project called Sanimail which I have been using on my own email for a little while now. It does what gpgit does, plus a lot more:

Inlining remote content

Policy based sanitising of html/css/svg parts

Removal of tracking params in links

Disarming of privacy invading headers

PGP and S/MIME encryption/decryption/signing

Other noteworthy options

Hardening

Deployment

Usage tips

Project status

Inlining remote content

One of the long-standing issues with email has been pixel tracking via remote images referenced in email HTML parts:

<img src="https://example.com/pixel.png?emailId=U21hcnQgYXkCg">

You view an email, the image is fetched, the sender can now know that you read the email, when, and what your IP was at the time. Some of the larger email providers have started addressing this problem by replacing these links with links to their own proxies:

<img src="https://proxy.example.net/?url=https%3A%2F%2Fexample.com%2Fpixel.png%3FemailId%3DU21hcnQgYXkCg">

So when you view the email, the sender only sees the proxy's IP, not yours. Some even claim to fetch remote content and cache it as soon as the email is delivered (Apple lies about doing this), so that the sender doesn't even know whether you actually read the message, let alone when, or from where.
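To make the rewriting concrete, here is a minimal Python sketch of how such a proxy link can be built and unpacked again. The proxy endpoint and the `url` query parameter are taken from the example above; the function names are hypothetical and this is not how any particular provider implements it.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Proxy endpoint from the example above (a placeholder, not a real service).
PROXY_BASE = "https://proxy.example.net/"


def proxied_url(original: str) -> str:
    """Wrap a remote URL so that a client fetches it through the proxy."""
    # safe="" makes quote() percent-encode ':', '/', '?' and '=' as well,
    # so the whole original URL fits inside a single query parameter.
    return f"{PROXY_BASE}?url={quote(original, safe='')}"


def original_url(proxy_link: str) -> str:
    """Recover the wrapped URL, as the proxy would before fetching it."""
    return parse_qs(urlsplit(proxy_link).query)["url"][0]
```

The tracking identifier survives the round trip, so the sender can still tell which email the fetch belongs to. The proxy only hides who fetched it and from where.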

I host my own email and I wanted this functionality for myself, so I added it to Sanimail. Technically, my solution is better because the download is permanent: the fetched content is attached to the email itself. The proxy solutions created by the big mail providers will expire content from their cache, so looking at an older email causes the content to be fetched again.

$ sanimail --remote-inline out.eml

This searches the HTML, CSS and SVG in the email for URLs that would be fetched, fetches them, attaches them to the email, and then updates each link to refer to the attachment instead of the remote URL. There are a whole bunch of limits, timeouts and image optimisations to make this work well, with corresponding command line options. You can even proxy through Tor if you want to confuse the sender some more: --remote-fetch-proxy socks5h://127.0.0.1:9050
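The find, fetch, attach and rewrite steps can be sketched roughly as follows, in Python with only the standard library. This is an illustration of the general technique, not Sanimail's code. It assumes a bare HTML body, handles only double-quoted `<img src="...">` attributes via a regex (no CSS, SVG, limits or timeouts), and takes a caller-supplied `fetch` function. The names `inline_images`, `fetch` and the `inline.invalid` domain are all hypothetical.

```python
import re
from email.message import EmailMessage
from email.utils import make_msgid
from typing import Callable

# Matches <img ... src="http(s)://..."> and captures the URL separately.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")(https?://[^"]+)(")', re.IGNORECASE)


def inline_images(html: str, fetch: Callable[[str], tuple[bytes, str]]) -> EmailMessage:
    """Return a multipart/related message with remote images attached inline.

    fetch(url) must return (data, mime_type), e.g. (b"...", "image/png").
    """
    cids: dict[str, str] = {}                 # URL -> Content-ID without <>
    blobs: dict[str, tuple[bytes, str]] = {}  # URL -> fetched (data, mime type)

    def swap(match: re.Match) -> str:
        url = match.group(2)
        if url not in cids:  # fetch each distinct URL only once
            cids[url] = make_msgid(domain="inline.invalid")[1:-1]
            blobs[url] = fetch(url)
        return f"{match.group(1)}cid:{cids[url]}{match.group(3)}"

    msg = EmailMessage()
    msg.set_content(IMG_SRC.sub(swap, html), subtype="html")
    for url, cid in cids.items():
        data, mime = blobs[url]
        maintype, subtype = mime.split("/", 1)
        # add_related() turns the message into multipart/related on first use.
        msg.add_related(data, maintype=maintype, subtype=subtype, cid=f"<{cid}>")
    return msg
```

Passing the fetcher in as a parameter keeps the rewriting logic testable without network access, and it is the natural place to put a proxy, byte caps and timeouts in a fuller implementation.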

--remote-inline Fetch remote images and @font-face fonts and attach them inline (cid:); needs network

--remote-fetch-proxy string Route remote fetches through a SOCKS5/HTTP proxy with remote DNS (works with Tor/.onion)
--remote-fetch-proxy-password string Password for --remote-fetch-proxy authentication (visible in argv; prefer --remote-fetch-proxy-password-file)
--remote-fetch-proxy-password-file string Read the --remote-fetch-proxy password from this file (trailing newline trimmed)
--remote-fetch-proxy-user string Username for --remote-fetch-proxy authentication

--remote-img-deanimate Flatten a fetched animated GIF/APNG to its resting frame
--remote-img-deanimate-cap duration Wall-time cap on de-animating one image; on expiry keep the frame composited so far (0 = unlimited) (default 500ms)
--remote-img-jpeg-quality int Recompress fetched JPEGs at quality 1-100 (0 = off; re-encodes forced by other flags use 80)
--remote-img-max-height int Downscale fetched raster images taller than N px (0 = no limit)
--remote-img-max-ram size Max approx peak RAM per decoded image (0 = unlimited; e.g. 384MiB) (default 402653184)
--remote-img-max-width int Downscale fetched raster images wider than N px (0 = no limit)
--remote-img-optimise Convert fetched static images to JPEG, or losslessly recompress PNG, when smaller

--remote-item-max-bytes size Per-fetch byte cap (0 = unlimited; e.g. 8MiB) (default 8388608)
--remote-max-bytes size Total remote-fetch byte budget per message (0 = unlimited; e.g. 16MiB) (default 16777216)
--remote-max-count int Max distinct remote URLs fetched per message (0 = unlimited) (default 42)
--remote-max-parallel int Max concurrent remote fetches per message (0 = unlimited) (default 16)
--remote-max-parallel-per-host int Max concurrent remote fetches to one host (0 = unlimited) (default 6)
--remote-timeout duration Per-fetch timeout for remote fetches (default 15s)
--remote-total-timeout duration Aggregate remote-fetch budget per message (0 = unlimited) (default 45s)

--remote-neutralize-failures string Neutralize failed-fetch image URLs: gone (404/410), permanent (+4xx/unusable, default), all (+403/transient) (default "permanent")
--remote-user-agent string User-Agent sent on remote fetches

This of course makes emails larger as...
