Look Ma No HTTP_proxy

alexellisuk1 pts0 comments

Look ma! No HTTP_PROXY! - SlicerVM Blog

Proxies are inevitable when it comes to filtering egress traffic and credential injection, but can we make the configuration go away?

I first learned about Squid many years ago as a lad at school. It turned out that the IT administrators had set up Squid to block various sites deemed unproductive, or out of bounds for education.

The thing was, I found a bypass and shared it with my fellow pupils, so we had full, unfettered Internet access, just like an AI agent gets when it's running on your host computer. Over time, the teachers found out, and eventually called me and some others into assembly to reprimand us.

From that day on, we couldn't bypass the proxy by going into Netscape Navigator and blanking out the HTTP_PROXY field. And I think they must have learned about transparent proxying on that day.

Two ways to HTTP_PROXY

We have two options to work with a proxy within a VM. The first is to use environment variables - which is fine for proxy-aware programs, and automation. The second is to implement a transparent proxy.

Environment variables and proxy-aware programs

For Slicer, we recommend blocking all network access for restricted VMs, then adding back in the proxy's IP with its two ports: 3128 and 3129. With that approach, you can't have some teenage Alex Ellis come up and bypass your filtering policies.

Your applications then need to be proxy aware - and most are these days, but there are exceptions.

Proxy aware means reading and respecting the HTTP_PROXY, HTTPS_PROXY, and NO_PROXY environment variables. Any plaintext endpoint (http://) uses HTTP_PROXY, and any TLS endpoint (https://) uses HTTPS_PROXY. Finally, NO_PROXY is key for local addresses that should bypass the proxy, such as a Docker container that's not reachable from the proxy itself.
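To make the selection logic concrete, here is a minimal sketch of how a proxy-aware client might choose between the three variables. `pick_proxy` is a hypothetical helper written for illustration, not part of Slicer or curl, and real clients differ in the details (for example, some accept CIDR ranges in NO_PROXY, and this sketch ignores IPv6 literals).

```shell
#!/bin/sh
# Hypothetical helper: report which proxy a typical proxy-aware client
# would pick for a URL. The body runs in a subshell, "( ... )", so the
# IFS and globbing changes do not leak into the caller.
pick_proxy() (
  url="$1"

  # Reduce the URL to a bare host: drop the scheme, the path, then any port.
  # (Simplification: IPv6 literals are not handled.)
  host="${url#*://}"
  host="${host%%/*}"
  host="${host%%:*}"

  # NO_PROXY is a comma-separated list of hosts and domain suffixes.
  IFS=','
  set -f
  for entry in ${NO_PROXY:-${no_proxy:-}}; do
    entry="${entry# }"   # tolerate a space after the comma
    entry="${entry#.}"   # treat ".example.com" like "example.com"
    [ -n "$entry" ] || continue
    case "$host" in
      "$entry"|*."$entry") echo "direct"; return 0 ;;
    esac
  done

  # Otherwise the URL scheme decides which variable applies.
  case "$url" in
    https://*) echo "${HTTPS_PROXY:-${https_proxy:-direct}}" ;;
    http://*)  echo "${HTTP_PROXY:-${http_proxy:-direct}}" ;;
    *)         echo "direct" ;;
  esac
)
```

With HTTP_PROXY set to http://192.168.122.1:3128 and NO_PROXY set to localhost, calling `pick_proxy http://wikipedia.org` prints the proxy URL, while `pick_proxy http://localhost:8080` prints `direct`.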

Things like curl will generally just pick it up.

    export http_proxy=http://192.168.122.1:3128
    curl -i http://wikipedia.org

Note that curl only reads the lowercase http_proxy for plaintext HTTP; it accepts the other variables in either case.

Curl also has a specific --proxy flag (-x for short):

    curl -x http://192.168.122.1:3128 -i http://wikipedia.org

And things can get more complicated than this. For instance, when using HTTPS_PROXY, if the proxy itself uses a cert the VM does not trust, then you need to use --proxy-insecure (or better, --proxy-cacert or --proxy-capath to point at the proper bundle).

Using a custom CA is not hard, but it does involve management, rotation, and distribution, all of which present their own challenges. On Debian-like systems, a certificate or bundle is generally copied to /usr/local/share/ca-certificates, the user runs sudo update-ca-certificates to merge it into /etc/ssl/certs, and the system will more or less trust it. Sometimes Node.js can be picky, so you may want to set the NODE_EXTRA_CA_CERTS environment variable.
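As a rough sketch of those manual steps on a Debian or Ubuntu guest, where update-ca-certificates reads *.crt files from /usr/local/share/ca-certificates and regenerates /etc/ssl/certs: `install_ca` and the `proxy-ca.pem` filename are hypothetical names used for illustration, and this is not how Slicer itself injects bundles.

```shell
#!/bin/sh
# Sketch for Debian/Ubuntu guests. install_ca is a hypothetical helper;
# run it as root when targeting the default directory.
install_ca() {
  cert="$1"
  dest_dir="${2:-/usr/local/share/ca-certificates}"

  # update-ca-certificates only picks up files ending in .crt
  name="$(basename "$cert")"
  name="${name%.*}.crt"

  install -m 0644 "$cert" "$dest_dir/$name"

  # Regenerates /etc/ssl/certs; its chatter goes to stderr so that
  # stdout carries only the installed path.
  update-ca-certificates >&2

  echo "$dest_dir/$name"
}
```

Because the function prints the installed path, the same PEM file can be handed to Node.js, for example with `NODE_EXTRA_CA_CERTS="$(install_ca ./proxy-ca.pem)"; export NODE_EXTRA_CA_CERTS`.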

Fortunately, Slicer's CA infrastructure sets up, maintains and injects CA bundles into VMs (when enabled), meaning the proxy's cert and any leaf certificates it mints for different sites are deemed valid.

Where things get slippery is with Docker and Kubernetes. With containers, you not only have a daemon that may need a proxy configured to be able to pull from registries, but you also have containerised workloads with their own root filesystems and trust bundles. On Kubernetes, you can inject a CA as a ConfigMap or Secret, and point your application at it. We do this for OpenFaaS and it's fine. But during builds things get even harder: you run a step like RUN apt install nginx, and now the layer you're working within also needs to trust your CA.

There is no clear or clean solution for this.

Waving goodbye to HTTP_PROXY

Environment variables are great for bots, functions, and relatively static, repetitive tasks. But when a human or agent is involved, we're more likely to trip over them and find edge-cases.

So how can we wave goodbye to the HTTP_PROXY and HTTPS_PROXY environment variables?

We go back to my alma mater, and redirect all outbound traffic to the proxy at the system level. Now they likely did this via routing tables, blocking all direct access as belt and braces.

I recently learned of a very well known CI provider claiming to implement full traffic blocking for GitHub Actions. The best part? They set up the rules in the guest - and we all know CI is useless without root access to install packages and perform privileged operations.

Why does this vendor make me smile? Well, it's just like that teenage me removing the proxy setting from the browser.

All any script, malware, or vengeful employee needs to do is run sudo iptables -F, and your "secure CI runner" now has no filtering whatsoever.

Really, these rules have to be enforced on the host side, so the workload cannot simply say "nope." So it may surprise you to see us doing something similar with Slicer, but with a twist.

Because we recommend users explicitly drop all traffic, there is nothing you can do but use the proxy to egress, whether via an environment variable or via iptables rules that redirect traffic to it.
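For the iptables route, a generic sketch of in-guest redirection might look like the following. `redirect_rules` is a hypothetical helper that only prints the rules, and mapping port 80 to 3128 and port 443 to 3129 is an assumption made for illustration; the proxy also has to be configured to accept intercepted traffic.

```shell
#!/bin/sh
# Hypothetical helper: print (not apply) NAT rules that rewrite a guest's
# outbound web traffic so that it lands on the proxy instead.
# Arguments: proxy IP, port for intercepted HTTP, port for intercepted HTTPS.
redirect_rules() {
  proxy_ip="$1"
  http_port="$2"
  https_port="$3"

  # Locally generated packets traverse the nat table's OUTPUT chain.
  printf '%s\n' \
    "iptables -t nat -A OUTPUT -p tcp --dport 80 -j DNAT --to-destination ${proxy_ip}:${http_port}" \
    "iptables -t nat -A OUTPUT -p tcp --dport 443 -j DNAT --to-destination ${proxy_ip}:${https_port}"
}
```

Running `redirect_rules 192.168.122.1 3128 3129 | sudo sh` would apply the rules. If something in the guest flushes them, the traffic is not freed: with everything else dropped outside the VM, it simply has nowhere to go.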

In userdata, you can just...
