The S in Interoperability

Frederik Braun: The S in interoperability

This is a blog post about standards, their proliferation and the issues that may arise.

My first involvement with standards was just as a reader, trying to better understand complicated code or unexpected behavior in a protocol. After a while, I also got involved and helped clarify certain things to ensure that implementations align on the same behavior in edge cases.

Eventually, I found myself co-editing a specification: Subresource Integrity (SRI), which was published as a W3C Recommendation in 2015. The core idea behind SRI is that you include third-party JavaScript together with a SHA-2 digest of the expected file. If the downloaded file does not match the expected digest, the browser will not execute the script. This allows using a fast CDN for JavaScript without giving it full control over the scripts on your page, which reduces the security risk.

The standard format for these digests is sha(size)-(base64 encoding of the digest). While computing the hash digest is rather straightforward, base64 comes in two encoding alphabets: first, the standard alphabet a-zA-Z0-9+/, and second, the URL-safe variant (base64url), which uses a-zA-Z0-9_-. The specification examples all used the former.
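To make the two alphabets concrete, here is a minimal Python sketch that builds a digest string in the format described above. The function name `sri_digest` and the choice of SHA-384 are assumptions for illustration; this is not code from the specification or from any browser.

```python
import base64
import hashlib


def sri_digest(content: bytes, urlsafe: bool = False) -> str:
    """Build a 'sha384-<base64 digest>' string (hypothetical helper)."""
    digest = hashlib.sha384(content).digest()
    # b64encode uses the alphabet with '+' and '/';
    # urlsafe_b64encode substitutes '-' and '_' for those two characters.
    encode = base64.urlsafe_b64encode if urlsafe else base64.b64encode
    return "sha384-" + encode(digest).decode("ascii")


script = b"console.log('hello');"
print(sri_digest(script))                # standard alphabet
print(sri_digest(script, urlsafe=True))  # URL-safe alphabet
```

Both strings encode the same 48 digest bytes. They differ only where the standard encoding happens to contain `+` or `/`, which is why the mismatch can go unnoticed for many inputs.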

In 2025, roughly ten years after publication, we still found a bug. While investigating a compatibility report about Firefox not properly supporting a website, we found that the core issue was actually with a different browser. The other browser liberally accepted both types of encoding, which resulted in websites expecting support for base64 and base64url interchangeably. The page did not work in Firefox because Firefox did not accept all of the hashes the website wanted the browser to check, revealing a minor security issue.

The real fix would have been for the standard to clarify that the base64url variant is incorrect, and for the other browser engine to change its behavior.

But due to (somewhat unrelated) issues around the proliferation of standards, web compatibility and the unfortunate market dominance of certain browsers, we took the other road. To support existing web content, we changed the standard to acknowledge that both types of encoding are valid representations.

This example shows that it can take multiple years for subtle differences to appear. Interoperable specifications can establish a shared understanding along a "happy path", but not necessarily in adversarial settings. In addition, standards need continuous maintenance and active stakeholders who ensure that implementations remain interoperable and secure over time.

From specification to standard

At first, a specification is just a write-up, an idea of how something could be better: how it should behave, how it works, and what the data structures, the algorithms and their interactions look like. Anyone can come up with a grammar, a parser and a resulting data structure.

To become a standard, this specification needs shared agreement and has to be widely and consistently implemented. This works best with iterative co-design of the spec and the implementations, along with intense discussion of corner cases. Some may go further and use shared test suites.

This will lead to interoperability (interop), but still requires constant maintenance and observation of the ecosystem beyond individual implementations. While interop is asymptotic and requires a shared agreement over time, security demands understanding - a broader reach that requires the inspection of limitations and subtle boundaries.

This deeper level of understanding is often missing when implementations consider syntax "simple enough" without reading the spec. The base64 SRI example is just one example, but there are more:

Many people have written their own parsers for text-based languages. You may have seen code that parses HTML with regular expressions. Other great examples of "easily" parsed languages are XML, JSON and YAML.

But these implementations often make different assumptions, leading to subtle incompatibilities or even security flaws.

Parser Differentials

More practically, let's look at an issue with JSON to demonstrate the impact of handling input that is ostensibly simple. Let's examine this JSON string and the resulting data structure:

{
  "test": 0,
  "test": 1
}

When parsed into an object obj, what do you think obj.test will return? Most JSON parsers are so liberal that they will happily consume two dictionary keys with the same name "test". One implementation may simply assign obj.test twice: first with 0, then overwriting it with 1. Another one might check for existing keys and silently reject the second "test" key, keeping the first one.
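Both behaviors can be reproduced with a single parser. Python's built-in `json` module keeps the last value for a repeated key by default. The hypothetical `keep_first` hook below, passed via the module's `object_pairs_hook` parameter, emulates an implementation that keeps the first one.

```python
import json

DOCUMENT = '{"test": 0, "test": 1}'


def keep_first(pairs):
    """Build a dict in which the first occurrence of each key wins."""
    result = {}
    for key, value in pairs:
        # setdefault only stores the value if the key is not present yet.
        result.setdefault(key, value)
    return result


last_wins = json.loads(DOCUMENT)  # default: later keys overwrite earlier ones
first_wins = json.loads(DOCUMENT, object_pairs_hook=keep_first)

print(last_wins["test"], first_wins["test"])  # prints: 1 0
```

Two components that each "just parse JSON" can therefore disagree about the value of obj.test for the very same bytes.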

The lack of rigor in the original description of JSON as a "subset of JavaScript" was already acknowledged and raised as problematic in the JSON RFC (which came much later, in 2017). But to this day, many implementations allow input with duplicate dictionary keys and show divergent behavior.
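One way to avoid the ambiguity is to refuse such input altogether. The following is a minimal sketch, again using Python's `json` module; the helper names `reject_duplicates` and `strict_loads` are made up for this example and are not part of any standard API.

```python
import json


def reject_duplicates(pairs):
    """Raise instead of silently picking a winner for a repeated key."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


def strict_loads(text: str):
    """Parse JSON, treating duplicate object keys as an error."""
    return json.loads(text, object_pairs_hook=reject_duplicates)


print(strict_loads('{"test": 0}'))  # prints: {'test': 0}
try:
    strict_loads('{"test": 0, "test": 1}')
except ValueError as error:
    print("rejected:", error)  # prints: rejected: duplicate key: 'test'
```

Rejecting the document trades leniency for predictability: every consumer that applies the same rule either sees the same object or sees an error.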

While...
