Every JavaScript bundler handles inline script tags wrong

Carter Sande, 2026-07-05

There are two main ways to include JavaScript code on a web page. You can use a <script> tag with a src attribute pointing to a JavaScript file:

<script src="https://mywebsite.example/script.js"></script>

Or you can write your JavaScript code directly inside the <script> tag, which is sometimes called inlining it:

<script>
  console.log("Hello!");
</script>

(You can also write JavaScript inline in event handler attributes like onclick, and twenty years ago browsers used to have ways to embed JavaScript in CSS, but anyway this post is about <script> tags.)

Traditionally, you'd use the src attribute for long scripts that were reused on multiple web pages (so the browser would only need to download the .js file once), and you'd use an inline <script> tag for short scripts that were specific to one page (so the browser wouldn't need to make a separate HTTP request to download them). But there's a growing trend of using inline <script> tags for every script on your page. Game creation tools like Twine have been doing that for a long time, so that you can download the whole game as a single HTML file. A couple of years ago, the SvelteKit web framework came out with a "self-contained apps" mode to do the same thing. And more recently, some AI websites have started encouraging people to create and share "HTML artifacts" that include all the JavaScript as part of the HTML file.

There's a funny pitfall that can happen if you take a bunch of existing JavaScript code and put it into an inline <script> tag: what happens if that JavaScript code contains the string </script>?

<script>
  console.log("</script>");
</script>

The answer is that the browser's HTML parser doesn't care about JavaScript syntax at all, so as soon as it sees the characters </script>, it ends the script. This usually means that the script gets an "unterminated string literal" syntax error, and then everything that comes afterwards gets inserted into the page as text.

This is pretty well-known as a way to do cross-site scripting attacks against React apps, but there's an even funnier variant that the HTML specification calls the script data double escaped state:

<script>
  console.log("<!--<script>");
</script>

If a script contains the string <!--, and then at some point later contains the string <script, then the next time the browser sees a </script>, it won't end the script. (So whatever appears after the inline <script> tag will become part of the script, probably causing a syntax error but maybe creating a cross-site scripting vulnerability.) This is just, like, a weird loophole they added to the HTML specification to improve compatibility with twenty-year-old websites. Hilarious!
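To make that behavior concrete, here is a simplified model of how the HTML parser decides where an inline script ends. It is a sketch rather than a spec-complete tokenizer: findScriptEnd is a hypothetical helper, it expects the text that follows the opening <script> tag, and it ignores end-of-file handling and other details of the real parser.

```javascript
// Hypothetical helper: returns the index of the "</script" that would end an
// inline script, or -1 if nothing in the text would end it.
function findScriptEnd(text) {
  // Lowercase ASCII letters only, so indices still line up with the input.
  const lower = text.replace(/[A-Z]/g, (c) => c.toLowerCase());
  // A tag name only counts if it is followed by whitespace, "/" or ">".
  const endsTagName = (ch) => ch !== undefined && "\t\n\f\r />".includes(ch);

  let state = "data"; // "data" | "escaped" | "doubleEscaped"
  let i = 0;
  while (i < lower.length) {
    if (state === "data" && lower.startsWith("<!--", i)) {
      state = "escaped";
      i += 2; // stay on the dashes so "<!-->" counts as an immediate "-->"
    } else if (state !== "data" && lower.startsWith("-->", i)) {
      state = "data";
      i += 3;
    } else if (
      state === "escaped" &&
      lower.startsWith("<script", i) &&
      endsTagName(lower[i + 7])
    ) {
      state = "doubleEscaped";
      i += 7;
    } else if (lower.startsWith("</script", i) && endsTagName(lower[i + 8])) {
      if (state !== "doubleEscaped") return i; // this one really ends the script
      state = "escaped"; // swallowed: it only leaves the double escaped state
      i += 8;
    } else {
      i += 1;
    }
  }
  return -1;
}

console.log(findScriptEnd('console.log("</script>");')); // 13
console.log(findScriptEnd('console.log("<!--<script>");</script><p>after</p>')); // -1
```

In the second call, the real closing tag is swallowed because the string literal put the parser into the double escaped state first, so the model reports that nothing ends the script.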

You can also combine other JavaScript features to create these sequences of characters outside of a string literal. A less-than sign plus a regular expression literal makes </script>:

if (1</script>/.exec(someString).index) { /* ... */ }

Just like with string literals, as soon as the browser's HTML parser sees those characters next to each other, it'll immediately end the script, and the /.exec(... part will get dumped into the page as text.

You can similarly combine less-than signs, greater-than signs, the "not" operator, and the prefix decrement operator to create stupid but technically valid JavaScript code that includes both of the ingredients for the script data double escaped state:

var x = 5;
var script = 1<!--x;
if (x<script>4) { /* ... */ }

Well, actually, there's another weird loophole that affects this code: Annex B.1.1 of the ECMAScript specification says that JavaScript interpreters inside of browsers should treat the string <!-- as the start of a line comment, similar to // comments, unless the code is inside a JavaScript module. JavaScript interpreters not inside of web browsers get to choose whether or not they do this, so technically this program has two different valid meanings under the ECMAScript specification. (Different syntax highlighting libraries disagree about how to handle this code, too. How does your favorite text editor display it?)

Anyway, it's obviously pretty rare for people to hand-write weird code like this in an inline <script> tag. But it's fairly common for people to use "bundlers" or "build tools" that promise to take all the random junk they got from NPM or ChatGPT or wherever, combine it into a single script, and "minify" it to make it as small as possible. Those tools should definitely avoid causing these kinds of syntax problems, right? How well do they do?
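As a rough way to check a tool's output for the character sequences discussed so far, something like the following sketch could be run over a bundle before inlining it. findRiskySequences is a hypothetical helper, not part of any bundler, and it is deliberately conservative: it flags every <!--, <script, and </script, even in places where the HTML parser would not actually treat them specially.

```javascript
// Hypothetical helper: lists the positions of character sequences that can
// change how an inline <script> element is parsed.
function findRiskySequences(code) {
  const risky = /<!--|<\/?script/gi; // "<!--", "<script", "</script", any case
  return Array.from(code.matchAll(risky), (m) => ({
    index: m.index,
    text: m[0],
  }));
}

console.log(findRiskySequences("if (1</script>/.exec(s).index) {}"));
// [ { index: 5, text: '</script' } ]
```

An empty result means none of these sequences appear in the code. A non-empty result only says where to look, since fixing each match depends on whether it sits in a string, a regular expression, or a run of operators.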

How build tools should handle this stuff

You could imagine a few basic rules that build tools should follow to avoid generating code that has these issues. Things like:

If a regular expression literal starting with script> appears right after a less-than sign, add a space in between them.

If the program contains a less-than sign, followed by a not operator, followed by a prefix decrement operator,...
