YAML? That's Norway Problem


What is yaml

Yaml is a well-known data serialization language designed for human readability. It’s a popular choice for configuration files and metadata. Here’s a simple example:

# project.yaml

title: Nonoverse
description: Beautiful puzzle game about nonograms.
link: https://lab174.com/nonoverse
countries:
- DE
- FR
- PL
- RO

Let’s verify that the above example parses correctly.

We’ll use Python [1] with PyYaml [2] version 6.0.3 (the latest version as of this writing). First, let’s install it:

python3 -m pip install pyyaml==6.0.3

Now let’s write a simple script to parse the yaml file:

# python-pyyaml.py

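import json
import yaml

# Load the document with the safe loader (no arbitrary object construction).
with open("project.yaml", "r", encoding="utf-8") as f:
    data = yaml.safe_load(f)

# Dump as JSON so the resulting Python types are easy to see.
print(json.dumps(data, indent=2))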

Running python3 python-pyyaml.py produces this output:

"title": "Nonoverse",<br>"description": "Beautiful puzzle game about nonograms.",<br>"link": "https://lab174.com/nonoverse",<br>"countries": [<br>"DE",<br>"FR",<br>"PL",<br>"RO"

So far everything behaves as expected.

[1] As of January 2026 Python is the world’s 4th most popular programming language according to a 2025 Stack Overflow Survey (archive).

[2] PyYaml is Python’s most popular yaml library and a top 20 Python library overall in the last month according to PyPI Stats (archive). It is also an “official” yaml library in the sense that its source code is hosted in a GitHub repository owned by the yaml GitHub account; see: Canonical source repository for PyYaml.

The Norway problem in yaml

Now let’s change the original yaml file and add Norway’s two-letter iso country code to the existing list:

countries:
- DE
- FR
- NO
- PL
- RO

Using the same parsing method, the file now yields this result:

"title": "Nonoverse",<br>"description": "Beautiful puzzle game about nonograms.",<br>"link": "https://lab174.com/nonoverse",<br>"countries": [<br>"DE",<br>"FR",<br>false,<br>"PL",<br>"RO"

Note that NO has been replaced with false. This is unexpected. Nothing about the context suggests a boolean should appear here. The NO literal sits in a list of country codes like FR or PL and appears similar in form. The problem, of course, is that “no” is also an English word with a negative meaning.
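The same behavior can be reproduced without a file. Below is a minimal sketch, assuming PyYaml 6.x and its safe_load function; the sample list is made up for illustration and is not part of project.yaml. It prints each loaded value next to the Python type it resolved to.

```python
import yaml

# Illustrative input (an assumption, not the article's file): plain, unquoted scalars.
SAMPLE = """\
- DE
- NO
- no
- off
- Yes
"""

items = yaml.safe_load(SAMPLE)

for item in items:
    # Plain scalars that match PyYaml's implicit boolean words load as bool, not str.
    print(f"{item!r:8} {type(item).__name__}")
```

Only the first entry stays a string here; the other four are resolved as booleans because they are unquoted and match one of the boolean words.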

This feature was originally added to allow writing booleans in a more<br>human readable way, e.g.:

platforms:
  iPhone: yes
  iPad: yes
  AppleWatch: no

This gets parsed as:

"platforms": {<br>"iPhone": true,<br>"iPad": true,<br>"AppleWatch": false

The idea was that configuration files should read like natural language. In practice this behavior proved problematic, becoming the notorious Norway problem in yaml.

One workaround is to escape the string, like this:

countries:
- DE
- FR
- "NO"
- PL
- RO

With quotes, the file parses as expected:

"title": "Nonoverse",<br>"description": "Beautiful puzzle game about nonograms.",<br>"link": "https://lab174.com/nonoverse",<br>"platforms": {<br>"iPhone": true,<br>"iPad": true,<br>"AppleWatch": false<br>},<br>"countries": [<br>"DE",<br>"FR",<br>"NO",<br>"PL",<br>"RO"

Many articles about yaml’s Norway problem stop here, presenting quoting as the canonical fix. There is more.

Yaml’s history

To understand today’s state of the Norway problem we’ll first look at<br>how yaml evolved.

May 2001 – Yaml first pass specification

At this time, yaml was more of a concept than a finished language. It looked a bit different, though somewhat recognizable. Below is a partial example from the original specification; there are more in the full document, sadly none with boolean values.

buyer : %
    address : %
        city : Royal Oak
        line one : 458 Wittigen's Way
        line two : Suite 292
        postal : 48046
        state : MI
    family name : Dumars
    given name : Chris

The document makes no mention of parsing no to false. The “Serilization Format / bnf” section even contains a typo and a “to do” note [3]:

This section contains the bnf [4] productions for the yaml syntax. Much to do…

[3] Full first pass specification – archived link.

[4] Bnf stands for “Backus–Naur form”, a notation system for syntax definition (Wikipedia).

January 2004 – Yaml v1.0 final draft

This version describes various ways of presenting scalars [5], including both quoted scalars and plain scalars with implicit typing. This is what we’re after.

Version 1.0 defined only sequence, map, and string as mandatory types [6]. The rest were optional, but a reference specification existed. That reference specification for the optional boolean type included English word format. Supported words were: true/false, on/off, and also yes/no [7].

This allows the Norway problem to appear, even though following that part of the reference specification is described as optional.

– Bonus: implicit typing can be overridden with explicit tags – we’ll talk about this later.

– Bonus: single sign characters, i.e. + and -, should also be treated as true and false; even more so, as they are described as the canonical form [8]!
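As a quick illustration of the first bonus point, an explicit !!str tag forces a plain scalar to stay a string. This is a minimal sketch using PyYaml 6.x, which is a much later implementation and not the 1.0 reference itself, so it only demonstrates the general idea of a tag overriding implicit typing. The two-key document is made up for the example.

```python
import yaml

# Illustrative document (an assumption): the same plain scalar, untagged and tagged.
DOC = """\
implicit: NO
explicit: !!str NO
"""

data = yaml.safe_load(DOC)

# The untagged value is resolved implicitly; the tagged one keeps its text.
print(data)
```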

[5] “A scalar data type, or just scalar, is any non-composite value. Generally, all basic primitive data types are considered scalar.” Source: Wikipedia.

Following is a...
