Bicameral, Not Homoiconic


1 (Weak) Homoiconicity

2 (Strong) Homoiconicity

3 The Parsing Pipeline

4 The Bicameral Analogy

5 Bicameral Syntax

6 Back to Lisps

7 What About Other Languages?

If you spend enough time reading internet discussions of programming languages, you’ll learn that Lispy languages have a special property: they are homoiconic. This property is vested with mystical powers that both enrich Lisps and debase their competitors.

I have programmed in, and built, Lisps since the late 1980s. My blog is called “parenthetically speaking”. And yet I’m here to tell you that this term is mostly nonsense. However, there is something special about Lisps: something far less mystical, but also very powerful and useful. It’s worth understanding what that is and transporting its essence to other languages.

1 (Weak) Homoiconicity

What, supposedly, is homoiconicity? You will hear definitions like “a property of some programming languages that allows programs to be represented as data within the language”, or the same with “represented” replaced by “manipulated”, or more simply “code as data”.

Let’s tease these apart a bit. Consider the following Python code:

hello = 1

This is clearly a program. But can I represent this as a datum within the language? Sure:

'hello = 1'

is a perfectly good representation. (Well, it may be good but it’s not great; we’ll return to that!) Can I manipulate it? Sure, I can concatenate strings to create it:

'hello' + ' = ' + '1'

will produce that program, and

'hello = 1'.split(' ')

will take it apart into constituent pieces.

Does that make Python homoiconic?

Of course, there’s nothing special about Python here. We can use JavaScript to represent and manipulate JavaScript programs, C to do the same to C programs, and so on. Essentially, any programming language with a string datatype seems to be homoiconic. Heck, we didn’t even need strings: we could just as well have represented the programs as numbers (e.g., using Gödel numbering).

One of the traits of a good definition is that it be non-trivial: it must capture some things but it must also exclude some things. It’s not clear that this notion of homoiconicity excludes much of anything.
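To make the point about numbers concrete, here is a minimal sketch in Python. It is not Gödel's prime-power construction; it uses a much simpler stand-in (reading the program's UTF-8 bytes as one large integer), and the function names `program_to_number` and `number_to_program` are made up for illustration.

```python
def program_to_number(source: str) -> int:
    """Encode program text as a single integer via its UTF-8 bytes.

    Assumption: a plain byte encoding stands in for Goedel numbering here.
    """
    return int.from_bytes(source.encode("utf-8"), "big")


def number_to_program(number: int) -> str:
    """Invert program_to_number, recovering the original text.

    Caveat: text that starts with a NUL character would not round-trip,
    because leading zero bytes vanish in the integer.
    """
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big").decode("utf-8")


encoded = program_to_number("hello = 1")
print(encoded)                     # just an integer, i.e. ordinary data
print(number_to_program(encoded))  # the program text again
```

Under this weak reading, even a language with nothing but integers could "represent programs as data", which is exactly why the definition excludes so little.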

2 (Strong) Homoiconicity

But there’s a reasonable objection to what we wrote above. All that we’ve done is write, combine, and take apart strings. But strings are not necessarily programs; strings are just strings, a form of data. Data are data, but programs (entities that we can run) seem to be a separate thing.

How do we turn data into programs? We do need some language support for that. We need something that will take some agreed-upon data representation of a program and treat it like a program, i.e., do whatever the program would have done. Typically, this is a function called eval: it evaluates the datum, performing the effects described in the datum, just as if it were a program. (Note that eval really treats “data as code”, not “code as data”.)

So maybe eval is the real characteristic of homoiconic languages? Maybe. It’s certainly true that eval is a distinctive feature, and some languages have it while others don’t: that is, it non-trivially distinguishes between languages. But it’s worth noting:

Many languages, including Python and JavaScript, have an eval. If they’re all homoiconic, then clearly this isn’t a particularly Lispy trait.

eval interacts poorly with its lexical environment, making it hard even to program with effectively. We showed that JavaScript’s eval is not one but four operations, and that there are eight contexts that determine which of the four to use. This kind of complexity is overwhelming.

The complexity might be worth it if eval were a good idea, but it’s often a bad idea in programs! It makes code statically invisible, making every other aspect of program management (static analysis, compilation, security checking, and more) much, much harder (or, for some important and useful kinds of analysis, impossible).
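The points above can be seen in miniature in Python. This is only a sketch: it uses Python's built-in `exec` (the statement-level sibling of `eval`), and the names `namespace` and `shadow` are made up for illustration.

```python
program = "hello = 1"       # a datum: just a string

# "Data as code": run the string as a program in an explicit namespace.
namespace = {}
exec(program, namespace)
print(namespace["hello"])   # the binding the program created


def shadow():
    hello = 1
    # This runs without error, but it cannot rebind the local variable:
    # the assignment lands in a separate dictionary, not in the function's
    # actual local scope.
    exec("hello = 2")
    return hello


print(shadow())             # still the original value of the local
```

The second half is a small instance of the poor interaction with the lexical environment: nothing in the text of `shadow` tells a reader, or a static analyzer, what the evaluated string will or will not affect.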

This seems like a disappointing way to end: homoiconic languages are ones that have a complex, excessively powerful feature that we probably shouldn’t use, but that is found anyway in many languages that are not Lispy at all…which certainly doesn’t seem to be a good way to describe what makes Lispy languages distinctive.

But this just shows why we shouldn’t be talking about homoiconicity at all. Let’s talk about what’s actually interesting instead.

3 The Parsing Pipeline

Let’s talk briefly about the classical parsing pipeline. For decades, we’ve been taught to think of parsing a program as having two phases: tokenization (sometimes colloquially called “lexing”) followed by parsing (not colloquially called...
