Developing Statically Typed Programming Language · Minko Gechev's blog
In this blog post we’ll go through a sample implementation of a type checker, interpreter and a transpiler for a basic purely functional programming language, which is based on the lambda calculus. We will do a “full-stack” programming language development by going through formal definition of the language’s syntax, semantics and type system. After that we’ll demonstrate how we can “translate” these definitions to JavaScript.
Although the article doesn’t require any mathematical background, it’ll be useful to have high-level understanding of how compilers work. The article is inspired by the book “Types and Programming Languages”.
The source code of the implementation can be found at my GitHub profile.
Syntax
The syntax of the language is going to be quite simple.
Our programs will belong to the language defined by the following grammar:
t ::=<br>x # variable<br>λ x: T → t # called an abstraction by lambda calculus. In JavaScript we call it a function.<br>t t # application<br>true # true literal<br>false # false literal<br>if t then t else t # conditional expression<br>succ t # returns the next natural number when applied to `t`<br>prev t # returns the previous natural number when applied to `t`<br>iszero t # returns `true` if the argument after evaluation equals `0`, otherwise `false`<br>0 # symbol representing the natural number zero
The abstraction nonterminal allows us to declare functions with only one argument of type T and an expression t as a body. The syntax of the abstractions involves the unicode symbols → and λ . In case you have a Greek keyboard layout λ will be easy to type, however, in order to quickly type → you will need some kind of a keyboard shortcut. In other words, the syntax of our language is not very ergonomic.
Note that there’s no syntax construct for expressing negative numbers and on top of that succ and pred produce natural numbers. In order to be consistent with the grammar the result of pred 0 will be 0.
Regarding the type system, we will have two primitive types:
T ::= Nat | Bool
Notice that in contrast to other statically typed programming languages, such as Haskell or Elm, we do not have a syntactical construct for declaring the type of a function. In Haskell the semantics of the annotation T1 → T2 is a function which accepts one argument of type T1 and returns result of type T2. In our grammar, we do not have this explicit type annotation because we’re going to apply type inference in order to guess a function’s type based on the type of its argument and body.
Playground
Now after we’re familiar with the syntax, we can try to write some code. In the textarea below enter your program and click on the button “Evaluate” to see the result:
(λ a: Nat → a)<br>(if<br>(λ a: Nat → iszero a) pred 0<br>then<br>(λ a: Nat → succ a)<br>else<br>(λ a: Nat → pred a))
Evaluate
Result:
Semantics
Before going any further, lets show a few sample programs belonging to the programming language that we’re going to develop.
(λ a: Nat → succ succ a) 0
In the example above we declare an anonymous function which accepts a single argument called a of type Nat. In the body of the function we apply twice the predefined function succ to a. The anonymous function we apply to 0. This is going to produce 2 as final result.
The equivalent of this program in JavaScript will be:
(a => succ(succ(a)))(0)
Where succ is defined as:
const succ = a => a + 1
A more complicated program in our programming language will look like this:
λ f: Nat →<br>(λ g: Nat → f) 0<br>) (succ 0)
Lets explain the evaluation step by step:
Reduce the expression with left side (λ f: Nat → (λ g: Nat → f) 0) and right side (succ 0).
λ f: Nat →<br>(λ g: Nat → f) 0<br>) (succ 0)
Substitute g with 0 (apply beta reduction) in the expression (λ f: Nat → (λ g: Nat → f) 0) and get (λ f: Nat → f).
λ f: Nat → f<br>) (succ 0)
Increment 0 in the expression succ 0 and pass the result to (λ f: Nat → f).
λ f: Nat → f<br>) 1
Return result.
Formally, we can show the small-step semantics of our language in the following way:
Small-step semantics
In this section we’ll take a look at the small-step semantics of the language.
Lets suppose σ is the current state of our program, which keeps what value each individual variable has.
Variables
1)<br>─────────────<br>(x, σ) → σ(x)
1) simply means that for variable x we return its value kept in the state σ.
Built-in functions
The built-in functions in our language are succ, pred and iszero. Here’s the small-step semantics of succ and pred:
1)<br>t1 → t2<br>─────────────────<br>succ t1 → succ t2
2)<br>pred 0 → 0
3)<br>pred succ v → v
4)<br>t1 → t2<br>─────────────────<br>pred t1 → pred t2
1) simply means that if expression t1, passed to succ evaluates to t2, succ evaluates to succ t2.
2) we define that the result of pred 0 equals 0 in order to keep the language...