Two Years of OCaml

The other day I saw this post on OCaml discussed on Hacker News and Lobsters.

Almost two years ago I rewrote the Austral compiler from Standard ML to OCaml, so I thought I’d share my thoughts on OCaml after using it to write a complex software project: what is good, what is bad, and how it compares, mainly to Haskell.

If this seems overwhelmingly negative, it’s because the things OCaml does right are really just uncontroversial. They’re obviously right and hardly worth pointing out. It’s actually a weirdly optimistic thing: that a language with so many glaring deficiencies stands far above everything else.

Contents

Syntax

Aesthetics

Declaration Order

Comments

Type Specifiers

Generic Types

Type Annotations

Semicolons Work Sometimes

Inconsistencies

Nested Match Expressions

Do Notation

Modules: Better is Worse

Modules Are Better

Modules Are Worse

Equality

Multiple Implementations Are Unnecessary

Semantics

Currying is Bad

Type Inference is Bad

Mutation

Pragmatics

PPX

Tooling

How Do I Profile?

Testing

Minor Complaints

At Least It’s Not Haskell

My OCaml Style

Should You Use OCaml?

Syntax

Yeah, yeah, de gustibus, and people spend way too much time whining about syntax and other superficial issues, rather than focusing on language semantics and pragmatics.

But I’m not a partisan about syntax. I genuinely think code written in C, Java, Lisp, Pascal, and ML can be beautiful in different ways. Some of these complaints will be personal, others will be more objective.

Aesthetics

ML was born as the implementation language of a theorem prover, so naturally the syntax is meant to look like whiteboard math.

And it does look good for math. If you’re writing something like a symbolic differentiation engine:

type expr =
  | Const of float
  | Add of expr * expr
  | Sub of expr * expr
  | Mul of expr * expr
  | Div of expr * expr

let rec diff (e: expr): expr =
  match e with
  (* c' = 0 *)
  | Const _ ->
    Const 0.0
  (* (f + g)' = f' + g' *)
  | Add (f, g) ->
    Add (diff f, diff g)
  (* (f - g)' = f' - g' *)
  | Sub (f, g) ->
    Sub (diff f, diff g)
  (* (fg)' = f'g + fg' *)
  | Mul (f, g) ->
    Add (Mul (diff f, g), Mul (f, diff g))
  (* (f/g)' = (f'g - g'f)/gg *)
  | Div (f, g) ->
    Div (Sub (Mul (diff f, g), Mul (f, diff g)), Mul (g, g))

Then it’s simply delightful. It does tend to fall apart for everything else, however.
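To round out the differentiation example, the same pattern-matching style carries over to other passes over the tree. What follows is only a sketch: eval is a hypothetical helper that does not appear in the original example, and the expr type is repeated so the block stands on its own.

```ocaml
type expr =
  | Const of float
  | Add of expr * expr
  | Sub of expr * expr
  | Mul of expr * expr
  | Div of expr * expr

(* [eval] is a hypothetical helper added for illustration:
   it folds an expression tree down to a single float. *)
let rec eval (e : expr) : float =
  match e with
  | Const c -> c
  | Add (f, g) -> eval f +. eval g
  | Sub (f, g) -> eval f -. eval g
  | Mul (f, g) -> eval f *. eval g
  | Div (f, g) -> eval f /. eval g

(* 1 + 2 * 3 *)
let example : float = eval (Add (Const 1.0, Mul (Const 2.0, Const 3.0)))
```

Each arm reads like the arithmetic rule it implements, which is the whiteboard quality being praised here.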

OCaml, like Haskell, is expression-oriented, meaning that there is no separation between statements (control flow, variable assignment) and expressions (which evaluate to values); instead, everything is an expression. Most expressions in OCaml tend not to have terminating delimiters.

This is very vague, but ML-family (meaning Standard ML, OCaml, Haskell and derivatives) code often feels like the expressions are “hanging in the air”, so to speak. Terminating delimiters (like semicolons in C or end in Wirth-family languages) make the code feel more “solid” in a way.

And expression orientation (which most modern languages advertise as a feature) cuts both ways. The benefit is simplicity and symmetry: you don’t need both an if statement and a ternary if expression. You can have a big expression that computes a value, which is then bound by the enclosing let, like so:

let a: ty =
  match foo with
  | Foo a ->
    (* ... *)
    let bar =
      (* ... *)
      (* imagine deeply nested expressions *)
    in
    (* etc *)

Without having to use an uninitialized variable or refactor your code into too-small functions. However, this generality comes at a cost: you can write arbitrarily deep and complex expressions, where a statement-oriented language would force you to keep your code flatter and break it down into small functions.
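As a concrete, runnable version of the benefit described above, here is a minimal sketch. The function classify and its labels are hypothetical, chosen only for illustration: both if and match yield values, so each result is bound directly by a let, with no ternary operator and no uninitialized variable.

```ocaml
(* [classify] is a hypothetical function, used only for illustration. *)
let classify (n : int) : string =
  (* [if] is an expression: it yields a value, so there is no
     separate ternary operator. *)
  let sign = if n < 0 then "negative" else "non-negative" in
  (* [match] is an expression too; its result is bound by [let]. *)
  let size =
    match abs n with
    | 0 -> "zero"
    | m when m < 10 -> "small"
    | _ -> "large"
  in
  sign ^ ", " ^ size
```

The cost shows up in the same sketch: nothing stops either branch from growing its own nested let and match, which is how functions end up hundreds of lines long.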

It takes discipline to write good code in an expression-oriented language. I often see e.g. Common Lisp code with functions hundreds of lines long. It’s almost impossible to track the flow of data in that context. This, by the way, is why Austral is statement-oriented, despite every modern language moving towards expression-oriented syntax.

Declaration Order

In OCaml, like in C, declarations must appear in dependency order. That is, you can’t write this:

let foo _ =
  bar ()

let bar _ =
  baz ()

let baz _ =
  print_endline "muh one-pass compilation"

Instead you must write:

let baz _ =
  print_endline "muh one-pass compilation"

let bar _ =
  baz ()

let foo _ =
  bar ()

Alternatively, you can use and to chain your declarations:

let rec foo _ =
  bar ()

and bar _ =
  baz ()

and baz _ =
  print_endline "muh one-pass compilation"

And the same thing is true of types:

type foo = Foo of bar

and bar = Bar of baz

and baz = Baz of unit
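To see the two kinds of chain side by side, here is a minimal sketch using a rose tree. The names tree, forest, tree_size and forest_size are hypothetical and not from the original post. The types refer to each other, so they form one and-chain; the functions over them call each other, so they form a second, separate chain.

```ocaml
(* Mutually recursive types: a tree holds a forest, and a forest
   is a list of trees. Hypothetical names, for illustration only. *)
type tree = Node of int * forest

and forest = tree list

(* Mutually recursive functions over those types, declared as a
   separate [let rec ... and] chain after the type chain. *)
let rec tree_size (t : tree) : int =
  match t with
  | Node (_, children) -> 1 + forest_size children

and forest_size (f : forest) : int =
  List.fold_left (fun acc t -> acc + tree_size t) 0 f

(* A tree with four nodes. *)
let example : tree = Node (1, [ Node (2, []); Node (3, [ Node (4, []) ]) ])
```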

But you can’t interleave an and-chain of functions with one of types. So you have a choice:

You can write all of your code backwards, with the utility functions and the leaf-nodes of the call graph up front, and the important code at the bottom.

Or, you can write a big and-chain of types at the start of the file, followed by a big and-chain of functions for the remainder of...
