Two Years of OCaml
The other day I saw this post on OCaml discussed on Hacker News<br>and Lobsters.
Almost two years ago I rewrote the Austral compiler from Standard<br>ML to OCaml, so I thought I’d share my thoughts on OCaml after<br>using it to write a complex software project: what is good, what<br>is bad, and how it compares, mainly to Haskell.
If this seems overwhelmingly negative, it’s because the things OCaml does right<br>are really just uncontroversial. They’re obviously right and hardly worth<br>pointing out. It’s actually a weirdly optimistic thing: that a language with so<br>many glaring deficiencies stands far above everything else.
Contents
Syntax
Aesthetics
Declaration Order
Comments
Type Specifiers
Generic Types
Type Annotations
Semicolons Work Sometimes
Inconsistencies
Nested Match Expressions
Do Notation
Modules: Better is Worse
Modules Are Better
Modules Are Worse
Equality
Multiple Implementations Are Unnecessary
Semantics
Currying is Bad
Type Inference is Bad
Mutation
Pragmatics
PPX
Tooling
How Do I Profile?
Testing
Minor Complaints
At Least It’s Not Haskell
My OCaml Style
Should You Use OCaml?
Syntax
Yeah, yeah, de gustibus, and people spend way too much time whining about<br>syntax and other superficial issues, rather than focusing on language<br>semantics and pragmatics.
But I’m not a partisan about syntax. I genuinely think code written in C, Java,<br>Lisp, Pascal, and ML can be beautiful in different ways. Some of these<br>complaints will be personal, others will be more objective.
Aesthetics
ML was born as the implementation language of a theorem prover, so<br>naturally the syntax is meant to look like whiteboard math.
And it does look good for math. If you’re writing something like a symbolic<br>differentiation engine:
type expr =<br>  | Const of float<br>  | Add of expr * expr<br>  | Sub of expr * expr<br>  | Mul of expr * expr<br>  | Div of expr * expr
let rec diff (e: expr): expr =<br>  match e with<br>  (* c' = 0 *)<br>  | Const _ -><br>    Const 0.0<br>  (* (f + g)' = f' + g' *)<br>  | Add (f, g) -><br>    Add (diff f, diff g)<br>  (* (f - g)' = f' - g' *)<br>  | Sub (f, g) -><br>    Sub (diff f, diff g)<br>  (* (fg)' = f'g + fg' *)<br>  | Mul (f, g) -><br>    Add (Mul (diff f, g), Mul (f, diff g))<br>  (* (f/g)' = (f'g - g'f)/gg *)<br>  | Div (f, g) -><br>    Div (Sub (Mul (diff f, g), Mul (f, diff g)), Mul (g, g))
Then it’s simply delightful. It does tend to fall apart for everything else,<br>however.
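To see the differentiation engine in action, here is a minimal sketch that applies diff to a small product. The eval function is a hypothetical helper added for this example, and expr and diff are repeated only so the snippet runs on its own.

```ocaml
(* expr and diff are repeated from above so the snippet is self-contained. *)
type expr =
  | Const of float
  | Add of expr * expr
  | Sub of expr * expr
  | Mul of expr * expr
  | Div of expr * expr

let rec diff (e: expr): expr =
  match e with
  | Const _ -> Const 0.0
  | Add (f, g) -> Add (diff f, diff g)
  | Sub (f, g) -> Sub (diff f, diff g)
  | Mul (f, g) -> Add (Mul (diff f, g), Mul (f, diff g))
  | Div (f, g) ->
    Div (Sub (Mul (diff f, g), Mul (f, diff g)), Mul (g, g))

(* Hypothetical evaluator, an assumption of this sketch: it folds an
   expression down to a float. *)
let rec eval (e: expr): float =
  match e with
  | Const c -> c
  | Add (f, g) -> eval f +. eval g
  | Sub (f, g) -> eval f -. eval g
  | Mul (f, g) -> eval f *. eval g
  | Div (f, g) -> eval f /. eval g

(* Product rule applied to 2 * 3: the result is (0 * 3) + (2 * 0). *)
let d : expr = diff (Mul (Const 2.0, Const 3.0))

(* Evaluating the derivative gives 0, since both factors are constants. *)
let value : float = eval d
```

Since expr as written has no constructor for variables, every expression is a constant and its derivative evaluates to zero. The point of the example is only the shape of the tree that diff builds.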
OCaml, like Haskell, is expression-oriented, meaning that there is no<br>separation between statements (control flow, variable assignment) and expressions<br>(which evaluate to values): instead, everything is an expression. Most expressions<br>in OCaml tend not to have terminating delimiters.
This is very vague, but ML-family (meaning Standard ML, OCaml, Haskell and<br>derivatives) code often feels like the expressions are “hanging in the air”, so<br>to speak. Terminating delimiters (like semicolons in C or end in<br>Wirth-family languages) make the code feel more “solid” in a way.
And expression orientation (which most modern languages advertise as a feature)<br>cuts both ways. The benefit is simplicity and symmetry: you don’t need both an<br>if statement and a ternary if expression. You can have a big expression that<br>computes a value and then assigns it to a containing let, like so:
let a: ty =<br>  match foo with<br>  | Foo a -><br>    (* ... *)<br>    let bar =<br>      (* ... *)<br>      (* imagine deeply nested expressions *)<br>    in<br>    (* etc *)
Without having to use an uninitialized variable or refactor your code into<br>too-small functions. However, this generality comes at a cost: you can write<br>arbitrarily deep and complex expressions, where a statement-oriented language<br>would force you to keep your code flatter and break it down into small<br>functions.
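A minimal sketch of both sides of that tradeoff follows. The shape type and the functions describe_size, summary, area_of and summary_flat are hypothetical names made up for this example.

```ocaml
(* Hypothetical type, used only for illustration. *)
type shape =
  | Circle of float
  | Rect of float * float

(* `if` is an expression, so there is no need for a separate ternary. *)
let describe_size (area: float): string =
  if area > 100.0 then "large" else "small"

(* One big expression: the match computes a value and the enclosing
   let binds it, with no uninitialized variable in sight. *)
let summary (s: shape): string =
  let area: float =
    match s with
    | Circle r ->
      let pi = 4.0 *. atan 1.0 in
      pi *. r *. r
    | Rect (w, h) ->
      w *. h
  in
  describe_size area

(* The flatter alternative: the nested match is pulled out into its own
   small function, which is what a statement-oriented language would
   tend to push you towards. *)
let area_of (s: shape): float =
  match s with
  | Circle r -> (4.0 *. atan 1.0) *. r *. r
  | Rect (w, h) -> w *. h

let summary_flat (s: shape): string =
  describe_size (area_of s)
```

Both versions compute the same thing. The first keeps everything in one place, and nothing stops it from growing arbitrarily deep. The second costs an extra name but keeps each function shallow.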
It takes discipline to write good code in an expression-oriented language. I<br>often see, e.g., Common Lisp code with functions hundreds of lines<br>long. It’s almost impossible to track the flow of data in that context. This, by<br>the way, is why Austral is statement-oriented, despite most modern<br>languages moving towards expression-oriented syntax.
Declaration Order
In OCaml, like in C, declarations must appear in dependency order. That is, you<br>can’t write this:
let foo _ =<br>  bar ()
let bar _ =<br>  baz ()
let baz _ =<br>  print_endline "muh one-pass compilation"
Instead you must write:
let baz _ =<br>  print_endline "muh one-pass compilation"
let bar _ =<br>  baz ()
let foo _ =<br>  bar ()
Alternatively, you can use let rec and the and keyword to chain your declarations:
let rec foo _ =<br>  bar ()
and bar _ =<br>  baz ()
and baz _ =<br>  print_endline "muh one-pass compilation"
And the same thing is true of types:
type foo = Foo of bar
and bar = Bar of baz
and baz = Baz of unit
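As a slightly less contrived sketch of the same mechanism, here is a pair of mutually recursive types followed by a pair of mutually recursive functions over them. The names tree, forest, sum_tree, sum_forest and example are hypothetical, made up for this illustration.

```ocaml
(* Two types that refer to each other, chained with `and`. *)
type tree = Node of int * forest
and forest = tree list

(* Two functions that call each other, chained with `let rec ... and`. *)
let rec sum_tree (Node (v, children): tree): int =
  v + sum_forest children

and sum_forest (f: forest): int =
  match f with
  | [] -> 0
  | t :: rest -> sum_tree t + sum_forest rest

(* A small tree: 1 at the root, with children 2 and 3, and 4 under 3. *)
let example: tree =
  Node (1, [Node (2, []); Node (3, [Node (4, [])])])

let total: int = sum_tree example
```

Note that the types form one chain and the functions form a second, separate one.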
But you can’t interleave an and-chain of functions with one of types. So<br>you have a choice:
You can write all of your code backwards, with the utility functions and the<br>leaf-nodes of the call graph up front, and the important code at the bottom.
Or, you can write a big and-chain of types at the start of the file,<br>followed by a big and-chain of functions for the remainder of...