Comparing Objective Caml and Standard ML
This page compares point-by-point the Objective Caml (OCaml) and Standard ML (SML) programming languages, the two main representatives of the ML programming language family. The comparison includes language design and current tool availability, as well as further social factors. This page isn't meant to be a complete comparison, but rather to highlight differences that might lead to choosing one language over the other.
For many of the points below, there is a clear distinction between the languages as to which is more "practical" in that design decision and which is more mathematically pure, concerned with formal properties of the language, or otherwise research-oriented. In such cases, these icons appear:
...next to the more "practical" language<br>...next to the more "pure" language
Objective Caml
Standard ML
Syntax
See this syntax comparison for more details.
Array/string shorthands
Special syntactic sugar is defined for array and string accesses.<br>These operations receive no special treatment.
let arr = [| 1; 2; 3 |];;<br>let two = arr.(1);;<br>arr.(2) <- 6;;<br>val arr = Array.fromList [1, 2, 3];<br>val two = Array.sub (arr, 1);<br>Array.update (arr, 2, 6);
let str = "Hello";;<br>let e = str.[1];;<br>val str = "Hello";<br>val e = String.sub (str, 1);
Arrays and strings are central data structures of "practical programming," so they should be as usable as we can make them.
More syntactic sugar clutters the language definition. Arrays and strings show up infrequently in traditional functional programming applications, and many new ML programmers accustomed to array-based work could quite profitably switch to datatype-based solutions instead.
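As a rough illustration of what the OCaml shorthands stand for, the sketch below pairs each sugared form with the ordinary library call it corresponds to. The names `arr`, `str`, and `buf` are illustrative only. It assumes a recent OCaml (4.06 or later), where `string` is immutable and in-place updates go through `Bytes`.

```ocaml
(* Array shorthand next to the equivalent Array library calls. *)
let arr = [| 1; 2; 3 |]
let two = arr.(1)              (* same as Array.get arr 1 *)
let two' = Array.get arr 1
let () = arr.(2) <- 6          (* same as Array.set arr 2 6 *)

(* String shorthand next to the equivalent String library call. *)
let str = "Hello"
let e = str.[1]                (* same as String.get str 1 *)
let e' = String.get str 1

(* Assumption: OCaml >= 4.06, where strings are immutable;
   mutable byte sequences use Bytes instead. *)
let buf = Bytes.of_string str
let () = Bytes.set buf 2 'r'
let updated = Bytes.to_string buf
```

The SML column reaches the same operations through `Array.sub`, `Array.update`, and `String.sub` with no dedicated syntax.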
Character literals
Uses 'c'<br>Uses #"c"
OCaml's syntax is shorter and follows the "standard" set by C.
Apostrophes mean "type variable" or "prime" in SML and are parts of identifiers; they shouldn't be confused with character literals. Many symbolically-oriented SML programs don't manipulate individual characters, so we shouldn't complicate the lexer to support related features. (Consider 'a', which could be either a type variable or a character literal, depending on where it appears.)
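A small OCaml sketch of the overlap mentioned above: an apostrophe followed by `a` starts a type variable in a type annotation, a character literal in an expression, and can also end an identifier. The names `identity` and `letter_a` are made up for illustration.

```ocaml
(* In a type annotation, 'a is a type variable. *)
let identity (x : 'a) : 'a = x

(* In an expression, 'a' is a character literal. *)
let letter_a : char = 'a'

(* A trailing apostrophe is also legal inside an identifier. *)
let letter_a' = Char.uppercase_ascii letter_a
```

The OCaml lexer has to tell these three uses apart from context, which is the complication the SML `#"c"` syntax avoids.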
Identifier conventions
Module, module type, and constructor names must start with a capital letter; other identifiers must not<br>No capitalization convention
type myType =<br>A<br>| B;;
let f = function<br>A' -> 0 (* A' is signaled as an unbound constructor. *)<br>| B -> 1;;<br>datatype myType =<br>a<br>| b;
val f = fn<br>a' => 0 (* a' is processed as a variable. *)<br>| b => 1; (* This case is signaled as redundant. *)
This convention stops a very nasty class of pattern matching bugs involving confusion between variables and variant constructors. It also eases the tasks of text editor syntax highlighters, making it easy to distinguish between module and variable names by color, for example.
More flexibility can't hurt if you're careful, right? In actuality, most SML programmers with opinions would prefer the OCaml convention.
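To make the OCaml side of this convention concrete, here is a minimal sketch (the names `my_type`, `f`, and `g` are illustrative). A capitalized name in a pattern can only be a constructor, while a lowercase name is always a variable that matches anything, so the two cannot be confused by a typo.

```ocaml
type my_type = A | B

(* Capitalized names in patterns are constructors, so each case is
   checked against the type definition. Misspelling A as A' here
   would be rejected as an unbound constructor. *)
let f = function
  | A -> 0
  | B -> 1

(* A lowercase name in a pattern is a fresh variable that matches
   any remaining value; here that is intentional. *)
let g = function
  | A -> 0
  | other -> if other = B then 1 else 2
```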
Let-binding syntax
Separate top-level let and let expressions<br>Syntactic class of declarations and let..in..end construct for binding them
let six = 6<br>let rec fact x = if x = 0 then 1 else x * fact (x - 1)
let six_fact =<br>let six = 6 in<br>let rec fact x = if x = 0 then 1 else x * fact (x - 1) in<br>fact 6<br>val six = 6<br>fun fact x = if x = 0 then 1 else x * fact (x - 1)
val six_fact =<br>let<br>val six = 6<br>fun fact x = if x = 0 then 1 else x * fact (x - 1)<br>in<br>fact 6<br>end
In practice, this approach leads to some very confusing error messages, since the compiler is less able to predict what grouping you really intended.
Having a unified mechanism for top-level and local bindings leads to less duplication of functionality, and let..in..end seems empirically to lead to clearer error messages.
Overloaded "minus"
The standard - symbol is used for both negation and subtraction.<br>Tilde (~) is used for negation.
1 - 2;;<br>1 + -2;;<br>1 - 2;<br>1 + ~2;
Abandoning this long-standing convention would confuse many programmers.
Differentiating subtraction and negation upholds the SML position that operators are just identifiers like any others that happen to be used via special syntax. Modulo the overloading of binary arithmetic operators, SML avoids situations where an identifier means different things in different contexts.
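On the OCaml side, one visible consequence of sharing `-` between negation and subtraction is that a negative argument needs parentheses, and negation used as a function value has its own name, `~-`. A minimal sketch (the names `succ_of_neg`, `negated`, and `diff` are illustrative):

```ocaml
(* Parentheses are needed: `succ -1` would parse as `succ - 1`,
   a subtraction, and be rejected by the type checker. *)
let succ_of_neg = succ (-1)

(* As a first-class function, integer negation is written ( ~- ). *)
let negated = List.map ( ~- ) [1; -2; 3]

(* Binary minus stays an ordinary infix operator. *)
let diff = ( - ) 1 2
```

In SML the corresponding forms would use `~` for negation throughout, so no parenthesization rule of this kind is needed.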
Semicolon precedence
Semicolon binds more tightly than match bodies and anonymous functions<br>Semicolon binds less tightly than case bodies and anonymous functions
match x with<br>0 -> print_endline "It's zero!";<br>true<br>| _ -> print_endline "It's not!";<br>false;;
fun s -> print_string s; s;;
begin match x with<br>0 -> print_endline "It's zero!"<br>| _ -> print_endline "It's not!"<br>end;<br>print_endline "The End";;
case x of<br>0 => (print "It's zero!\n";<br>true)<br>| _ => (print "It's not!\n";<br>false);
fn s => (print s; s);
case x of<br>0 => print "It's zero!\n"<br>| _ => print "It's not!\n";<br>print "The End\n";
The OCaml precedence rules favor imperative code, expecting semicolons...