Rust from a C++ and OCaml programmer's perspective (Part 1)
This summer, I decided to have a look at Rust, the new programming language that everyone is talking about. Rust was originally launched by Mozilla for their new Servo browser engine, and aims to provide safety guarantees without sacrificing performance. Rust combines functional and imperative programming styles, so I wanted to compare it with my programming experience, notably in OCaml and C++.
To get started, I followed the Rust by Example tutorial, and practiced on the Rust playground. Based on this first experience, I want to share some thoughts about the Rust language, mainly compared to OCaml and C++, but also sometimes Python or Java.
In this first post, I’ll focus on the syntactic aspects, including control flow, error handling and closures (a.k.a. lambda functions). This post is illustrated with many code snippets!
Syntax
The first thing that I noticed about Rust is the simplicity and elegance of its syntax. It is a good mix of functional and imperative styles, providing advantages of the former (pattern matching, immutability by default) and the latter (procedural flow easy to follow and close to machine operations).
General syntax
The general syntax is somewhere in the middle between OCaml and C++.
Control flow shares the spirit of functional languages like OCaml: blocks (loops, conditionals) return a value, there is pattern matching, etc.
However, OCaml has many boilerplate keywords (such as if ... then ...), whereas Rust simply uses semicolons and curly brackets (such as if ... { ... }).
It may look like mere syntactic sugar, but I find it clearer (think of nested if blocks) and more concise (hence easier to read) than OCaml.
The syntax is also closer to widespread imperative languages such as C++, so it is easier for more people to adopt.
Here are a few examples of OCaml syntax, followed by the equivalent Rust syntax.
(* OCaml syntax *)
let x = a in b
if cond then a else b
while cond do a done
for x = a to b do c done
begin
match x with
| a -> u
| b -> v
end
In Rust, you don’t need to remember keywords such as then, do, done, to, with, begin, end, etc.
Instead, statements are separated by semicolons and blocks are enclosed by curly braces.
Rust comments also follow the more conventional syntax of C++ and similar languages.
// Rust syntax
let x = a; b
if cond { a } else { b }
while cond { a }
for x in a..=b { c } // inclusive range, same as a .. (b+1)
match x {
    a => u,
    b => v,
}
In both cases, line breaks and indentation are up to the programmer’s style.
As in OCaml and C++, semicolons separate statements.
Under the hood, almost everything in Rust is an expression that yields a value, and a trailing semicolon discards that value.
Even the unit type (the same as in OCaml and similar to void in C++) has a value ().
Subtle difference: a Rust semicolon will discard a value of any type, whereas in OCaml the expression before a semicolon is expected to be of type unit.
This is why OCaml has an ignore function to transform anything into unit.
Edit: as discussed on Twitter, Rust types can be marked as #[must_use] so that the compiler warns when such a value is swallowed by a semicolon.
This is for example the case of the Result type, to prevent the caller of a function from ignoring an error (see the next section in this post).
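The semicolon rules above can be sketched as follows. This is only an illustration: the names Ticket, issue_ticket and double are made up for the example and do not come from any library.

```rust
// `Ticket`, `issue_ticket` and `double` are hypothetical names for illustration.
#[must_use]
struct Ticket(u32);

fn issue_ticket(id: u32) -> Ticket {
    Ticket(id)
}

fn double(n: i32) -> i32 {
    // No trailing semicolon: this expression is the function's return value.
    2 * n
}

fn main() {
    // The semicolon silently discards the i32 returned by `double`.
    double(21);

    // Discarding a #[must_use] value with a bare semicolon triggers the
    // `unused_must_use` lint (a warning by default), so this stays commented:
    // issue_ticket(1);

    // Binding to `_` discards a value explicitly, comparable in spirit
    // to OCaml's `ignore`.
    let _ = issue_ticket(1);

    let t = issue_ticket(2);
    println!("ticket {} and {}", t.0, double(21));
}
```

Uncommenting the bare issue_ticket(1); line should make the compiler report the unused value, while the program still builds.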
The syntax for type declaration is the same as in OCaml: let a : Type = value where the type annotation Type is after the variable name a.
Type annotations are often optional, but contrary to OCaml, Rust only infers types locally, within a function body: you need annotations on function arguments and return types, or wherever the context does not determine a type.
This behavior is similar to auto in C++11.
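Here is a small sketch of where annotations are required and where they can be omitted. The helper names (square, total_len, parse_small) are invented for this example.

```rust
// Function signatures must be fully annotated: nothing is inferred here.
fn square(x: i64) -> i64 {
    x * x
}

// Hypothetical helper: total length in bytes of a list of words.
fn total_len(words: &[&str]) -> usize {
    // `sum` is generic over its result type, so the annotation on `total`
    // is what tells the compiler which type to produce.
    let total: usize = words.iter().map(|w| w.len()).sum();
    total
}

// Hypothetical helper: parse a small number, falling back to 0 on error.
fn parse_small(text: &str) -> u8 {
    // `parse` is generic too; the annotation on `n` selects u8.
    let n: u8 = text.parse().unwrap_or(0);
    n
}

fn main() {
    // No annotation: `a` is inferred as i64 because it is passed to `square`.
    let a = 12;
    println!("{}", square(a));
    println!("{}", total_len(&["a", "bc"]));
    println!("{}", parse_small("200"));
}
```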
Loops
Besides while loops and range-based for loops, Rust supports infinite loops with the loop keyword, and break/continue (contrary to OCaml).
You can even break out of several loops at once, thanks to labeled loops.
And break can return a value.
By the way, Rust checks for (trivially) unreachable code!
#![allow(unreachable_code)]
fn f() {
    'outer: loop {
        'inner: loop {
            break 'outer
        }
        println!("Unreachable!");
    };
    let result = loop {
        break 123
    };
    println!("The loop returned {}", result)
}
This is different from OCaml, where we need to raise an exception to break out of a loop.
String formatting
The syntax for string formatting is inspired by Python.
# Python
a = 123
print("a = {}".format(a))
// Rust
let a = 123;
println!("a = {}", a);
However, Rust checks the format string at compile time, so the next examples won’t compile. Under the hood, this "magic" compile-time checking happens thanks to Rust’s extensive macro system, which seems quite powerful!
fn main() {
    println!("a = {}");
}
// error: invalid reference to argument `0` (no arguments given)
// --> src/main.rs:2:5
// |
// 2 | println!("a = {}");
// | ^^^^^^^^^^^^^^^^^^^
// |
fn main() {
    let s = "a = {}";
    let a = 1;
    println!(s, a);
}
// error: expected a literal
// --> src/main.rs:4:14
// |
// 4 | println!(s, a);
// |
Pattern matching shortcut
Rust also supports a useful destructuring syntax for pattern matching: if let and while let.
Instead of writing a pattern matching switch with only one interesting case and a default case, you can just destructure your value with if let.
// Classic pattern matching
match x {
    Some(y) => /* foo */,
    None => (),
}
// Equivalent simple version
if let Some(y) = x {
    /* foo */
}
This pattern is indeed quite common in real-world projects. For example, consider the following excerpt that I wrote in the Caradoc project.
begin
  match length with
  | Some n ->
    if (Array.length a) <> n then
      raise (Errors.PDFError (error_msg, ctxt))
  | None -> ()
end;
Here, length is an optional value that either sets a constraint (Some n) on the length of array a, or no constraint (None).
With the if let syntax, this excerpt could roughly be rewritten as follows.
if let Some(n) = length {
    if a.len() != n {
        // Handle the error
    }
}
The if/while let syntax is somewhat similar to the new selection statements with initializer in C++17.
Let’s hope that OCaml will support a similar feature as well one day!
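The while let form is not illustrated above, so here is a minimal sketch of it. The function drain_sum is a made-up helper; Vec::pop is the standard method that returns an Option.

```rust
// Hypothetical helper: pops elements off a stack until it is empty
// and returns their sum.
fn drain_sum(mut stack: Vec<i32>) -> i32 {
    let mut total = 0;
    // `pop` returns Option<i32>; the loop keeps going as long as the
    // pattern `Some(top)` matches and stops at the first `None`.
    while let Some(top) = stack.pop() {
        total += top;
    }
    total
}

fn main() {
    println!("{}", drain_sum(vec![1, 2, 3]));
}
```

Written with a plain loop, the same code would need a match with an explicit None => break arm.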
Error handling
Error handling in Rust is somewhat special, because contrary to widespread languages such as C++, OCaml, Python or Java, Rust has no exceptions! Instead, there are two ways to handle errors.
Unrecoverable errors
These errors are signaled with the panic! macro, which essentially terminates the program.
Panic still unwinds the stack and cleans up everything by default (i.e. calls destructors).
You can also explicitly abort the program on panic (more on this in the Rust book).
fn main() {
    panic!("Let's crash the program!");
}
// thread 'main' panicked at 'Let's crash the program!', src/main.rs:2:4
// note: Run with `RUST_BACKTRACE=1` for a backtrace.
Recoverable errors
Contrary to common programming languages, recoverable errors are not signaled by exceptions that magically fly to the next catch block, (generally) outside of the current function. Instead, errors are propagated through the return values of functions that can fail.
For this purpose, Rust defines a generic type Result<T, E>, where T is a success type, and E is an error type.
In other words, functions must indicate the possible errors in their return type; this is somewhat similar to mandatory annotations of exceptions in Java functions.
use std::result::Result;
use std::num::ParseIntError;
fn parse_and_double(number_str: &str) -> Result<i32, ParseIntError> {
    let parsed = number_str.parse::<i32>();
    match parsed {
        Err(foo) => Err(foo),
        Ok(n) => Ok(2*n),
    }
}
Although passing errors by result means that you need to handle possible errors at every step of your program, Rust defines useful functions to help us.
For example, Result has an unwrap() method to extract a success value or panic on error, map() and and_then() methods to apply a closure to a success value while keeping an error unchanged, etc.
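As a sketch of these helper methods, the match-based parse_and_double above can be condensed with map(), and and_then() can chain a second step that may itself fail. The function parse_and_add is a hypothetical addition for this example.

```rust
use std::num::ParseIntError;

// Same behaviour as the `match` version: `map` transforms the success
// value and leaves an error untouched.
fn parse_and_double(number_str: &str) -> Result<i32, ParseIntError> {
    number_str.parse::<i32>().map(|n| 2 * n)
}

// Hypothetical helper: `and_then` runs a second fallible step only if
// the first one succeeded.
fn parse_and_add(a: &str, b: &str) -> Result<i32, ParseIntError> {
    a.parse::<i32>()
        .and_then(|x| b.parse::<i32>().map(|y| x + y))
}

fn main() {
    println!("{:?}", parse_and_double("21"));
    println!("{}", parse_and_double("abc").is_err());
    // `unwrap` extracts the success value, and would panic on an error.
    let sum = parse_and_add("1", "2").unwrap();
    println!("{}", sum);
}
```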
The try! macro extracts the success value of a result, but instead of panicking on error, it returns this error for the current function.
In the above example, parsing a string into a number can fail (if the string does not contain a number), but we want to further process a successfully parsed number.
This can be rewritten as follows.
fn parse_and_double(number_str: &str) -> Result<i32, ParseIntError> {
    let n: i32 = try!(number_str.parse::<i32>());
    Ok(2*n)
}
In case of parsing error, the try! macro directly forwards it as the function result; in case of success, n contains an integer of type i32 and not a Result.
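Note for readers using a recent toolchain: the ? operator now plays the role of try!, and since the 2018 edition try is a reserved word, so the macro can only be spelled r#try! there. A sketch of the same function with ?, assuming nothing beyond the standard library:

```rust
use std::num::ParseIntError;

fn parse_and_double(number_str: &str) -> Result<i32, ParseIntError> {
    // `?` returns early with the error, or unwraps the success value.
    let n: i32 = number_str.parse::<i32>()?;
    Ok(2 * n)
}

fn main() {
    println!("{:?}", parse_and_double("21"));
    println!("{}", parse_and_double("not a number").is_err());
}
```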
From my experience in parsing user inputs, notably with Caradoc (a tool to parse and validate PDF files) and Tyrex (a hex editor with parsers for some file formats), error handling is a complex problem, especially for badly designed file formats. I will need more practice to see if Rust’s error paradigm is practical and powerful, but it looks promising!
Closures
Closures a.k.a. lambda functions are everywhere in functional languages like OCaml.
They are also in C++ since the 2011 standard.
Rust supports them, with a simple syntax: arguments are enclosed by vertical bars | |, and the body by curly braces { } (the braces are optional when the body is a single expression).
For example, here are closures that double a number, in these 3 languages.
fun x -> 2*x              (* OCaml *)
[](int x) {return 2*x;}   // C++
|x| 2*x                   // Rust
Rust syntax is quite efficient!
But contrary to C++, there is no explicit per-variable control over which variables are captured and how.
Rust simply tries to capture variables by reference, mutable reference or by value (in this order).
There is also a move keyword to capture all variables by value, i.e. explicitly taking their ownership (I will discuss references and ownership in a follow-up blog post).
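The capture modes can be sketched as follows; the three helper functions are invented for illustration, and the comments describe what I expect the compiler to choose in each case.

```rust
// Hypothetical helper: the closure only reads `owned`, so it captures it
// by shared reference and `owned` stays usable alongside the closure.
fn describe_len(text: &str) -> String {
    let owned = String::from(text);
    let len = || owned.len();
    format!("{} has {} bytes", owned, len())
}

// Hypothetical helper: the closure modifies `count`, so it captures it
// by mutable reference (and must itself be declared `mut`).
fn count_calls(times: u32) -> u32 {
    let mut count = 0;
    let mut bump = || count += 1;
    for _ in 0..times {
        bump();
    }
    count
}

// Hypothetical helper: `move` forces capture by value, so the returned
// closure owns `name` and can outlive this function call. Without `move`,
// the closure would borrow a local variable and the code would be rejected.
fn make_greeter(name: String) -> impl Fn() -> String {
    move || format!("Hello, {}!", name)
}

fn main() {
    println!("{}", describe_len("abc"));
    println!("{}", count_calls(3));
    let greet = make_greeter(String::from("Rust"));
    println!("{}", greet());
}
```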
Digression: function call syntax
For function calls, Rust adopts the imperative syntax, similar to C++.
The caller simply encloses arguments between parentheses, e.g. f(a, b).
This is different from common functional languages like OCaml, in which arguments are simply put after the function, e.g. f a b.
In fact, OCaml functions only have one argument.
Multi-argument functions are implemented with currying: the function maps the first argument to another function, which itself maps the second argument to the result.
For example, an addition function of type int -> int -> int is in fact of type int -> (int -> int), i.e. maps an int to a function of type int -> int.
Here is an example of currying in OCaml.
(* OCaml *)
let add x y = x+y in
let f = add 3 in
f 2 (* 3+2 *)
In C++ and Rust, functions naturally take multiple arguments.
This also means that we cannot simply bind an argument (e.g. the number 3) to a function (add) to obtain a new function (f); we need to define a new closure/lambda (or use the std::bind function in C++11).
// C++
auto add = [](int x, int y) {return x+y;};
auto f = [add](int y) {return add(3, y);}; // We need to capture "add"
f(2); // 3+2
// Rust
let add = |x, y| x+y;
let f = |y| add(3, y);
f(2); // 3+2
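If you really want the curried style in Rust, one possible sketch is a function that takes the first argument and returns a closure waiting for the second one. The name curried_add is made up for this example.

```rust
// Hypothetical helper mimicking OCaml's `add 3`: the first argument is
// moved into the returned closure, which expects the second argument.
fn curried_add(x: i32) -> impl Fn(i32) -> i32 {
    move |y| x + y
}

fn main() {
    let f = curried_add(3);
    println!("{}", f(2)); // 3+2
    // Both arguments can also be supplied in a row.
    println!("{}", curried_add(1)(2)); // 1+2
}
```

The price is a heavier signature than in OCaml, since the returned closure type has to be spelled out with impl Fn.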
The syntax of currying does not clearly distinguish functions from values. In fact, in pure functional languages there is no difference between values and functions. For example, the lambda calculus only has functions, but clever encodings make it possible to implement "values" (booleans, numbers, tuples, lists) with functions! For more information, I recommend the material of the Foundations of Software course that I followed at EPFL, taught by Martin Odersky, the designer of Scala.
What’s next?
Even though I haven’t written a useful program in Rust yet, I am already excited about this language! It looks really promising and carefully designed, with native performance, good static checks by the compiler and an active community. Besides an in-progress web browser, there are a package repository, several conferences, active development of the language… I will soon finish a follow-up post about other aspects of the language, and you will certainly hear about my future programming projects in Rust!