April 26, 2013

Rust: What I learnt so far

Posted in Software at 22:58 by graham

This applies to 0.7pre; many things have changed in 0.8. In particular, core was renamed to std, and std was renamed to extra.

Rust is an open-source programming language being developed mostly by Mozilla. It is aimed at the type of applications currently written in C++ (such as Firefox). Details at the Rust Wikipedia page.

I’ve been learning bits of it the past few days, and whilst Rust is still rough around the edges there’s a lot to enjoy. Rust is only at v0.7pre and changing daily, so you may have to adjust some of the code here.

Rust is a big language, and unless you come from C++ it will probably make your head hurt. In a good way :-)

The two most helpful introductions I have found so far are:

I’d encourage you to run through both of those, starting with Rust for Rubyists. When you get stuck reading one of them (and you will), switch back here.

Install

At time of writing Rust is v0.7pre:

git clone git://github.com/mozilla/rust.git
cd rust
git checkout incoming
./configure
make
sudo make install

The make step will take a while and heat up your machine nicely.

Hello world
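
A minimal hello world, sketched here in current stable Rust rather than 0.7pre (an assumption on my part: in 0.7pre the print call was a plain function, today it is the println! macro). The greeting helper is only there to have something with a return value; it is not required.

```rust
// hello_world.rs, written for a current Rust compiler (not 0.7pre).

// Hypothetical helper: returns the text instead of printing it.
fn greeting() -> String {
    String::from("Hello, world!")
}

// `main` is the entry point of the executable.
fn main() {
    println!("{}", greeting());
}
```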

So far so normal. We have curly braces, semi-colons at end-of-line, fn to declare a function, and main as the entry point of executables. Let’s run it.

Compile: rustc hello_world.rs. This makes a regular binary, hello_world.

Run: ./hello_world.

Instead of the compile / run cycle you can: rust run hello_world.rs.

You can even make your rust file a shell script by adding as the first line:

#!/usr/local/bin/rust run

(don’t forget to chmod +x hello_world.rs).

Ask your name

You declare a variable with let, and optionally give a type. The compiler will try and infer the type, and complain if it can’t. Here’s some variables:

let x;
x = 10;   // Compiler will infer  x: int
let pi: float;
let name: ~str;

The built-in types are what you would expect, plus some: List of Rust built-in types

The ~ is (briefly) explained in the next section. For now ~str is just how you declare a string on the heap. You use it for string literals too, as in let name = ~"Bob".

All variables are immutable by default: you can’t change their value once set.

let x;
x = 10;
x = 42;  // Compile error

You allow yourself to change a variable by prefixing its name with mut:

let mut x;
x = 10;
x = 32;  // All good

Each .rs file is a module. The line io::stdin().read_line() is using an io.rs file from the std library. Modules are grouped into crates (libraries) and on unix compile to a standard .so file. The std library, which contains the io module, is imported by default, which is why you don’t see an import statement here.

Modules are explained in the tutorial under crates and the module system.

The final new part in this section is fmt!. The ! means it’s a macro, i.e. it’s expanded by the compiler. fmt is a very useful macro, because it does the printf style formatting with all the %s, %d, etc that you would expect.

The most useful part is %? which formats anything. You will probably use println(fmt!("%?", thing)) quite a lot. Unlike C’s printf, fmt is type checked at compile time.
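
Here is a rough sketch of the "ask your name" idea for a current Rust compiler. It is not the 0.7pre code: today format! and println! replace fmt!, {} stands in for %s, and {:?} is the closest thing to %?. The function names greet and ask_name are mine, made up for illustration.

```rust
use std::io::{self, BufRead};

// Builds the greeting; `{}` is roughly what `%s` was in fmt!.
fn greet(name: &str) -> String {
    format!("Hello, {}!", name.trim())
}

// Reads one line from any buffered reader (stdin in main below).
fn ask_name<R: BufRead>(mut input: R) -> io::Result<String> {
    let mut name = String::new(); // `mut` because read_line appends to it
    input.read_line(&mut name)?;
    Ok(name)
}

fn main() -> io::Result<()> {
    println!("What is your name?");
    let name = ask_name(io::stdin().lock())?;
    println!("{}", greet(&name));
    // `{:?}` debug-prints a value, much like `%?` did.
    println!("{:?}", name);
    Ok(())
}
```

The format string is still checked at compile time: passing too few arguments to format! is a compile error, not a runtime surprise.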

Memory management

The hardest part of Rust for me to understand is the memory management, and the three pointer types which declare it. Rust wants you to tell it how the memory for each pointer should be managed and checked.

~ and @ both mean you have a pointer and are using heap memory. Without one of those you have a normal local variable (stack memory).

@ means several places can point at that memory, and you want Rust to track who points there and garbage collect the memory. It’s like pointers in Go, and what happens internally in Java and Python.

~ means one place owns that memory. For garbage collection Rust only needs to track who the current owner is, and whether that owner is still in scope. Other pointers can refer to this memory, by “borrowing” it (using the third type of pointer, &), but only as long as the owner is in scope.

The last line in that example println(name) won’t compile because by then you have given away the unique access to that piece of memory, to other. Change all the ~ to @ and it works.
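
The same move can be sketched in current Rust, with the caveat that the sigils are gone: I am assuming Box as the stand-in for ~ and Rc (reference counted, not garbage collected) as the nearest standard stand-in for @. The consume function is hypothetical.

```rust
use std::rc::Rc;

// Takes ownership of the boxed string, like receiving a `~str`.
fn consume(other: Box<String>) -> usize {
    other.len()
}

fn main() {
    // Box: exactly one owner of the heap memory.
    let name = Box::new(String::from("Bob"));
    let other = name; // ownership moves from `name` to `other`
    // println!("{}", name); // would not compile: `name` was moved
    println!("{}", consume(other));

    // Rc: several handles may point at the same memory.
    let shared = Rc::new(String::from("Bob"));
    let also_shared = Rc::clone(&shared);
    println!("{} {}", shared, also_shared); // both are still usable
}
```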

This is the most exciting part of Rust for me, because I hope that once I get it, it will improve my programming in other languages too.

Because @ is the type of pointer you’re probably familiar with, right now I bet you’re thinking “I’ll just use @ everywhere”. But you can’t, because the standard library will return unique pointers (~), and you can’t just put those into managed pointers (@).

You can however put either owned or managed pointers into the third type, the borrowed pointer, written &. Don’t worry, the compiler will tell you when to use one of those :-) In general the compiler is very good at telling you when you have the wrong type of pointer, so I just do what the compiler tells me and move on.
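
A small sketch of borrowing in current Rust, again assuming Box for ~ and Rc for @ (neither sigil exists any more). The shout function is made up; the point is that one borrowed parameter accepts both kinds of pointer.

```rust
use std::rc::Rc;

// Takes a borrowed reference and does not care who owns the string.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    let owned: Box<String> = Box::new(String::from("owned"));
    let shared: Rc<String> = Rc::new(String::from("shared"));

    // Both `&owned` and `&shared` coerce to `&str`.
    println!("{}", shout(&owned));
    println!("{}", shout(&shared));

    // The originals are still usable: they were only borrowed.
    println!("{} {}", owned, shared);
}
```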

There’s lots of other good stuff in Rust, so don’t get too caught up in the memory management, at least at first. Onwards!

Looping

The loop syntax will look familiar to Ruby programmers. The two important features here are traits and closures.

Firstly, integers don’t have a method called times and vectors ([1,2,3] is a vector in Rust, meaning an array) don’t have a method called each. Those are added to the type by a trait, which I think is somewhere between an interface and a mixin. That can make it tricky to find which methods you can call on a given type.
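
To make the trait idea concrete, here is a sketch in current Rust that adds a times method to u32. Current Rust does not ship such a method, so the Times trait and count_calls below are my own hypothetical names, not the 0.7pre library code.

```rust
// A hypothetical trait that bolts a `times` method onto a type.
trait Times {
    fn times<F: FnMut()>(&self, f: F);
}

// Implementing the trait for u32 gives every u32 the new method.
impl Times for u32 {
    fn times<F: FnMut()>(&self, mut f: F) {
        for _ in 0..*self {
            f();
        }
    }
}

// Counts how often the closure is run, to show the method works.
fn count_calls(n: u32) -> u32 {
    let mut calls = 0;
    n.times(|| calls += 1);
    calls
}

fn main() {
    3u32.times(|| println!("hello"));
    println!("{}", count_calls(5));
}
```

The method is only available where the trait is in scope, which is part of why it can be hard to find out what you can call on a given type.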

Second, loops 1 and 2, and loops 3 and 4 are the same. for is just a nicer way of writing the loop beneath it.

Look at the documentation for times to see what I mean. See how it’s a method which takes a function? The first parameter will be familiar to Python programmers, it’s the object the method is called on. Ignore that, and look at the second parameter, which is a function with no arguments, returning a boolean.

Closures are well explained near the end of the Rust for Rubyists, Fizzbuzz chapter. They are essentially an anonymous function, declared by listing their parameters within ||, and their code in a block afterwards.

You give a closure to your iterator, and it calls it once each time through the loop. This will be familiar if you know Javascript.

The bool that the closure returns tells the iterator whether to keep running or not. With the for sugar, true is assumed unless you call break. With the un-sugared version you have to explicitly say true.
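
The bool protocol can be sketched by hand in current Rust. Today's for loops no longer work this way, so treat this as an illustration of the idea only; each and collect_until are hypothetical names.

```rust
// Hypothetical iterator function: calls `f` once per element and
// stops as soon as `f` returns false.
fn each<F: FnMut(i32) -> bool>(items: &[i32], mut f: F) {
    for &item in items {
        if !f(item) {
            break;
        }
    }
}

// Collects elements until `stop_at` is seen.
fn collect_until(items: &[i32], stop_at: i32) -> Vec<i32> {
    let mut seen = Vec::new();
    each(items, |n| {
        if n == stop_at {
            return false; // tell the iterator to stop
        }
        seen.push(n);
        true // tell the iterator to keep going
    });
    seen
}

fn main() {
    println!("{:?}", collect_until(&[1, 2, 3, 4], 3));
}
```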

Finally (yes there’s a lot going on in Rust) notice that it just says true, not return true;. A block in Rust is an expression (as opposed to a statement), so it can evaluate to something. If you end your block with a semi-colon, it doesn’t have a value. If you don’t end with a semi-colon, it has the last thing you wrote. So you can set a variable like this:

let has_sanity =
    if 1 == 1 { true }
    else { false };

By just saying true at the end of the closure you give to the iterator, the whole block evaluates to true, and the loop keeps going.

This is clearly explained in the tutorial’s Syntax basics section, in Expressions and semicolons.

Read a file

The loading itself is straightforward: turn the string filename into a Path object (with path::Path), build a Reader object (with io::file_reader) and return all the lines (with file.read_lines()).

The interesting thing here is the Result object which wraps multiple returns, and is how error handling in Rust is often done.

Result (enum has gone away in the new version, it seems) is a disjoint enumeration, containing either the success result (a Reader here), or an error (here a string). Many Rust methods return a Result, and Rust’s pattern matching is often used to check for errors.

A more “Rustic” way of writing this uses Pattern matching:

match is similar to switch. The Result enum contains two variants: Ok and Err.
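
For a current compiler, reading a file with Result and match might look roughly like this. The library calls differ from 0.7pre (File::open and BufReader instead of io::file_reader), and read_lines is my own hypothetical function, so take it as a sketch of the pattern rather than the original code.

```rust
use std::fs::File;
use std::io::{BufRead, BufReader};
use std::path::Path;

// Returns every line of the file, or an error message as a String.
fn read_lines(filename: &str) -> Result<Vec<String>, String> {
    let path = Path::new(filename);
    // match picks apart the two variants of Result.
    let file = match File::open(path) {
        Ok(file) => file,
        Err(e) => return Err(format!("cannot open {}: {}", filename, e)),
    };
    BufReader::new(file)
        .lines()
        .collect::<Result<Vec<String>, _>>()
        .map_err(|e| e.to_string())
}

fn main() {
    match read_lines("hello_world.rs") {
        Ok(lines) => println!("{} lines", lines.len()),
        Err(message) => println!("error: {}", message),
    }
}
```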

Oh, and the angle brackets in Result<@Reader, ~str>? Yes, Rust has generics. Hopefully you know them from Java or C#. They’re a way of making static typing more flexible, so that you can for example define a Map which works on any type, but is still type checked by the compiler. The types in the map are specified when you create an instance of it.
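
A tiny generics sketch in current Rust, using the standard HashMap as the map. The count_items function is hypothetical; K is filled in by the compiler at each call site.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Works for any key type K that can be hashed and compared.
fn count_items<K: Hash + Eq>(items: Vec<K>) -> HashMap<K, usize> {
    let mut counts = HashMap::new();
    for item in items {
        *counts.entry(item).or_insert(0) += 1;
    }
    counts
}

fn main() {
    // Here K is &str.
    let words = count_items(vec!["a", "b", "a"]);
    println!("{:?}", words.get("a"));

    // Here K is i32; same function, still type checked.
    let numbers = count_items(vec![1, 1, 2]);
    println!("{:?}", numbers.get(&1));
}
```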

Connect to a socket

First we declare that we’re using the std external crate (meaning library). Until now we were only using core, which is imported by default (this has changed in 0.8 core->std, and std->extra).

Then we declare which parts of std we use. We don’t have to do this, but it allows us to avoid prefixing everything with std::.

We turn the IP address we’re connecting to (US weather service) into an internal representation, connect to port 80, wrap the socket in a Reader (at the socket_buf call) and finally in the last line read everything and return it (sock.read_lines()).

Of particular interest is this line:

let iotask = uv::global_loop::get();

All IO in Rust is non-blocking, so that Rust can be highly concurrent using tasks, lightweight threads similar to Go’s go-routines, Python’s gevent greenlets, or node.js. It uses libuv for this.

Here we say to run the blocking I/O task on libuv’s global loop. Yeah, I don’t know what that means either. Let’s move on.
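
For comparison, a socket sketch in current Rust. Several assumptions: it uses the blocking std::net API (no libuv, no iotask), and instead of the weather service it talks to a throwaway server on the loopback interface so it runs without network access. fetch_lines and spawn_server are hypothetical names.

```rust
use std::io::{self, BufRead, BufReader, Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;

// Connects, sends the request, and returns every line of the reply.
fn fetch_lines(addr: &str, request: &str) -> io::Result<Vec<String>> {
    let mut sock = TcpStream::connect(addr)?;
    sock.write_all(request.as_bytes())?;
    sock.shutdown(Shutdown::Write)?; // tell the server we are done sending
    // Wrap the socket in a buffered reader and read until it closes.
    BufReader::new(sock).lines().collect()
}

// Stand-in server: reads the whole request, answers once, then exits.
fn spawn_server() -> io::Result<String> {
    let listener = TcpListener::bind("127.0.0.1:0")?; // port 0: any free port
    let addr = listener.local_addr()?.to_string();
    thread::spawn(move || {
        if let Ok((mut conn, _)) = listener.accept() {
            let mut request = String::new();
            let _ = conn.read_to_string(&mut request);
            let reply = format!("got: {}\ndone\n", request.trim());
            let _ = conn.write_all(reply.as_bytes());
        } // conn is dropped here, which closes the socket
    });
    Ok(addr)
}

fn main() -> io::Result<()> {
    let addr = spawn_server()?;
    for line in fetch_lines(&addr, "GET / HTTP/1.0\r\n\r\n")? {
        println!("{}", line);
    }
    Ok(())
}
```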

Objects

Objects in Rust are a struct to hold the data, plus some functions grouped in an impl block.

The first function new is static. That’s the preferred way to define constructors. It doesn’t have to be called new, and constructors are optional (we’re just making a struct).

The second and third functions are the methods. They take the object itself as their first argument &self, just like in Python.

In new and when we’re using a functional style, without explicit return statements. The other item of note is that you cast with as. We’re casting an i32 (32-bit int) coming from time::now().tm_year to a regular int (size machine-dependent).
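
A struct-plus-impl sketch in current Rust. The Person type and its fields are made up, and the year is passed in as an argument because the time module used in 0.7pre is not part of today's standard library. isize is the current name for the machine-sized int.

```rust
struct Person {
    name: String,
    birth_year: i32,
}

impl Person {
    // No `self` parameter: a static function, used as a constructor.
    fn new(name: &str, birth_year: i32) -> Person {
        // No `return` and no semicolon: the struct is the block's value.
        Person { name: name.to_string(), birth_year }
    }

    // Methods take `&self` as their first argument.
    fn age_in(&self, year: i32) -> isize {
        (year - self.birth_year) as isize // cast with `as`
    }

    fn describe(&self, year: i32) -> String {
        format!("{} is {}", self.name, self.age_in(year))
    }
}

fn main() {
    let bob = Person::new("Bob", 1980);
    println!("{}", bob.describe(2013));
}
```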

Use an external module – sqlite3

Let’s build and use an external module – a sqlite wrapper. Rust has a package manager called rustpkg which will install modules for you, but for now we’ll do it manually. Make sure you have SQLite 3 development files, package libsqlite3-dev in Ubuntu / Debian.

git clone git://github.com/linuxfood/rustsqlite.git
cd rustsqlite
rustc sqlite.rc

Compiling it will give you libsqlite-<something>.so. The .rc is just a convention for .rs files which contain libraries, and that convention may change soon. Anyway, let’s use that library:

First we declare usage of the sqlite library, just like we did for std previously.

In the let database = part we’re using pattern matching, and the functional style evaluation. In the Ok case, we just wrote db, so db gets returned. This behaves like database = db.

At line 12 we set result (a Result) to be mutable, because we re-use it at line 16.

There are more examples of using sqlite in the test suite at the bottom of rustsqlite’s sqlite.rc.

To compile it you need to tell rustc where to find the library you are using. Assuming you copied libsqlite-<something>.so into the current directory, you just:

rustc -L . use_sqlite.rs

Mutable pointers

There’s one more bit related to memory management that might trip you up. Everything in Rust is immutable by default (constant). To make a variable mutable, you simply say mut in front of it.

let mut x: int;

Easy enough, right? The trick is that in the case of a managed (@) pointer, there are two things which can change – the data the pointer points to, or the pointer itself. This is also the case for unique pointers, but unique pointer contents inherit the mutability of the variable pointing to them. Managed pointers do not.
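
The same distinction survives in current Rust, though spelled differently. In this sketch I am assuming Box for the unique pointer and Rc with RefCell as the rough equivalent of a managed pointer to mutable data; Counter and bump_shared are hypothetical names.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Counter {
    hits: u32,
}

// Mutates the contents through an immutable shared handle.
fn bump_shared(counter: &Rc<RefCell<Counter>>) {
    counter.borrow_mut().hits += 1;
}

fn main() {
    // Owned pointer: the contents are mutable only because the
    // variable itself is declared `mut`.
    let mut owned = Box::new(Counter { hits: 0 });
    owned.hits += 1;

    // Shared pointer: the variable is immutable, the mutability
    // lives inside the pointed-to data.
    let shared = Rc::new(RefCell::new(Counter { hits: 0 }));
    let alias = Rc::clone(&shared);
    bump_shared(&alias);

    println!("{} {}", owned.hits, shared.borrow().hits);
}
```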

Multiple files

Each file is a module, and several files can make up a binary or library. This is module m2:

m2.rs

Names (functions, structs, etc) are private by default. Adding pub makes them public, accessible from other modules.

m1.rs

You reference other modules with mod <name>. By default module x is stored in file x.rs. Directories create a hierarchy of modules.

To compile: rustc m1.rs. The compiler will include m2.rs automatically.
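
A sketch of the two modules in current Rust. To keep it to one file I have written m2 as an inline module; with separate files you would put its body in m2.rs and write mod m2; instead. The functions double and helper are hypothetical.

```rust
// Stand-in for m2.rs, written inline so the sketch fits in one file.
mod m2 {
    // `pub` makes this visible from other modules.
    pub fn double(n: i32) -> i32 {
        helper(n) * 2
    }

    // Private by default: only code inside m2 can call it.
    fn helper(n: i32) -> i32 {
        n
    }
}

// Stand-in for m1.rs.
fn main() {
    println!("{}", m2::double(21));
    // m2::helper(1); // would not compile: `helper` is private
}
```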

That’s everything I’ve learnt so far! More soon.

10 Comments »

  1. Frederick said,

    June 21, 2014 at 22:34

    I’ve tried to compile your code but I’m getting errors all over the place. My rust compiler keeps crying about the ~ and the str and so much other stuff. I don’t think I want to use rust if code you wrote only a year ago won’t compile today! That means that any code I write today will be unusable in a year unless I use a really really really old version of the rust compiler. Well I don’t want to do that so I think I just won’t use rust. It’s just too immature!

  2. Artur said,

    February 9, 2014 at 16:12

    Is it possible to have a module, lets call it foo with various files that belong to that module. Like this: src/foo src/foo/bar1.rs src/foo/bar2.rs

    bar1 would be a struct with its implementation and bar2 would be an other struct. I don’t want to keep everything in one file. You loose overview. But I don’t want an extra module bar1 and bar2 only for one struct. Thanks for any help.

  3. Austin King said,

    July 16, 2013 at 19:10

    I was stuck on a mutable pointer problem, thank you so much for this great post. Being able to have a struct pointer be immutable, but it’s elements be mutable is what I was missing, so that a callback could modify state.

  4. Evan Byrne said,

    July 4, 2013 at 01:49

    Nice overview of the language! Helped me out a good deal. Thanks!

  5. Graham King » We are all polyglots said,

    May 3, 2013 at 03:31

    […] We’ve been replacing C as our serious language since the 70s. C++ mostly succeeded, and became the official language of Microsoft Windows. Objective-C got a solid niche when Apple chose it for OSX, and later iOS. Java, became the serious language of web apps, and is now the language of Android. The two recent exciting developments here are Go and Rust. […]

  6. graham said,

    April 30, 2013 at 17:35

    @Richard Did you check out the ‘incoming’ branch? At the time of writing I think master was 0.6 and incoming 0.7pre. But in the past few days incoming has been merged to master. It moves fast!

  7. Mike Spadaru said,

    April 30, 2013 at 09:36

    Nice! Thank you!

  8. Richard said,

    April 30, 2013 at 04:26

    oops.. I forgot to mention that in the sqlite example “rustc -L . use_sqlite.rs” didn’t work (rust 0.6). i needed to remove ‘use’ from that because it thought use was a file.. i got “error: multiple input filenames provided”

    i followed your ‘script’ to clone from git.. I wonder why I didn’t get 0.7?

  9. Richard said,

    April 30, 2013 at 04:19

    well that was fun, that’s a bunch!

  10. Jonathan A Dunlap said,

    April 29, 2013 at 17:46

    Thanks for the great write-up on Rust! You covered a good number of the basics.
