Niels learns Rust 1 — Why Rust?
This is part 1 in my journey to learn Rust by porting my embedded Java virtual machine to it. Click here for the whole series.
I love learning new programming languages. A while ago, I made a list of all the languages I have ever worked with, and got to 33. BBC Basic was my first, Scala the most recent addition.
In this series of posts I plan to describe my journey to add number 34: Rust.
Why learn yet another new language?
The more languages you know, the easier it becomes to pick up a new one. Patterns start repeating and improvements get more incremental. Getting the hang of functional programming takes a bit longer, but if you already know Haskell, Scala is easier to pick up and vice versa.
There is simply a limited set of concepts. Each language picks a different subset and makes slightly different tradeoffs.
But I still always learn something new from picking up a new language. It may have some unique features: Scala’s implicits were new to me. Or the community may take a slightly different approach to common problems: we can learn a lot from Golang’s conscious choice not to support many features for the sake of simplicity.
Even if you never use a language in practice, knowing what’s out there makes you a better developer.
Why Rust?
So why learn Rust?
Because it does something fundamentally different from all the other languages I’ve worked with. It gives us a brand new approach to one of the oldest problems in programming: memory management. Rust’s approach requires a new way to think about your code, the subtleties of which will take much longer to master than Golang or Scala. And that’s what makes it interesting to me: an opportunity to learn and improve as a developer.
C and C++ are hard because they force you to manually manage your memory, with all the risks that come with it. This is error prone and leads to crashes and security vulnerabilities, but luckily we have a pretty good solution: garbage collection.
For most problems, paying the small performance price for it is an easy choice, but there are cases where we can’t use garbage collection. For these, Rust now offers a new alternative, where we need neither to free() memory ourselves nor to depend on a garbage collector.
I’m also curious to see how learning Rust will influence the way I design my code. Rust’s way of doing memory management doesn’t come for free. It depends on very strict rules to ensure memory safety, and so requires more up front design. The promise is that this will lead to better code. Will it?
Rust’s approach to memory management
You can argue about whether Rust’s approach is really a new, third, solution to this problem. It doesn’t rely on a garbage collector and still automatically frees up memory for you, which at first made me think it was.
But having studied it for a bit longer, I now prefer to think of it as another form of manual memory management. One that’s verified by the compiler, so any mistake you make is now a compile-time error rather than a runtime crash or security vulnerability.
Yes, Rust will automatically free() memory for you, but the fundamental problem isn’t deciding when to free memory. The problem is clearly defining who owns a piece of data.
Rust makes ownership and borrowing very explicit. You should have been thinking about this in your C or C++ code anyway, and in Rust the compiler forces you to do so.
Once you’ve structured your code in a way that conforms to Rust’s ownership rules, freeing up memory becomes trivial: a value can be freed (dropped, in Rust terms) whenever the variable that owns it
- gets assigned a new value, or
- goes out of scope.
Rust can do this because its ownership rules ensure there is always only one owner of a value, and no borrows (references) of a value will outlive the value itself.
This little example:
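The listing below is a reconstruction that is consistent with the output shown. It assumes a Foo struct with a single data field, a derived Debug implementation, and a Drop implementation that prints a message when the value is freed; treat it as a sketch rather than the exact original code.

```rust
// Assumed definition: a struct with one integer field, printable with {:?}.
#[derive(Debug)]
struct Foo {
    data: i32,
}

// Drop runs when a Foo value is freed, so we can see exactly when that happens.
impl Drop for Foo {
    fn drop(&mut self) {
        println!("Dropped {}.", self.data);
    }
}

fn main() {
    let a = Foo { data: 42 };
    let mut b = Foo { data: 43 };
    println!("a: {:?}", a);
    println!("b: {:?}", b);

    // The old value of b loses its owner here and is dropped.
    // The value of a is moved into b, so a can no longer be used.
    b = a;
    println!("b: {:?}", b);

    // b goes out of scope at the end of main, dropping the remaining value.
}
```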
will print:
a: Foo { data: 42 }
b: Foo { data: 43 }
Dropped 43.
b: Foo { data: 42 }
Dropped 42.
The value Foo { data: 43 }, originally stored in b, is dropped at b = a; because it lost its owner when b was assigned a new value (reason 1).
At the same time the variable a becomes unusable because its value Foo { data: 42 } has been moved and is no longer owned by a.
This value, Foo { data: 42 }, now owned by b, is dropped at the end of the main function, when b goes out of scope (reason 2).
Try it on the Rust playground.
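The difference between moving a value and borrowing it can be sketched in the same style. The functions inspect and consume are hypothetical names chosen for this illustration: the first only borrows its argument, the second takes ownership of it.

```rust
#[derive(Debug)]
struct Foo {
    data: i32,
}

// Borrows the value: the caller keeps ownership and can keep using it.
fn inspect(foo: &Foo) -> i32 {
    foo.data
}

// Takes ownership: the value is dropped when this function returns.
fn consume(foo: Foo) -> i32 {
    foo.data
}

fn main() {
    let a = Foo { data: 42 };

    let seen = inspect(&a); // a is only borrowed here
    println!("a is still usable: {:?}", a);

    let taken = consume(a); // a is moved here
    // println!("{:?}", a); // would not compile: a has been moved

    println!("seen: {}, taken: {}", seen, taken);
}
```

Uncommenting the marked line makes the compiler reject the program, which is the compile-time check described above.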
That sounds rather restrictive
It is. There are many perfectly correct C programs that don’t conform to Rust’s strict ownership rules. The rules are limiting, but we know many examples where imposing limits on our code eventually leads to better designs.
There are good reasons why we don’t like to use goto anymore, why we don’t make all class members public, why support for immutable data is getting more common, and why Golang limits itself to a minimalist set of features.
We impose these limits on ourselves because we believe it leads to better code in the long run, and Rust’s ownership restrictions are no different.
I’m only just starting to learn Rust, so I have no gut feeling for how limiting it will be in practice, but I suspect that after a while thinking about ownership becomes second nature. And hopefully, code that doesn’t follow the rules will start to feel a bit messy.
If that’s the case, learning Rust will have made me a better developer in any language.
The project
So, I want to learn Rust. I picked up Programming Rust, 2nd Edition, which is excellent. It’s pretty dense at times (which I like), but it all makes sense to me when I read it. I did a bunch of the exercises, which aren’t too hard, but did I really get it? I didn’t feel I did.
To really learn a big language with new patterns like Rust, you have to do a project in it. Using it is the only way to really learn.
So I decided that porting my PhD work to Rust would be a good exercise. It’s a Java virtual machine for resource-constrained embedded CPUs that uses Ahead-of-Time compilation. If you’re interested in the details, here’s my thesis, but for now it’s enough to know it’s a JVM developed for tiny devices with about 4 KB of RAM, and that it compiles the JVM bytecode to native code to improve performance.
Several things make this an interesting case study for Rust:
- A virtual machine is a good use case for a systems language. We can’t use garbage collection here, because it’s one of the things we have to build.
- It’s very low level. Compiling the JVM bytecode to native code means a lot of binary manipulation of memory and instructions, making sure every byte is in the right place.
- On such a restricted device, every byte counts. A byte spent on the VM can’t be spent on the application, so if it saves some memory, we prefer dirty tricks over nice abstractions.
- These devices often have a Harvard architecture. I’m curious to see how Rust handles this.
- There was at least one hard-to-find null pointer bug when I was developing this VM in C, which cost me two days to track down. This is exactly what Rust promises to prevent.
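On the null pointer point: safe Rust has no null references, and a value that may be absent is expressed as an Option that has to be checked before use. A minimal sketch, where ObjectTable and size_of are hypothetical names invented for this illustration and not taken from the actual VM:

```rust
// Hypothetical stand-in for a VM lookup table; not the real VM's data structure.
struct ObjectTable {
    sizes: Vec<u16>,
}

impl ObjectTable {
    // Returns None instead of a null pointer when the id is unknown.
    fn size_of(&self, id: usize) -> Option<u16> {
        self.sizes.get(id).copied()
    }
}

fn main() {
    let table = ObjectTable { sizes: vec![8, 24] };

    // The compiler forces both cases to be handled before the value is used.
    match table.size_of(5) {
        Some(size) => println!("object 5 takes {} bytes", size),
        None => println!("no object with id 5"),
    }
}
```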
I have no idea how long this project is going to take, but I expect to learn a lot from it. I’m particularly curious to what extent I can stay within Rust’s safety guarantees, and where I’ll have to resort to unsafe {} code. This seems much more acceptable in Rust than in languages like C#, but of course it should still be avoided where possible.
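For readers who haven’t seen it yet, this is roughly what an unsafe block looks like. The function read_byte is a hypothetical helper made up for this sketch; it reads through a raw pointer only after a bounds check, so the unsafe part stays small and is wrapped in a safe interface.

```rust
// Hypothetical helper: reads one byte from a buffer through a raw pointer.
fn read_byte(buffer: &[u8], index: usize) -> Option<u8> {
    if index >= buffer.len() {
        return None;
    }
    // SAFETY: index was checked against the buffer length above,
    // so the pointer stays inside the slice.
    let value = unsafe { *buffer.as_ptr().add(index) };
    Some(value)
}

fn main() {
    let code = [0x2a_u8, 0x10, 0xb1];
    println!("{:?}", read_byte(&code, 1)); // prints Some(16)
    println!("{:?}", read_byte(&code, 7)); // prints None
}
```

In this particular case the safe call buffer.get(index).copied() does the same job, which illustrates why unsafe should be the exception.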
I’ll keep writing new posts as the project progresses, so if you’re interested keep reading!