Rust 2020 Reading Journal

Apr. 28, 2020 · Matt
  1. Apr-27-2020 - Rust Analyzer: First Release
  2. Apr-30-2020 - Rust + nix = easier unix systems programming
  3. May-22-2020 - Rust: Dropping heavy things in another thread can make your code 10000 times faster
  4. Jun-27-2020 - Examining ARM vs X86 Memory Models with Rust

Apr-27-2020

Title: Rust Analyzer: First Release

Rust Analyzer is an IDE backend for Rust that is implemented as a compiler frontend. It implements the Language Server Protocol and adopts rustc's lazy and incremental compilation features.

It now provides a better experience than RLS and aims to replace it as the official Language Server Protocol implementation for Rust.

It has dedicated plugins for Vim, Emacs, and VS Code, with VS Code being the first-class citizen.

The bad

  • Doesn't use rustc directly, so error detection is limited.
  • No cache persistence on disk; everything is kept in memory.

Apr-30-2020

Title: Rust + nix = easier unix systems programming

I stumbled on this while searching for nix on Hacker News. It turned out to be about a different nix: a Rust library that aims to provide a unified interface to *nix platform APIs (Linux, BSD, etc.).

I like this quote about systems programming:

Systems programming is programming where you spend more time reading man pages than reading the internet.

Using a C program that combines the kill and fork system calls, Kamal explains how things can go badly wrong because both calls communicate through easily conflated integer return codes.

Putting the two together, that program could really ruin our day. If the fork() call fails for some reason, we store -1 in child. Later, we call kill(-1, SIGKILL), which tries to kill all our processes, and most likely hose our login. Not even screen or tmux will save us!

Enter nix, and rust.

nix uses a Rust enum to model the result of calling fork.

This allows pattern matching to handle each case and prevents any conflation.

Rust's Result type also comes in handy for dealing with operations that might fail.

Fun fact: Kamal was one of my mentors during Rust Reach

May-22-2020

Title: Rust: Dropping heavy things in another thread can make your code 10000 times faster

An interesting optimization trick of deferring value dropping to a different thread.

The gist is that:

fn get_size(a: HeavyThing) -> usize {
    let size = a.size();
    std::thread::spawn(move || drop(a));
    size
}

is faster than dropping in the current thread. There is a gist that measures the performance improvement. It's interesting that the overhead of spawning a thread is low enough for this to work, wow!

Still, I would say: always prefer passing by reference to passing by value, so the function never takes ownership and never has to pay the drop cost in the first place.

Jun-27-2020

Title: Examining ARM vs X86 Memory Models with Rust

The way loads and stores to memory interact between multiple threads on a specific CPU is called that architecture’s Memory Model.

The memory model determines the order in which multiple writes by one thread become visible to other threads, and likewise for a thread performing multiple reads. This out-of-order execution is done to maximize memory-operation throughput.

Jim Keller explaining out-of-order execution.

Memory operations re-ordering by CPU affects the execution of multi-threaded code.

pub unsafe fn writer(u32_ptr_1: *mut u32, u32_ptr_2: *mut u32) {
    u32_ptr_1.write_volatile(58);
    u32_ptr_2.write_volatile(42);
}

pub unsafe fn reader(u32_ptr_1: *mut u32, u32_ptr_2: *mut u32) -> (u32, u32) {
    (u32_ptr_1.read_volatile(), u32_ptr_2.read_volatile())
}

If we initialize the contents of both pointers to 0 and then run each function in a different thread, we can list the possible outcomes for the reader. There is no synchronization, but based on our experience with single-threaded code we expect the possible return values to be (0, 0), (58, 0), or (58, 42). However, hardware reordering of memory writes in multithreaded code means there is a fourth option: (0, 42).

This issue is easily fixed using atomic types, which provide operations that synchronize updates between threads. Atomic memory orderings can further specify how atomic operations synchronize memory.

pub enum Ordering {
    Relaxed,
    Release,
    Acquire,
    AcqRel,
    SeqCst,
}

There is a risk that the same Rust code might compile to assembly that behaves differently on different CPUs.

The mapping between the theoretical Rust memory model and the X86 memory model is more forgiving of programmer error. It's possible for us to write code that is wrong with respect to the abstract memory model, but still have it produce correct assembly and work correctly on some CPUs.

Where the ARM memory model differs from X86 is that ARM CPUs will reorder writes relative to other writes, whereas X86 will not.

To ensure correctness, it's crucial to use the correct memory ordering in the Rust code.

Using the atomic module still requires care when working across multiple CPU architectures. As we saw from the X86 vs ARM assembly outputs, replacing Ordering::Release with Ordering::Relaxed on our store would take us back to a version that works correctly on X86 but fails on ARM. Care is especially required when working with AtomicPtr, to avoid undefined behavior when the pointed-at value is eventually accessed.