tokio study notes

·

Some notes on tokio and async rust. I took these while working through the excellent tokio tutorial.

Overview

Tokio is an event-driven, non-blocking I/O platform for writing asynchronous applications with the Rust programming language. It excels at I/O tasks, where each individual task spends most of its time waiting for I/O.

It provides the building blocks for networking applications. It comprises of:

  • a multi-threaded runtime for executing async code
  • an async version of the standard library
  • an ecosystem of additional libraries

Not suitable for:

  • parallelization of cpu tasks (use rayon)
  • singular http requests (use reqwest)
  • reading a lot of files (use a thread pool)

Async in Rust

Asynchronous programming in rust is expressed through async/await. Rust compiles async fn into asynchronous routines, i.e. control is immediately returned to the thread. Rust's async operations are lazy, i.e. will only be executed once await is called.

The basics are documented in the book:

tokio

Builds on top of Rust primitives

#[tokio::main]
async fn main () {
	println!("hello there");
}

is transformed into

fn main() {
	let mut rt = tokio::runtime::Runtime::new9).unwrap();
	rt.block_on(async{
		println!("hello")
	})
}

Tasks

A Tokio task is an asynchronous green thread (a thread scheduled by the runtime/VM, not by OS.) A task is created by passing an async block to tokio::spawn, e.g.:

tokio::spawn(async move {
	process(socket).await;
});

move here is used to move the closure's environment into it. This means socket will be moved. A task's type's lifetime must be 'static', i.e. it must not contain any references to data owned outside it. If data must be shared, a synchronization primitive such as Arc needs to be used.

tokio::spawn returns a JoinHandle which can be awaited to obtain a Result.

Tasks must implement the Send trait to move the tasks between threads. This is a rust concurrency thing. (see chapter 16 in the book).

Tasks in Tokio are very lightweight. Under the hood, they require only a single allocation and 64 bytes of memory. Applications should feel free to spawn thousands, if not millions of tasks.

Shared State

Two approaches:

  • guard the shared state with a mutex (std)
  • spawn a task to manage the sate and use message passing to operate on it

Generally speaking: use the first approach. The second comes into play with I/O. The usual contention considerations apply.

Top-level mutexes in a task need to be handled specially, since they do not implement Send: Either wrap in a scope block or extract into a non-async function. Alternatively use a tokio async mutex (tokio::sync::Mutex), but this incurs a performance hit.

Handling contention

Use sharded mutexes.TIL: Java's ConcurrentHashmap is a sharded mutex.

Channels

Why use channels?

  • Tasks can't share resources freely (ownership)
  • Using a mutex would introduce bottlenecks
  • The alternative of creating connections per task is undesirable (why?) => Pattern: Message sending over channels

Types of channels:

  • mpsc: multi-producer, single-consumer channel. Many values can be sent.
  • oneshot: single-producer, single consumer channel. A single value can be sent.
  • broadcast: multi-producer, multi-consumer. Many values can be sent. Each receiver sees every value.
  • watch: multi-producer, multi-consumer. Many values can be sent, but no history is kept. Receivers only see the most recent value.

The standard library also has channels, but these are blocking. The async-channel crate has multi-producer, multi-consumer channels, where only one consumer sees each message.

Further study

  • https://tokio.rs/tokio/tutorial/io
  • https://tokio.rs/tokio/tutorial/async