Reaching the (current) limits of Rust's type system with asynchronous programming

I recently wanted to implement a simple command-line tool to monitor devices connected to my home router and to help troubleshoot connection issues. My goal was to create a simple TUI listing properties of connected devices (e.g. MAC/IP addresses, WiFi speed), and automatically refreshing every few seconds. This was to provide an alternative to the default web interface provided by my ISP’s router.

I decided to implement this tool in Rust, in particular to get some practice on the async/.await syntax that was stabilized about a year ago. The result is available on GitHub.

However, what started as a simple experiment ended up being a longer journey, leading me to the current limits of Rust’s type system, exploring lifetimes, traits, futures and Higher-Rank Trait Bounds. In this blog post, I want to share my take-aways as well as some open questions that I reached when mixing all of these types together.

Overview
- Code structure and useful crates
- Remarks
Asynchronous programming with async/await: a few examples
Traits over async functions
Implementing a generic retry loop over an async function
Conclusion

Demo of my monitoring tool, recorded with asciinema.

Overview

Without going into the details of my router’s specifics, the general architecture of my monitoring tool is the following.

Code structure and useful crates

Main (main.rs): Every Rust program starts with a main function. In this case, it initializes the Tokio runtime for asynchronous programming.
Command-line interface (cli.rs): Thanks to the clap crate, it’s quite easy to define command-line parameters such as the IP address of the router, authentication parameters, refreshing frequency, timeouts, etc. This library takes care of everything: parsing and validating the command-line arguments, generating help, as well as suggestions if the user made a typo!
Fetching data from the router (connect_box.rs): Thanks to the reqwest crate, we can define an HTTP client, setup cookies and timeouts, and fetch monitoring information from the router.
Parsing results from the router (types.rs): The router returns information about connected devices in XML format. Thanks to the serde_derive and serde-xml-rs crates, one can just define the XML schema as Rust types, and serde will take care of generating a parser for it.
Text-based user interface (tui.rs): The ncurses crate provides bindings to the ncurses library. There’s not a lot to say about it, but it lets you create user interfaces directly in the terminal.
Demonstration (demo.rs): I also created a fake implementation of the router interface, which returns some dummy values for demonstration purposes. It’s useful to test the UI, and to try the tool without having my router model.

Remarks

The env_logger crate lets you add logging statements to the program and then choose what to print at runtime, depending on the $RUST_LOG environment variable. For example, RUST_LOG=debug will print all messages with debug priority or higher, while RUST_LOG=connect_box=debug will only print debug messages coming from the connect_box crate/module. This is very useful to enable different logging levels for different crates/modules.
Traits are a very powerful tool of Rust’s type system - similar to Java’s interfaces. Once I implemented my program, I decided to add some demo router implementation to serve fake values. By creating an additional router trait, I could easily inject this demo implementation in my program without having to do a lot of code refactoring. There are however some caveats, which we’ll discuss below.
Defining automatic parsers from a JSON or XML schema with serde_derive is quite powerful. However, beware that the auto-generated parsers will silently ignore unknown fields, but will fail at runtime if a field that you declared as required is actually missing. I didn’t investigate further, but that’s at least the default behavior.
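To make the remark about traits more concrete, here is a minimal sketch of the injection pattern. It is deliberately synchronous and simplified: the `Device` record, the `Router` trait shown here, `DemoRouter` and `render` are hypothetical names for illustration, not the actual definitions from my tool (where the trait methods are asynchronous).

```rust
/// Hypothetical, simplified device record (the real tool parses more fields).
#[derive(Debug, Clone, PartialEq)]
struct Device {
    mac: String,
    ip: String,
}

/// Hypothetical router interface. The real one is asynchronous.
trait Router {
    fn devices(&mut self) -> Result<Vec<Device>, String>;
}

/// Demo implementation that serves fixed fake values.
struct DemoRouter;

impl Router for DemoRouter {
    fn devices(&mut self) -> Result<Vec<Device>, String> {
        Ok(vec![Device {
            mac: "00:11:22:33:44:55".to_string(),
            ip: "192.168.0.10".to_string(),
        }])
    }
}

/// The UI code only depends on the trait, so either a real or a demo
/// implementation can be injected without further refactoring.
fn render(router: &mut dyn Router) -> Result<Vec<String>, String> {
    let devices = router.devices()?;
    Ok(devices
        .iter()
        .map(|device| format!("{} {}", device.mac, device.ip))
        .collect())
}
```

Calling `render(&mut DemoRouter)` yields the formatted demo lines, and a type talking to the real router would be passed in exactly the same way.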

Asynchronous programming with `async`/`await`: a few examples

I mentioned in the introduction that one goal of this project was to get some hands-on experience with Rust’s async/await programming model. Here are some examples of asynchronous patterns I used in this project, and how to implement them in Rust.

Throttling requests

A common need is to throttle asynchronous requests to a certain frequency: for example, we want to refresh the dashboard every 5 seconds. A naive approach is to sleep for 5 seconds after each request, but this ignores the variable time that each request takes to complete, so the effective period becomes 5 seconds plus the duration of the request.

Instead, one can use the throttle primitive provided by the Tokio library. In Tokio version 0.2.x, a request loop with throttling between iterations looks like the following.

use futures::stream::{self, StreamExt}; // `StreamExt` provides `.next()`
use tokio::time;

let mut throttle = time::throttle(time::Duration::from_secs(5), stream::repeat(()));
loop {
    make_request().await;
    throttle.next().await;
}

The throttle object makes sure that 5 seconds have elapsed between two calls to .next().await on it (and yields immediately if make_request took more than 5 seconds).
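The difference with the naive "sleep 5 seconds" approach boils down to how long we wait after a request. The following sketch only models that arithmetic with the standard library. `remaining_delay` is a hypothetical helper written for this illustration; it is not part of Tokio's API and not how Tokio implements throttling internally.

```rust
use std::time::Duration;

/// Time left to wait before the next request, given the desired period and
/// the time that the previous request took. If the request took longer than
/// the period, there is nothing left to wait for.
fn remaining_delay(period: Duration, elapsed: Duration) -> Duration {
    period.saturating_sub(elapsed)
}
```

With a period of 5 seconds, a request that took 2 seconds leaves 3 seconds to wait, whereas a request that took 7 seconds leaves none. The naive approach would wait 5 more seconds in both cases.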

Gracefully exit upon interruption

Another useful thing is to handle interruption signals (e.g. when the user presses Ctrl+C) in a graceful manner. For example, we want to exit the UI loop but do some cleanup such as logging out from the router afterwards.

This can be done via the select! macro from the futures crate, together with Tokio’s signal module.

use futures::{select, FutureExt}; // `FutureExt` provides `.fuse()`
use log::info;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    select!(
        res = main_loop().fuse() => res,
        res = wait_interrupt().fuse() => res,
    )?;

    cleanup()
}

async fn wait_interrupt() -> Result<(), Box<dyn std::error::Error>> {
    tokio::signal::ctrl_c().await?;
    info!("CTRL-C received!");
    Ok(())
}

Rust and incompatible versions for dependencies

I previously mentioned that I used the reqwest crate to handle all asynchronous operations over the network, i.e. HTTP requests to the router. This crate provides async functions, and therefore we need to provide an asynchronous runtime, such as Tokio, to be able to use it.

One caveat though is that reqwest is already tightly integrated with Tokio version 0.2.x, and in particular depends on it. There is a newer version 0.3.x of Tokio, but as the semantic versioning suggests, it is incompatible with 0.2.x.

In principle, the Rust compiler won’t prevent us from depending on Tokio 0.3.x in our Cargo.toml manifest, as shown in this commit and the corresponding compiler output. However, if we do that, our program will be compiled with both Tokio 0.3.x (which we explicitly depend on) and Tokio 0.2.x (which we indirectly depend on via reqwest). And although this compiles, we obtain the following error at runtime!

thread 'main' panicked at 'there is no timer running, must be called from the context of Tokio runtime', /home/user/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.23/src/time/driver/handle.rs:25:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

When running with RUST_BACKTRACE=1 as suggested, we obtain the following backtrace for this error.

thread 'main' panicked at 'there is no timer running, must be called from the context of Tokio runtime', /home/user/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.23/src/time/driver/handle.rs:25:14
stack backtrace:
   0: rust_begin_unwind
             at /rustc/18bf6b4f01a6feaf7259ba7cdae58031af1b7b39/library/std/src/panicking.rs:475
   1: core::panicking::panic_fmt
             at /rustc/18bf6b4f01a6feaf7259ba7cdae58031af1b7b39/library/core/src/panicking.rs:85
   2: core::option::expect_failed
             at /rustc/18bf6b4f01a6feaf7259ba7cdae58031af1b7b39/library/core/src/option.rs:1213
   3: tokio::time::delay::delay_for
   4: reqwest::async_impl::client::Client::execute_request
   5: reqwest::async_impl::request::RequestBuilder::send
   6: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   7: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   8: std::thread::local::LocalKey<T>::with
   9: tokio::runtime::enter::Enter::block_on
  10: tokio::runtime::thread_pool::ThreadPool::block_on
  11: tokio::runtime::Runtime::block_on
  12: connect_box::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

We can see (at the very end of the first line) that the error comes from the tokio-0.2.23 crate, whereas our main function was (likely) using tokio-0.3.4 as specified in the Cargo.toml.

Traits over `async` functions

I previously mentioned that traits are a powerful language construct of Rust for dependency injection, for example for mocking a demonstration API. This is what I did for my demo, but there was one caveat: the API I wanted to mock was using async functions, and it turns out that Rust doesn’t yet support traits over async functions¹.

The corresponding compiler error is the following.

error[E0706]: functions in traits cannot be declared `async`
 --> src/router.rs:5:5
  |
5 |     async fn logout(&mut self) -> Result<(), Box<dyn std::error::Error>>;
  |     -----^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |     |
  |     `async` because of this
  |
  = note: `async` trait functions are not currently supported
  = note: consider using the `async-trait` crate: https://crates.io/crates/async-trait

However, as suggested by the error message, there is the async-trait crate which provides some degree of support at the moment, and this is what I used in my router trait. If you look more closely, you’ll notice that I used the ?Send parameter on the async_trait attribute.

#[async_trait(?Send)]
pub trait Router {
    async fn foo(x: Foo) -> Bar;
    // ...
}

Indeed, if I remove this parameter, I obtain the following error.

#[async_trait]
pub trait Router {
    // ...
}

error: future cannot be sent between threads safely
   --> src/connect_box.rs:31:85
    |
31  |       async fn devices(&mut self) -> Result<LanUserTable, Box<dyn std::error::Error>> {
    |  _____________________________________________________________________________________^
32  | |         let xml = self.get(ConnectBox::CMD_DEVICES).await?;
33  | |         trace!("XML: {}", xml);
34  | |         let result = serde_xml_rs::from_str(&xml)?;
35  | |         Ok(result)
36  | |     }
    | |_____^ future returned by `__devices` is not `Send`
    |
    = help: the trait `std::marker::Send` is not implemented for `dyn std::error::Error`
note: future is not `Send` as this value is used across an await
   --> src/connect_box.rs:156:17
    |
145 |             let res = self.get_impl(function).await;
    |                 --- has type `std::result::Result<std::string::String, std::boxed::Box<dyn std::error::Error>>` which is not `Send`
...
156 |                 self.reset().await?;
    |                 ^^^^^^^^^^^^^^^^^^ await occurs here, with `res` maybe used later
...
160 |         }
    |         - `res` is later dropped here
    = note: required for the cast to the object type `dyn futures::Future<Output = std::result::Result<types::LanUserTable, std::boxed::Box<(dyn std::error::Error + 'static)>>> + std::marker::Send`

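The reason for this compilation error is that the Send trait is not implemented by the dyn std::error::Error type, which appears in the result of the async function that we want to put in our trait. As the compiler note points out, such a result is kept alive across an .await point, so it becomes part of the future's state. The future that corresponds to our async function is therefore not Send, i.e. it cannot be moved to another thread, whereas #[async_trait] requires this by default.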

Without going into the details about why this is the case, a simple solution was to use #[async_trait(?Send)], as explained in the async_trait crate’s documentation.
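For completeness, here is a sketch of the other way around this error, which I did not use in my tool: making the error type itself `Send`. It assumes that every error in the program can be converted to `Box<dyn Error + Send + Sync>`. All names below (`SendError`, `parse_port`, `fetch_port`, `fetch_twice`, `boxed_fetch`) are hypothetical, and the boxed return type only approximates the shape that the async_trait documentation describes for the generated methods.

```rust
use std::error::Error;
use std::future::Future;
use std::pin::Pin;

/// Error type that is allowed to cross threads, unlike plain `Box<dyn Error>`.
type SendError = Box<dyn Error + Send + Sync>;

/// Compile-time check: this only accepts values whose type is `Send`.
fn assert_send<T: Send>(_value: &T) {}

/// Hypothetical fallible step; `?` converts the parse error into `SendError`.
fn parse_port(text: &str) -> Result<u16, SendError> {
    Ok(text.trim().parse::<u16>()?)
}

/// Hypothetical async operation standing in for a router request.
async fn fetch_port(text: &str) -> Result<u16, SendError> {
    parse_port(text)
}

/// `first` is kept alive across the second `.await`, so it is stored inside
/// the future. With plain `Box<dyn Error>` as the error type, this would make
/// the future non-`Send`.
async fn fetch_twice(text: &str) -> Result<u16, SendError> {
    let first = fetch_port(text).await;
    let second = fetch_port(text).await?;
    Ok(first.unwrap_or(second))
}

/// A boxed future with an explicit `Send` bound, similar in shape to what a
/// trait method returns when `?Send` is not used.
fn boxed_fetch<'a>(
    text: &'a str,
) -> Pin<Box<dyn Future<Output = Result<u16, SendError>> + Send + 'a>> {
    Box::pin(fetch_twice(text))
}
```

The trade-off is that every error flowing through such functions has to be `Send + Sync`, which is why `?Send` was the simpler choice for a single-threaded TUI.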

Implementing a generic retry loop over an `async` function

The last thing I want to discuss, and where we’ll really reach the limits of Rust’s type system, is how to implement a generic retry loop over some asynchronous function.

The setting is relatively common: some asynchronous operation - typically a network request - may fail for various reasons, and depending on the reason we may want to retry it. For example, for my monitoring tool, we refresh the dashboard periodically, but if fetching data fails due to a timeout or a network connection issue, then we simply want to retry again until it works and we can update the dashboard.

A simplified view of this pattern is the following.

// Manages the state of a network connection.
struct State {}

impl State {
    // Some asynchronous operation that may fail.
    async fn operation(&mut self) -> bool {
        unimplemented!()
    }

    // Retry the operation until it succeeds.
    async fn retry_operation(&mut self) {
        loop {
            if self.operation().await {
                return;
            }
        }
    }
}

Now, in practice, the exact condition to retry or abort can be a bit more complicated than a boolean, and the loop may contain more attributes, such as the throttling that I mentioned above. Besides, we can have multiple types of operations that we want to retry in roughly the same configuration, for example HTTP requests (both GET and POST), logging in to the router, etc.

So it would be nice to have a generic retry loop that can be reused for various operations.

use std::future::Future;

impl State {
    // Naive attempt, which doesn't compile.
    async fn retry_loop<F, Fut>(&mut self, mut f: F)
    where
        F: FnMut(&mut Self) -> Fut,
        Fut: Future<Output = bool>,
    {
        loop {
            if f(self).await {
                return;
            }
        }
    }

    async fn retry_operation1(&mut self) {
        self.retry_loop(State::operation1).await
    }

    async fn retry_operation2(&mut self, parameter: Foo) {
        self.retry_loop(|this| this.operation2(parameter)).await
    }

    // [...] Definition of asynchronous operations 1 and 2.
}

The above draft fails to compile, with errors related to lifetimes. As always in this kind of situation, the best approach is to reduce the problem to a minimal example, find a solution for it, and then apply that solution back to one’s actual code.

In my case, this was complex enough that I asked about it on the Rust users forum, and I’ll now describe how to solve this problem (or not) in various cases.

Retrying a stateless operation

In the simplest case, the state doesn’t actually store anything (it’s an empty struct), and we have an operation on it that doesn’t take any parameter other than a mutable reference to the state. We can therefore call the operation via a lambda |foo| foo.operation(), or directly by naming it Foo::operation.

struct Foo;

impl Foo {
    async fn operation(&mut self) -> bool {
        unimplemented!()
    }

    async fn indirect_1(&mut self) {
        self.retry_loop(Foo::operation).await
    }

    async fn indirect_2(&mut self) {
        self.retry_loop(|foo| foo.operation()).await
    }
}

A first naive implementation without lifetimes doesn’t work (Rust Playground).

    async fn retry_loop<F, Fut>(&mut self, mut f: F)
    where
        F: FnMut(&mut Self) -> Fut,
        Fut: Future<Output = bool>,
    {
        loop {
            if f(self).await {
                return;
            }
        }
    }

error: implementation of `FnOnce` is not general enough
   --> src/lib.rs:11:14
    |
11  |           self.retry_loop(Foo::operation).await
    |                ^^^^^^^^^^ implementation of `FnOnce` is not general enough
    |
    = note: `FnOnce<(&'0 mut Foo,)>` would have to be implemented for the type `for<'_> fn(&mut Foo) -> impl Future {Foo::operation}`, for some specific lifetime `'0`...
    = note: ...but `FnOnce<(&mut Foo,)>` is actually implemented for the type `for<'_> fn(&mut Foo) -> impl Future {Foo::operation}`

error: lifetime may not live long enough
  --> src/lib.rs:15:31
   |
15 |         self.retry_loop(|foo| foo.operation()).await
   |                          ---- ^^^^^^^^^^^^^^^ returning this value requires that `'1` must outlive `'2`
   |                          |  |
   |                          |  return type of closure is impl Future
   |                          has type `&'1 mut Foo`

The following attempt to add some lifetimes doesn’t work either (Rust Playground).

    async fn retry_loop<'a, 'b, F, Fut>(&'a mut self, mut f: F)
    where
        'a: 'b,
        F: FnMut(&'b mut Foo) -> Fut,
        Fut: Future<Output = bool> + 'b,
    {
        loop {
            if f(self).await {
                return;
            }
        }
    }

error[E0499]: cannot borrow `*self` as mutable more than once at a time
  --> src/lib.rs:25:18
   |
18 |     async fn retry_loop<'a, 'b, F, Fut>(&'a mut self, mut f: F)
   |                             -- lifetime `'b` defined here
...
25 |             if f(self).await {
   |                --^^^^-
   |                | |
   |                | mutable borrow starts here in previous iteration of loop
   |                argument requires that `*self` is borrowed for `'b`

The problem with this approach is that the lifetime 'b is chosen once by the caller and fixed for the whole function: the first call to f(self) therefore borrows *self for all of 'b, and the next loop iteration cannot borrow it again. As a careful reading of this thread suggests, one has to use a higher-ranked lifetime instead, i.e. require that F works for all possible lifetimes 'b.
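To see what a higher-ranked bound buys us, it helps to look at a synchronous analogue, where the return type of the closure can name the lifetime directly. This is only an illustrative sketch: `Counter` and `bump_until` are hypothetical names that do not appear in my tool.

```rust
struct Counter {
    value: u32,
}

/// `for<'b>` means that `f` must accept a borrow of *any* lifetime, so each
/// loop iteration can hand out a fresh, short-lived reborrow of `counter`.
fn bump_until<F>(counter: &mut Counter, limit: u32, mut f: F)
where
    F: for<'b> FnMut(&'b mut Counter) -> &'b mut u32,
{
    loop {
        // The borrow taken here ends when `slot` goes out of scope.
        let slot = f(counter);
        *slot += 1;
        if *slot >= limit {
            return;
        }
    }
}
```

A call such as `bump_until(&mut counter, 3, |c| &mut c.value)` compiles because the returned reference is tied to the lifetime of each individual call. The difficulty in the asynchronous case is that the returned future has an unnameable type which also depends on that per-call lifetime.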

In essence, we’d want to write the following code, but unfortunately using impl Future at this place isn’t supported by the compiler at the moment (Rust Playground).

    async fn retry_loop<'a, F, Fut>(&'a mut self, mut f: F)
    where
        for<'b> F: FnMut(&'b mut Foo) -> (impl Future<Output = bool> + 'b),
    {
        loop {
            if f(self).await {
                return;
            }
        }
    }

error[E0562]: `impl Trait` not allowed outside of function and inherent method return types
  --> src/lib.rs:20:43
   |
20 |         for<'b> F: FnMut(&'b mut Foo) -> (impl Future<Output = bool> + 'b),
   |                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error[E0698]: type inside `async fn` body must be known in this context
  --> src/lib.rs:11:14
   |
11 |         self.retry_loop(Foo::operation).await
   |              ^^^^^^^^^^ cannot infer type for type parameter `Fut` declared on the associated function `retry_loop`
   |
note: the type is part of the `async fn` body because of this `await`
  --> src/lib.rs:11:9
   |
11 |         self.retry_loop(Foo::operation).await
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As mentioned in this reply, I ended up finding a solution from some other question. However, this solution requires the Rust nightly compiler (Rust Playground).

#![feature(type_alias_impl_trait)]

use std::future::Future;

type FutBool<'a> = impl 'a + Future<Output = bool>;

impl Foo {
    // ...

    async fn retry_loop<F>(&mut self, mut f: F)
    where
        for<'any> F: FnMut(&'any mut Foo) -> FutBool<'any>,
    {
        loop {
            if f(self).await {
                return;
            }
        }
    }
}

An interesting property of this solution is that it doesn’t work for the self.retry_loop(Foo::operation).await version.

error[E0271]: type mismatch resolving `for<'any> <for<'_> fn(&mut Foo) -> impl Future {Foo::operation} as FnOnce<(&'any mut Foo,)>>::Output == impl Future`
  --> src/lib.rs:15:14
   |
7  | type FutBool<'a> = impl 'a + Future<Output = bool>;
   |                    ------------------------------- the found opaque type
...
10 |     async fn operation(&mut self) -> bool {
   |                                      ---- the `Output` of this `async fn`'s expected opaque type
...
15 |         self.retry_loop(Foo::operation).await
   |              ^^^^^^^^^^ expected opaque type, found a different opaque type
   |
   = note: expected opaque type `impl Future` (opaque type at <src/lib.rs:10:38>)
              found opaque type `impl Future` (opaque type at <src/lib.rs:7:20>)
   = help: consider `await`ing on both `Future`s
   = note: distinct uses of `impl Trait` result in different opaque types

Also, I didn’t manage to generalize the FutBool type to use any output type instead of being specialized to bool (Rust Playground).

type Fut<'a, T> = impl 'a + Future<Output = T>;

error[E0271]: type mismatch resolving `<impl Future as Future>::Output == T`
 --> src/lib.rs:7:19
  |
7 | type Fut<'a, T> = impl 'a + Future<Output = T>;
  |              -    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `bool`, found type parameter `T`
  |              |
  |              this type parameter
  |
  = note:        expected type `bool`
          found type parameter `T`

So the solution is still not perfect, but we have something working for this simple use case.

Retrying a stateful operation

In a more realistic scenario - such as in my actual code - the state is not empty. In particular, we can consider the case where the state is bound by a lifetime, due to referencing some other object, for example a string slice &str.

struct Bar<'a> {
    data: &'a str,
}

impl<'a> Bar<'a> {
    async fn operation(&mut self) -> bool {
        unimplemented!()
    }

    async fn indirect_1(&mut self) {
        self.retry_loop(Bar::operation).await
    }

    async fn indirect_2(&mut self) {
        self.retry_loop(|bar| bar.operation()).await
    }
}

If we try the solution that worked in the previous section, we now obtain compiler errors related to lifetimes (Rust Playground).

error[E0700]: hidden type for `impl Trait` captures lifetime that does not appear in bounds
  --> src/lib.rs:21:31
   |
21 |         self.retry_loop(|bar| bar.operation()).await
   |                               ^
   |
   = note: hidden type `impl Future` captures lifetime '_#14r

error[E0477]: the type `impl Future` does not fulfill the required lifetime
 --> src/lib.rs:9:20
  |
9 | type FutBool<'a> = impl 'a + Future<Output = bool>;
  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
note: type must outlive the lifetime `'a` as defined on the item at 9:14
 --> src/lib.rs:9:14
  |
9 | type FutBool<'a> = impl 'a + Future<Output = bool>;
  |              ^^

It seems that the previous solution fails to capture the 'a lifetime of the Bar<'a> struct.

A somewhat different approach was suggested by steffahn in this reply. It doesn’t require any nightly feature, but instead one has to define some new trait specifically for this purpose.

trait AsyncFnMut<Arg>: FnMut(Arg) -> <Self as AsyncFnMut<Arg>>::Fut {
    type Fut: Future<Output=<Self as AsyncFnMut<Arg>>::Output>;
    type Output;
}

impl<Arg, F, Fut> AsyncFnMut<Arg> for F
where
    F: FnMut(Arg) -> Fut,
    Fut: Future,
{
    type Fut = Fut;
    type Output = Fut::Output;
}

impl<'a> Bar<'a> {
    // ...

    async fn retry_loop<F>(&mut self, mut f: F)
    where
        for<'any> F: AsyncFnMut<&'any mut Self, Output=bool>
    {
        loop {
            if f(self).await {
                return;
            }
        }
    }
}

Interestingly, this solution only works for the self.retry_loop(Bar::operation).await case (Rust Playground), which is exactly the opposite of the previous solution!

error[E0282]: type annotations needed
  --> src/lib.rs:31:31
   |
31 |         self.retry_loop(|bar| bar.operation()).await
   |                               ^^^ cannot infer type
   |
   = note: type must be known at this point

error: lifetime may not live long enough
  --> src/lib.rs:35:42
   |
35 |         self.retry_loop(|bar: &mut Self| bar.operation()).await
   |                               -        - ^^^^^^^^^^^^^^^ returning this value requires that `'1` must outlive `'2`
   |                               |        |
   |                               |        return type of closure is impl Future
   |                               let's call the lifetime of this reference `'1`

Based on this solution, I was still able to refactor one use case of my retry loop, in this commit.

Retrying a stateful operation with parameters

The last case I want to discuss is the most general case that came up in my code. Here, the operation takes a parameter, which we therefore have to capture via a lambda.

struct Bar<'a> {
    data: &'a str,
}

impl<'a> Bar<'a> {
    async fn operation(&mut self, param: usize) -> bool {
        unimplemented!()
    }

    async fn indirect(&mut self, param: usize) {
        // Because we need to capture `param`, we can only call the retry loop
        // as follows.
        self.retry_loop(|bar| bar.operation(param)).await
    }
}

Because the state Bar depends on some lifetime 'a, and the retry_loop has to take a lambda function, neither of the two previous solutions works in this case.

So for now, this is still an open question (at least for me), but don’t hesitate to reply to my question on the forum if you have a solution!
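One workaround that is commonly used for this kind of signature, and that differs from the two approaches above, is to stop naming the concrete future type and let the lambda return a boxed future whose lifetime is tied to the borrow. The sketch below is written under that assumption. The `BoxFut` alias, the `attempts` field and the tiny `block_on` helper are hypothetical additions for illustration (a real program would use the runtime's executor), and the price to pay is one heap allocation plus dynamic dispatch per attempt.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Boxed future that may borrow data for the lifetime `'b`.
type BoxFut<'b, T> = Pin<Box<dyn Future<Output = T> + 'b>>;

struct Bar<'a> {
    #[allow(dead_code)] // Only there to give `Bar` a lifetime parameter.
    data: &'a str,
    attempts: usize, // Hypothetical counter, to make the example observable.
}

impl<'a> Bar<'a> {
    /// Stand-in operation: succeeds once it has been attempted `param` times.
    async fn operation(&mut self, param: usize) -> bool {
        self.attempts += 1;
        self.attempts >= param
    }

    async fn retry_loop<F>(&mut self, mut f: F)
    where
        F: for<'b> FnMut(&'b mut Self) -> BoxFut<'b, bool>,
    {
        loop {
            if f(self).await {
                return;
            }
        }
    }

    async fn indirect(&mut self, param: usize) {
        self.retry_loop(|bar| Box::pin(bar.operation(param))).await
    }
}

/// Waker that does nothing; enough for futures that never actually wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, only suitable for this demonstration.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut context = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut context) {
            return output;
        }
    }
}
```

For instance, after `block_on(bar.indirect(3))` on a fresh `Bar`, the `attempts` counter has reached 3. Whether the extra allocation is acceptable depends on the use case; for a network request issued every few seconds it is likely negligible, but it does not answer the question of how to express the same thing without boxing.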

Conclusion

What started as a simple project led me to explore the limits of Rust types in many aspects. My last problem involves:

lifetimes, a distinctive feature of Rust as a programming language,
the impl Trait syntax, a powerful way of naming any type that implements a trait,
async functions, and in particular the Future trait,
higher-rank trait bounds,
the type_alias_impl_trait unstable feature.

It’s no wonder that mixing all of these together leads to difficult problems, and to an open question. But it’s also quite amazing that a practical programming language like Rust allows one to (almost) express such complex types!

¹ See Why async fn in traits are hard.
