RFC 0141: lifetime-elision

lang (typesystem | inference | lifetimes)

Summary

This RFC proposes to

  1. Expand the rules for eliding lifetimes in fn definitions, and
  2. Follow the same rules in impl headers.

By doing so, we can avoid writing lifetime annotations ~87% of the time that they are currently required, based on a survey of the standard library.

Motivation

In today's Rust, lifetime annotations make code more verbose, both for methods

fn get_mut<'a>(&'a mut self) -> &'a mut T

and for impl blocks:

impl<'a> Reader for BufReader<'a> { ... }

In the vast majority of cases, however, the lifetimes follow a very simple pattern.

By codifying this pattern into simple rules for filling in elided lifetimes, we can avoid writing any lifetimes in ~87% of the cases where they are currently required.

Doing so is a clear ergonomic win.

Detailed design

Today's lifetime elision rules

Rust currently supports eliding lifetimes in functions, so that you can write

fn print(s: &str);
fn get_str() -> &str;

instead of

fn print<'a>(s: &'a str);
fn get_str<'a>() -> &'a str;

The elision rules work well for functions that consume references, but not for functions that produce them. The get_str signature above, for example, promises to produce a string slice that lives arbitrarily long, and is either incorrect or should be replaced by

fn get_str() -> &'static str;

Returning 'static is relatively rare, and it has been proposed to make leaving off the lifetime in output position an error for this reason.

Moreover, lifetimes cannot be elided in impl headers.

The proposed rules

Overview

This RFC proposes two changes to the lifetime elision rules:

  1. Since eliding a lifetime in output position is usually wrong or undesirable under today's elision rules, interpret it in a different and more useful way.

  2. Interpret elided lifetimes for impl headers analogously to fn definitions.

Lifetime positions

A lifetime position is anywhere you can write a lifetime in a type:

&'a T
&'a mut T
T<'a>

As with today's Rust, the proposed elision rules do not distinguish between different lifetime positions. For example, both &str and Ref<uint> have elided a single lifetime.

Lifetime positions can appear as either "input" or "output":

The rules

Notice that the actual signature of a fn or impl is based on the expansion rules above; the elided form is just a shorthand.

Examples

fn print(s: &str);                                      // elided
fn print<'a>(s: &'a str);                               // expanded

fn debug(lvl: uint, s: &str);                           // elided
fn debug<'a>(lvl: uint, s: &'a str);                    // expanded

fn substr(s: &str, until: uint) -> &str;                // elided
fn substr<'a>(s: &'a str, until: uint) -> &'a str;      // expanded

fn get_str() -> &str;                                   // ILLEGAL

fn frob(s: &str, t: &str) -> &str;                      // ILLEGAL

fn get_mut(&mut self) -> &mut T;                        // elided
fn get_mut<'a>(&'a mut self) -> &'a mut T;              // expanded

fn args<T:ToCStr>(&mut self, args: &[T]) -> &mut Command                  // elided
fn args<'a, 'b, T:ToCStr>(&'a mut self, args: &'b [T]) -> &'a mut Command // expanded

fn new(buf: &mut [u8]) -> BufWriter;                    // elided
fn new<'a>(buf: &'a mut [u8]) -> BufWriter<'a>          // expanded

impl Reader for BufReader { ... }                       // elided
impl<'a> Reader for BufReader<'a> { .. }                // expanded

impl Reader for (&str, &str) { ... }                    // elided
impl<'a, 'b> Reader for (&'a str, &'b str) { ... }      // expanded

impl StrSlice for &str { ... }                          // elided
impl<'a> StrSlice<'a> for &'a str { ... }               // expanded

trait Bar<'a> { fn bound(&'a self) -> &int { ... }    fn fresh(&self) -> &int { ... } }           // elided
trait Bar<'a> { fn bound(&'a self) -> &'a int { ... } fn fresh<'b>(&'b self) -> &'b int { ... } } // expanded

impl<'a> Bar<'a> for &'a str {
  fn bound(&'a self) -> &'a int { ... } fn fresh(&self) -> &int { ... }              // elided
}
impl<'a> Bar<'a> for &'a str {
  fn bound(&'a self) -> &'a int { ... } fn fresh<'b>(&'b self) -> &'b int { ... }    // expanded
}

// Note that when the impl reuses the same signature (with the same elisions)
// from the trait definition, the expanded forms will also match, and thus
// the `impl` will be compatible with the `trait`.

impl Bar for &str            { fn bound(&self) -> &int { ... } }           // elided
impl<'a> Bar<'a> for &'a str { fn bound<'b>(&'b self) -> &'b int { ... } } // expanded

// Note that the preceding example's expanded methods do not match the
// signatures from the above trait definition for `Bar`; in the general
// case, if the elided signatures between the `impl` and the `trait` do
// not match, an expanded `impl` may not be compatible with the given
// `trait` (and thus would not compile).

impl Bar for &str            { fn fresh(&self) -> &int { ... } }           // elided
impl<'a> Bar<'a> for &'a str { fn fresh<'b>(&'b self) -> &'b int { ... } } // expanded

impl Bar for &str {
  fn bound(&'a self) -> &'a int { ... } fn fresh(&self) -> &int { ... }    // ILLEGAL: unbound 'a
}

Error messages

Since the shorthand described above should eliminate most uses of explicit lifetimes, there is a potential "cliff". When a programmer first encounters a situation that requires explicit annotations, it is important that the compiler gently guide them toward the concept of lifetimes.

An error can arise with the above shorthand only when the program elides an output lifetime and neither of the rules can determine how to annotate it.

For fn

The error message should guide the programmer toward the concept of lifetime by talking about borrowed values:

This function's return type contains a borrowed value, but the signature does not say which parameter it is borrowed from. It could be one of a, b, or c. Mark the input parameter it borrows from using lifetimes, e.g. [generated example]. See [url] for an introduction to lifetimes.

This message is slightly inaccurate, since the presence of a lifetime parameter does not necessarily imply the presence of a borrowed value, but there are no known use-cases of phantom lifetime parameters.

For impl

The error case on impl is exceedingly rare: it requires (1) that the impl is for a trait with a lifetime argument, which is uncommon, and (2) that the Self type has multiple lifetime arguments.

Since there are no clear "borrowed values" for an impl, this error message speaks directly in terms of lifetimes. This choice seems warranted given that a programmer implementing a trait with lifetime parameters will almost certainly already understand lifetimes.

TraitName requires lifetime arguments, and the impl does not say which lifetime parameters of TypeName to use. Mark the parameters explicitly, e.g. [generated example]. See [url] for an introduction to lifetimes.

The impact

To assess the value of the proposed rules, we conducted a survey of the code defined in libstd (as opposed to the code it reexports). This corpus is large and central enough to be representative, but small enough to easily analyze.

We found that of the 169 lifetimes that currently require annotation for libstd, 147 would be elidable under the new rules, or 87%.

Note: this percentage does not include the large number of lifetimes that are already elided with today's rules.

The detailed data is available at: https://gist.github.com/aturon/da49a6d00099fdb0e861

Drawbacks

Learning lifetimes

The main drawback of this change is pedagogical. If lifetime annotations are rarely used, newcomers may encounter error messages about lifetimes long before encountering lifetimes in signatures, which may be confusing. Counterpoints:

Programmers learn lifetimes once, but will use them many times. Better to favor long-term ergonomics, if a simple elision rule can cover 87% of current lifetime uses (let alone the currently elided cases).

Subtlety for non-& types

While the rules are quite simple and regular, they can be subtle when applied to types with lifetime positions. To determine whether the signature

fn foo(r: Bar) -> Bar

is actually using lifetimes via the elision rules, you have to know whether Bar has a lifetime parameter. But this subtlety already exists with the current elision rules. The benefit is that library types like Ref<'a, T> get the same status and ergonomics as built-ins like &'a T.

Alternatives

Unresolved questions

The fn and impl cases tackled above offer the biggest bang for the buck for lifetime elision. But we may eventually want to consider other opportunities.

Double lifetimes

Another pattern that sometimes arises is types like &'a Foo<'a>. We could consider an additional elision rule that expands &Foo to &'a Foo<'a>.

However, such a rule could be easily added later, and it is unclear how common the pattern is, so it seems best to leave that for a later RFC.

Lifetime elision in structs

We may want to allow lifetime elision in structs, but the cost/benefit analysis is much less clear. In particular, it could require chasing an arbitrary number of (potentially private) struct fields to discover the source of a lifetime parameter for a struct. There are also some good reasons to treat elided lifetimes in structs as 'static.

Again, since shorthand can be added backwards-compatibly, it seems best to wait.