RFC 0236: Errors reform


Summary

This conventions RFC formalizes the basic rules for error handling in Rust libraries.

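The high-level overview is: fail the task for catastrophic errors and contract violations, and return a Result for obstructions.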

This RFC follows up on two earlier attempts by giving more leeway in when to fail the task.

Motivation

Rust provides two basic strategies for dealing with errors: task failure (fail!) and the Result type.

However, while there have been some general trends in the usage of the two handling mechanisms, we need to have formal guidelines in order to ensure consistency as we stabilize library APIs. That is the purpose of this RFC.

For the most part, the RFC proposes guidelines that are already followed today, but it tries to motivate and clarify them.

Detailed design

Errors fall into one of three categories: catastrophic errors, contract violations, and obstructions.

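The basic principle of the conventions is that catastrophic errors and contract violations fail the task, while obstructions are reported through Result.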

Catastrophic errors

An error is catastrophic if there is no meaningful way for the current task to continue after the error occurs.

Catastrophic errors are extremely rare, especially outside of libstd.

Canonical examples: out of memory, stack overflow.

For catastrophic errors, fail the task.

For errors like stack overflow, Rust currently aborts the process, but could in principle fail the task, which (in the best case) would allow reporting and recovery from a supervisory task.

Contract violations

An API may define a contract that goes beyond the type checking enforced by the compiler. For example, slices support an indexing operation, with the contract that the supplied index must be in bounds.

Contracts can be complex and involve more than a single function invocation. For example, the RefCell type requires that borrow_mut not be called until all existing borrows have been relinquished.
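As a rough illustration of both kinds of contract, the sketch below violates the slice indexing contract and the RefCell borrowing contract and observes the resulting failure. It is written against current Rust, where task failure surfaces as a panic and std::panic::catch_unwind can observe it; the helper names element_at and double_borrow are made up for this example.

```rust
use std::cell::RefCell;
use std::panic;

// Hypothetical helper: violates the slice indexing contract when `i` is
// out of bounds.
fn element_at(xs: &[i32], i: usize) -> i32 {
    xs[i]
}

// Hypothetical helper: violates the RefCell contract by calling borrow_mut
// while an earlier mutable borrow is still alive.
fn double_borrow() {
    let cell = RefCell::new(0);
    let _first = cell.borrow_mut();
    let _second = cell.borrow_mut(); // fails here
}

fn main() {
    // An in-bounds index satisfies the contract.
    assert_eq!(element_at(&[10, 20, 30], 1), 20);

    // Each violation unwinds; catch_unwind reports it as Err.
    let out_of_bounds = panic::catch_unwind(|| element_at(&[10, 20, 30], 7));
    assert!(out_of_bounds.is_err());

    let overlapping = panic::catch_unwind(double_borrow);
    assert!(overlapping.is_err());
}
```

Note that the RefCell violation spans two calls: each borrow_mut is fine in isolation, and only the overlap breaks the contract.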

For contract violations, fail the task.

A contract violation is always a bug, and for bugs we follow the Erlang philosophy of "let it crash": we assume that software will have bugs, and we design coarse-grained task boundaries to report, and perhaps recover from, these bugs.
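A minimal sketch of such a coarse-grained boundary follows. It assumes a std::thread thread as a stand-in for a task, which is a choice made for this example since the thread API postdates this RFC; buggy_worker and supervise are hypothetical names.

```rust
use std::thread;

// Hypothetical worker with a bug: it indexes one past the end of its input.
fn buggy_worker(data: Vec<u32>) -> u32 {
    data[data.len()] // contract violation: this index is never in bounds
}

// Hypothetical supervisor: runs the worker in its own thread and reports
// whether it finished or failed, without failing itself.
fn supervise(data: Vec<u32>) -> Result<u32, String> {
    thread::spawn(move || buggy_worker(data))
        .join()
        .map_err(|_| "worker failed".to_string())
}

fn main() {
    match supervise(vec![1, 2, 3]) {
        Ok(value) => println!("worker returned {}", value),
        Err(report) => println!("supervisor observed: {}", report),
    }
}
```

The worker contains no error-handling code for its own bug; the boundary is the only place where the failure is observed.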

Contract design

One subtle aspect of these guidelines is that the contract for a function is chosen by an API designer -- and so the designer also determines what counts as a violation.

This RFC does not attempt to give hard-and-fast rules for designing contracts. However, here are some rough guidelines:

Obstructions

An operation is obstructed if it cannot be completed for some reason, even though the operation's contract has been satisfied. Obstructed operations may have (documented!) side effects -- they are not required to roll back after encountering an obstruction. However, they should leave the data structures in a "coherent" state (satisfying their invariants, continuing to guarantee safety, etc.).
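To make the point about side effects concrete, here is a small sketch of an operation that does not roll back when it is obstructed, but documents what it leaves behind. The function append_parsed is hypothetical and exists only for this example.

```rust
use std::num::ParseIntError;

/// Parses each input and appends it to `out`.
///
/// Documented side effect: on a parse error, the values parsed before the
/// failing input remain in `out`; nothing is rolled back.
fn append_parsed(out: &mut Vec<i32>, inputs: &[&str]) -> Result<(), ParseIntError> {
    for s in inputs {
        let n: i32 = s.parse()?; // obstruction: the input is not a number
        out.push(n);
    }
    Ok(())
}

fn main() {
    let mut out = vec![0];
    let result = append_parsed(&mut out, &["1", "2", "x", "4"]);
    assert!(result.is_err());
    // The vector is still a valid Vec holding the values parsed so far.
    assert_eq!(out, vec![0, 1, 2]);
}
```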

Obstructions may involve external conditions (e.g., I/O), or they may involve aspects of the input that are not covered by the contract.

Canonical examples: file not found, parse error.

For obstructions, use Result.

The Result<T,E> type represents either a success (yielding T) or failure (yielding E). By returning a Result, a function allows its clients to discover and react to obstructions in a fine-grained way.
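The following sketch shows what reacting "in a fine-grained way" might look like for a parse error. PortError and parse_port are hypothetical names, not part of any library.

```rust
// Hypothetical error type listing the ways the operation can be obstructed.
#[derive(Debug, PartialEq)]
enum PortError {
    Empty,
    Invalid,
}

// Hypothetical parser: any string satisfies the contract, so bad input is an
// obstruction reported through Result rather than a failure.
fn parse_port(s: &str) -> Result<u16, PortError> {
    let s = s.trim();
    if s.is_empty() {
        return Err(PortError::Empty);
    }
    s.parse::<u16>().map_err(|_| PortError::Invalid)
}

fn main() {
    for input in &["8080", "", "http"] {
        // The client decides how to respond to each kind of obstruction.
        match parse_port(input) {
            Ok(port) => println!("{:?}: port {}", input, port),
            Err(PortError::Empty) => println!("{:?}: nothing to parse", input),
            Err(PortError::Invalid) => println!("{:?}: not a valid port", input),
        }
    }
}
```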

What about Option?

The Option type should not be used for "obstructed" operations; it should only be used when a None return value could be considered a "successful" execution of the operation.

This is of course a somewhat subjective question, but a good litmus test is: would a reasonable client ever ignore the result? The Result type provides a lint that ensures the result is actually inspected, while Option does not, and this difference in behavior can help when deciding between the two types.

Another litmus test: can the operation be understood as asking a question (possibly with side effects)? Operations like pop on a vector can be viewed as asking for the contents of the last element, with the side effect of removing it if it exists -- with an Option return value.
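A short sketch of the "asking a question" reading, using Vec::pop, which removes from the end of the vector. The helper drain_in_pop_order is hypothetical.

```rust
// Hypothetical helper: keeps asking "is there another element?" until the
// answer is None, collecting the elements in the order pop yields them.
fn drain_in_pop_order(mut v: Vec<i32>) -> Vec<i32> {
    let mut popped = Vec::new();
    while let Some(x) = v.pop() {
        popped.push(x);
    }
    popped
}

fn main() {
    let mut stack = vec![1, 2];
    assert_eq!(stack.pop(), Some(2));
    assert_eq!(stack.pop(), Some(1));
    // An empty vector is a successful answer, not an obstruction.
    assert_eq!(stack.pop(), None);

    assert_eq!(drain_in_pop_order(vec![1, 2, 3]), vec![3, 2, 1]);
}
```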

Do not provide both Result and fail! variants.

An API should not provide both Result-producing and failing versions of an operation. It should provide just the Result version, allowing clients to use try! or unwrap instead as needed. This is part of the general pattern of cutting down on redundant variants by instead using method chaining.
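As a sketch of this pattern, the example below exposes a single Result-producing function and lets two clients choose their own policy. It uses current Rust, where the `?` operator plays the role of try!; parse_percent and both clients are hypothetical.

```rust
use std::num::ParseIntError;

// Hypothetical library function: only a Result-producing version is provided.
fn parse_percent(s: &str) -> Result<u8, ParseIntError> {
    s.trim_end_matches('%').parse::<u8>()
}

// Client that propagates the obstruction to its own caller.
fn average_percent(a: &str, b: &str) -> Result<u8, ParseIntError> {
    let x = parse_percent(a)?;
    let y = parse_percent(b)?;
    Ok(((x as u16 + y as u16) / 2) as u8)
}

// Client that treats bad input as a bug in its own code and fails via unwrap.
fn default_percent() -> u8 {
    parse_percent("50%").unwrap()
}

fn main() {
    assert_eq!(average_percent("40%", "60%"), Ok(50));
    assert!(average_percent("40%", "lots").is_err());
    assert_eq!(default_percent(), 50);
}
```

The library does not need a failing parse_percent variant: the client that wants failure gets it with one chained call.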

There is one exception to this rule. Some APIs are strongly oriented around failure, in the sense that their functions/methods are explicitly intended as assertions. If such an API offers no other way to check in advance whether invoking an operation foo is valid, it may provide a foo_catch variant that returns a Result.
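One way to picture the exception is with RefCell in current Rust, which offers borrow_mut (failing on an overlapping borrow) alongside try_borrow_mut (returning a Result). Treating that pair as an instance of the foo / foo_catch pattern is an interpretation made for this sketch, and the helper try_increment is hypothetical.

```rust
use std::cell::RefCell;

// Hypothetical helper: increments the counter only if no borrow is
// outstanding, using the Result-returning variant instead of borrow_mut.
fn try_increment(counter: &RefCell<u32>) -> Result<u32, String> {
    match counter.try_borrow_mut() {
        Ok(mut value) => {
            *value += 1;
            Ok(*value)
        }
        Err(_) => Err("counter is already borrowed".to_string()),
    }
}

fn main() {
    let counter = RefCell::new(0);
    assert_eq!(try_increment(&counter), Ok(1));

    // With a shared borrow outstanding, borrow_mut would fail; the
    // Result-returning variant lets the caller find that out instead.
    let guard = counter.borrow();
    assert!(try_increment(&counter).is_err());
    drop(guard);

    assert_eq!(try_increment(&counter), Ok(2));
}
```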

The main examples in libstd that currently provide both variants are:

(Note: it is unclear whether these APIs will continue to provide both variants.)

Drawbacks

The main drawbacks of this proposal are:

The alternatives mentioned below do not suffer from these problems, but have drawbacks of their own.

Alternatives

Two alternative designs have been given in earlier RFCs, both of which take a much harder line on using fail! (or, put differently, do not allow most functions to have contracts).

As was pointed out by @SiegeLord, however, mixing what might be seen as contract violations with obstructions can make it much more difficult to write obstruction-robust code; see the linked comment for more detail.

Naming

There are numerous possible suffixes for a Result-producing variant: