19 February 2022

We need to talk about RFCs

I think the Rust RFC process needs serious reform. In this blog post, I'll explain why I think that, by covering some of the problems with the current process.

Before I get all negative, I think RFCs are amazing! They've been a crucial part of Rust's open design and community-focussed governance. In any reform we should make sure we preserve the benefits and don't throw out too much. In any case, we can't expect to fix everything - there are many trade-offs and we might choose to leave something unfixed since fixing it would make the whole process worse in some other way.

I also don't think we should rush into reforming the process. In particular, there is ongoing work looking at Rust's governance and leadership, and I think looking at the RFC process in parallel with that would be too much. Therefore, I'm not going to look at any possible solutions here, nor do I want to start RFC reform now. However, I think it is worth highlighting that there are problems with the process and identifying what those problems are, since that might reasonably influence the governance discussion.

What are RFCs for?

The "RFC" (request for comments) process is intended to provide a consistent and controlled path for new features to enter the language and standard libraries, so that all stakeholders can be confident about the direction the language is evolving in.

From the repo README.

Rust's RFC process is useful for doing design in the open, including advance planning, up-front design, and incremental/evolutionary design. The RFC process also serves as the primary channel for communicating changes to the community, as well as getting feedback from the community on new features and their design.

The RFC process is not the only tool for these tasks. Design takes place on GitHub (in the Rust repo and elsewhere), on Zulip, and in synchronous meetings. Communication happens via official and unofficial blogs, documentation, and forums like internals.rust-lang.org. Feedback is also gathered via these forums, from the annual survey, and from various ad hoc conversations. Decision making can happen in GitHub issues (via FCP or simple review), or in the major change proposal (MCP) process. However, since the RFC process is the primary official space for decision making, it has a special place in the wider toolbox for all these tasks.

In practice, the RFC process has become the default tool for solving problems which need some level of design discussion, community feedback, or decision making.

In contrast to the MCP process, the RFC process applies to the whole Rust project, rather than just some teams. It is more widely known, and is the more senior process (proposals which are too 'big' for MCPs are upgraded to RFCs).

RFCs were first used for language changes, and there are more language RFCs than RFCs for any other area of the project. RFCs are now used for technical changes of all kinds, process and governance changes, and planning for roadmaps, editions, etc. However, the technical heritage (in particular for language changes) is still apparent, and applying the RFC process to the non-technical areas feels like something of a hack.

Audience

RFCs exist initially as an evolving proposal (pull request) with accompanying discussion in the form of GitHub comments and reviews. Once accepted, an RFC becomes a mostly-immutable design artefact.

Anyone with a GitHub account (and who hasn't been banned from the project, which is a tiny number of people) can post an RFC or join in an RFC discussion. Anyone can read an accepted RFC or an RFC discussion (either in-progress or after the RFC is accepted or closed).

There are several different groups of people involved in an RFC:

The authors (often just one person, often part of the responsible team) write the RFC, update it to take discussion into account, and engage with the discussion.
The responsible team (e.g., the language team; maybe more than one team for RFCs which cut across domains) is responsible for a yes/no decision on the RFC and for requesting any blocking changes. The team must consider the high-level view (do we want this new feature at all?) and the details (is this how the feature should be implemented?).
Interested community members may be potential users of a feature, or just interested in Rust's development. They offer feedback, engage in discussion, or silently observe.
Implementers (often part of the responsible team or mentored by someone on that team) use accepted RFCs and their discussion to guide implementation.
Rust users or others interested in the language use accepted RFCs as documentation or specification.

Effort and noise

One major issue with RFCs is that they feel like a lot of effort for a lot of people. They are a lot of effort to write (arguably this is a good thing since it forces authors to pay attention to up-front design, but even so, writing an RFC feels like more of a burden than it should), they are a lot of effort for teams to manage, they are a lot of effort to read in detail, and it takes a lot of effort to keep up with discussion.

The only action which isn't a lot of effort is leaving a comment (especially compared to thoroughly reading the RFC, spending time to understand it and its motivations, and leaving a high-quality comment), so comments are sometimes low value.

Even with just the high-value comments (and one of the great things about the RFC process is that we do get a lot of high-value comments), the sheer volume of comments can be overwhelming. Not just because there is a lot to read, but because they are in chronological order and unorganised. This issue compounds with time, since once discussion gets long, it becomes hard work to read the whole discussion (made worse by GitHub's UI) and so duplication increases and the discussion gets noisier and harder to follow.

The result of these pressures is that the RFC process is very slow. It takes a long time to write an RFC, the discussion takes a long time, and decision making takes a long time.

The burden on maintainers is very high. Having to triage, discuss, and make decisions on RFCs takes time and effort from maintainers, whose time and effort are usually the bottleneck in any development on the Rust project.

Structure

One issue is that there is one rigid template (though departures from the template are not aggressively policed) for all RFCs, independent of the domain or the magnitude of the change. A consequence of this is that sections which are not appropriate for an RFC are often just fluff, and RFCs overall feel a bit heavy on boilerplate compared to less formal communication.

RFCs focus on describing changes in terms of documentation, which may be sub-optimal for changes which are simple from a user's perspective but require a complex implementation. It can also make RFCs read badly since large parts are written for an audience (users) who are not in fact the audience reading the RFC (designers, implementers, decision makers, users with a different level of sophistication than users who will read the eventual documentation).

There is some ambiguity as to whether RFCs should describe the end result of the change or the difference compared to today's Rust. Both are important, but for different purposes. The emphasis on documentation favours describing the end result, but implementers and decision makers often want to understand the change being proposed.

The structure of the process favours a static view of designs, but features often evolve during implementation and stabilisation. There is no explicit process for updating RFCs, and tracking the status of RFCs is done outside the RFC repo (with no updates back to the RFC repo). Updates to RFCs are ad hoc - sometimes RFCs are updated, sometimes a new RFC is created, sometimes no update happens. It is impossible to tell from reading an accepted RFC whether it was implemented or stabilised, and if it was, whether the implementation matched the RFC or changed, or whether the RFC was superseded by another.

Another structural issue with the RFC process is that there is no backpressure mechanism. The rate of creation of new RFCs is not controlled and is unpredictable. Furthermore, there is no explicit prioritisation of RFCs (RFCs are implicitly prioritised by teams, but that prioritisation is not visible and may not even be discussed explicitly by the teams, emerging instead as a consequence of individuals' priorities).

Incentives

Many of the issues with RFCs are not directly due to the process, but due to the incentives it creates. Mostly it is the above factors of high effort which cause these incentives, often in concert with Rust's consensus-building model of decision making, and general risk aversion. (To be clear, I don't think consensus-based decision making or risk aversion are bad per se, but they do create some bad incentive effects).

There is an incentive to make RFCs as small as possible since the smaller the RFC is, the easier it is to get accepted. This could be good in that it favours breaking up design work into small chunks and iterative development. However, in practice it can lead to proposals spread across multiple RFCs which are difficult to follow, discussion repeated across RFCs, and unfinished features where the initial work is done but follow-up work is not tracked and falls off the radar (or similarly, 80/20 solutions which address the main use cases, but in aggregate leave the language feeling rough around the edges and full of ergonomic or expressivity cliffs).

On the other hand, there is also an incentive to be over-rigorous and do too much up-front design rather than do iterative development. This is in part because the RFC process assumes RFCs stand alone and there is poor support for tracking development. It is also due in part to a ratchet effect in discussions where it is much easier to ask for more work rather than less from RFC authors (e.g., more technical rigour rather than experimentation, more up-front design rather than iterative development, more explanation or documentation rather than brevity, etc.).

There is an incentive to over-use RFCs where they are sub-optimal. Ironically, there are alternative mechanisms (MCPs, shifting decision making to stabilisation, etc.) for areas like language and library design where RFCs are most appropriate. However, for areas such as governance which RFCs were not designed for, RFCs are often the only option. Additionally, since RFCs are the backstop solution for decision making where decision making is difficult, there is an incentive to always just ask for an RFC, even where a more lightweight process is more appropriate.

Since RFCs are strongly tied to teams, there is a tendency to over-focus on a single domain of solutions and not to consider a global perspective for the project. RFCs owned by multiple teams are even more effort than sole-team RFCs, so there is an incentive to avoid cross-domain solutions. There is likewise a disincentive to consider the effect of an RFC outside the domain of the owning team.

There are strong incentives against closing RFCs, both because of the culture of the Rust project (a desire to be welcoming and 'nice' to contributors, a desire to welcome design from anyone irrespective of de jure status, etc.), and due to the high effort of writing an RFC. This manifests as reluctance to close RFCs on initial triage with little or no discussion, and a reluctance to close RFCs once discussion has stalled or reached a natural conclusion (where there is the additional sunk cost of the effort expended in discussion). Furthermore, it is difficult to make any progress (positive or negative) on older, stalled RFC discussions. As a data point, there are 54 currently open RFCs from 2020 or older (i.e., more than a year old).

There is an incentive to postpone joining the discussion. This often manifests as key decision makers not commenting on RFCs until FCP. The discussions therefore often follow a pattern of an initial period of discussion which is high-volume, but low-value, followed by a long period of stalled discussion (which should indicate stability of the RFC), then (possibly with an FCP announcement) another period of discussion with substantial change requests. This can be frustrating for RFC authors who (reasonably) expect that the initial round of discussion is the important one for iterating on their proposal.

Finally, there is an incentive for collaborative design to happen outside the RFC process. The RFC process is intended to foster collaborative design; in practice, however, productive collaboration occurs in working groups (official or ad hoc), on internals.rust-lang.org, or in ecosystem crates. When an RFC is actually submitted as a PR, there is an expectation that most design work is done and the discussion mostly focuses on polishing details (or from a more cynical perspective, bike-shedding). That means that community input does not happen in a truly open way and relies on knowing the 'right people' or at least the right venues, which are not widely promoted.

Stabilisation issues

Stabilisation is a key part of the process for changes in Rust. Once a change is stabilised, it is fully accepted into Rust and (thanks to our strong backwards compatibility promise) will be around forever. Since designs can (and often do) change during implementation and as we get experience using the change in real life, the feature being stabilised is often very different from the one accepted as an RFC. Therefore stabilisation decisions are both important and complex.

Furthermore, some teams have moved to a decision making model where changes are accepted without an RFC or where there are significant open questions in the RFC, relying on the stabilisation process to ensure quality.

However, stabilisation tends to happen as a team vote in an issue in the main Rust repo (not the RFC repo), so it is much less visible and does not actively invite community participation. In fact, it is far more likely to only involve the team responsible and not get attention from other teams or users. In other words, stabilisation is a key part of the RFC process in a moral sense, but happens entirely outside the RFC process in a technical sense.

Summary

This is a long post, so to summarise:

RFCs are the principal vehicle for technical and non-technical decision making, discussion, and communication in the Rust project.
They are pretty amazing as long-term artefacts and for getting community input into design.
However, there are many issues due to RFCs' one-size-fits-all nature (despite multiple uses, domains, and audiences), a high-effort process, the lack of process for follow-up, the structure of RFC documents, and cultural factors around the process (openness, technical rigour, consensus-based decision making, each of which is positive in isolation).
There are also many 'second order' issues due to the incentives caused by those factors.
The stabilisation process also has issues: primarily that it is an important part of the overall process for change, but is handled more casually than the RFC part of that overall process.