RFC 1977: public-private-dependencies

lang | tools (cargo | dependencies)

Summary

Introduce a public/private distinction to crate dependencies.

Motivation

The crates ecosystem has greatly expanded since Rust 1.0. With that, a few patterns for dependencies have evolved that challenge the currently existing dependency declaration system in Cargo and Rust. The most common problem is that a crate A depends on another crate B but some of the types from crate B are exposed through the API in crate A. This causes problems in practice if that dependency B is also used by the user's code itself, crate B resolves to different versions for each usage, and the values of types from the two crate B instances need to be used together but don't match. In this case, the user's code will refuse to compile because different versions of those libraries are requested, and the compiler messages are less than clear.

The introduction of an explicit distinction between public and private dependencies can solve some of these issues. This distinction should also let us lift some restrictions and make some code compile that previously was prevented from compiling.

Q: What is a public dependency?
A: A dependency is public if some of the types or traits of that dependency are themselves exported through the public API of the main crate. The most common places where this happens are return values and function parameters. The same applies to trait implementations and many other things. Because "public" can be tricky to determine for a user, this RFC proposes to extend the compiler infrastructure to detect the concept of a "public dependency". This will help the user understand this concept so they can avoid making mistakes in the Cargo.toml.

Effectively, the idea is that if you bump a public dependency's version, it's a breaking change of your own crate.
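As a rough illustration of why a leaked type matters, the sketch below stands in for two resolved versions of the same dependency with two modules, dep_v1 and dep_v2. These names, and everything else in the block, are hypothetical. The two Origin types are structurally identical, yet the compiler treats them as unrelated, which is the situation a user hits when a crate exposes one version in its API and the user's own code uses the other.

```rust
use std::any::TypeId;

// Stand-ins for two resolved versions of the same dependency (hypothetical).
mod dep_v1 {
    #[derive(Debug, PartialEq)]
    pub struct Origin(pub String);
}
mod dep_v2 {
    #[derive(Debug, PartialEq)]
    pub struct Origin(pub String);
}

// Stand-in for a library crate: `dep_v1::Origin` appears in a public
// signature, so `dep_v1` is a public dependency in the RFC's sense.
mod crate_a {
    use super::dep_v1::Origin;

    pub fn describe(origin: &Origin) -> String {
        format!("origin: {}", origin.0)
    }
}

/// Returns true when both "versions" of `Origin` are the same type.
pub fn origins_are_same_type() -> bool {
    TypeId::of::<dep_v1::Origin>() == TypeId::of::<dep_v2::Origin>()
}

fn main() {
    let from_v1 = dep_v1::Origin("https://example.com".to_string());
    println!("{}", crate_a::describe(&from_v1));

    // The user's code resolved the dependency to the other version.
    let from_v2 = dep_v2::Origin("https://example.com".to_string());
    // crate_a::describe(&from_v2); // would not compile: mismatched types
    println!("same type? {} ({:?})", origins_are_same_type(), from_v2);
}
```

Had crate_a only used Origin internally, the duplication would be harmless, which is the private case described next.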

Q: What is a private dependency?
A: On the other hand, a private dependency is contained within your crate and effectively invisible for users of your crate. As a result, private dependencies can be freely duplicated in the dependency graph and won't cause compilation errors. This distinction will also make it possible to relax some restrictions that currently exist in Cargo which sometimes prevent crates from compiling.

Q: Can public become private later?
A: Public dependencies are public within a reachable subgraph but can become private if a crate stops exposing a public dependency. For instance, it is very possible to have a family of crates that all depend on a utility crate that provides common types which is a public dependency for all of them. However, if your own crate ends up being a user of this utility crate but none of its types or traits become part of your own API, then this utility crate dependency is marked private.

Q: Where is public / private defined?
A: Dependencies are private by default and are made public through a public flag on the dependency in the Cargo.toml file. This also means that crates created before the implementation of this RFC will have all their dependencies private.

Q: How is backwards compatibility handled?
A: It will continue to be permissible to "leak" dependencies (and there are even some use cases of this). However, the compiler or Cargo will emit warnings if private dependencies are part of the public API. Later, it might even become invalid to publish new crates without explicitly silencing these warnings or marking the dependencies as public.

Q: Can I export a type from a private dependency as my own?
A: For now, it will not be strictly permissible to privately depend on a crate and export a type from it as your own. The reason is that at the moment it is not possible to force this type to be distinct. This means that users of the crate might accidentally start relying on that type being compatible if they begin to depend on the crate that actually implements it. The limitations from the previous answer apply (i.e., you can currently overrule the restrictions).

Q: How do semver and dependencies interact?
A: It is already the case that changing your own dependencies would require a semver bump for your own library because your API contract to the outside world changes. This RFC, however, makes it possible to only have this requirement for public dependencies and would permit Cargo to prevent new crate releases with semver violations.
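A minimal sketch of that rule, under the assumption that "incompatible" simply means a different major version (this ignores 0.x versions and pre-releases). The DepChange type and forces_breaking_release function are hypothetical names for illustration and are not part of Cargo.

```rust
/// A change to one dependency's resolved requirement (hypothetical type).
struct DepChange {
    /// Whether the dependency is marked `public = true` in Cargo.toml.
    public: bool,
    old_major: u64,
    new_major: u64,
}

/// Under the RFC's rule, only an incompatible bump of a *public*
/// dependency forces a breaking release of the depending crate.
/// Assumption: "incompatible" means the major version changed.
fn forces_breaking_release(change: &DepChange) -> bool {
    change.public && change.old_major != change.new_major
}

fn main() {
    let public_bump = DepChange { public: true, old_major: 1, new_major: 2 };
    let private_bump = DepChange { public: false, old_major: 1, new_major: 2 };

    println!("public dep 1 -> 2 breaking?  {}", forces_breaking_release(&public_bump));
    println!("private dep 1 -> 2 breaking? {}", forces_breaking_release(&private_bump));
}
```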

Detailed design

There are a few areas that need to be changed for this RFC:

Compiler Changes

The main change to the compiler will be to accept a new parameter that Cargo supplies which is a list of public dependencies. The flag will be called --extern-public. The compiler then emits warnings if it encounters private dependencies leaking to the public API of a crate. cargo publish might change this warning into an error in its lint step.

Additionally, later on, the warning can turn into a hard error in general.

In some situations, it can be necessary to allow private dependencies to become part of the public API. In that case one can permit this with #[allow(external_private_dependency)]. This is particularly useful when paired with #[doc(hidden)] and other already existing hacks.

This most likely will also be necessary for the more complex relationship of libcore and libstd in Rust itself.

Changes to Cargo.toml

The Cargo.toml file will be amended to support the new public parameter on dependencies. Old Cargo versions will emit a warning when this key is encountered but otherwise continue. Since the default is for a dependency to be private, only public ones will need to be tagged, and those should be the minority.

This will start as an unstable Cargo feature available on nightly only that authors will need to opt into via a feature specified in Cargo.toml before Cargo will start using the public attribute to change the way versions are resolved. The Cargo unstable feature will turn on a corresponding rustc unstable feature for the compiler changes noted above.

Example dependency:

[dependencies]
url = { version = "1.4.0", public = true }

Changes to the Cargo Index

The Cargo index used by Cargo when resolving versions will contain the public attribute on dependencies as specified in Cargo.toml. For example, an index line for a crate named example that publicly depends on the url crate would look like (JSON prettified for legibility):

{
    "name":"example",
    "vers":"0.1.0",
    "deps":[
        {
            "name":"url",
            "req":"^1.4.0",
            "public":true,
            "features":[],
            "optional":false,
            "default_features":true,
            "target":null,
            "kind":"normal"
        }
    ]
}

Changes to Cargo Version Resolution

Cargo will specifically reject graphs that contain two different versions of the same crate being publicly depended upon and reachable from each other. This surfaces the conflict at version resolution time, rather than as the strange compile-time errors that are possible today.
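One possible reading of that rule can be sketched as follows. This is not Cargo's resolver; it is a small illustration under the assumption that a package "sees" each of its direct dependencies plus anything those dependencies re-expose through public edges, and that a graph is rejected when some package sees two versions of the same crate. The crate names user, a, and b and all type and function names are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

/// A resolved package: (crate name, version).
type Pkg = (&'static str, &'static str);

/// One dependency edge, tagged with the `public` flag from Cargo.toml.
struct Dep {
    to: Pkg,
    public: bool,
}

type Graph = HashMap<Pkg, Vec<Dep>>;

/// Packages whose types `pkg` can see: every direct dependency, plus
/// whatever those dependencies expose transitively through public edges.
fn visible_from(graph: &Graph, pkg: Pkg) -> BTreeSet<Pkg> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<Pkg> = graph
        .get(&pkg)
        .map(|deps| deps.iter().map(|d| d.to).collect::<Vec<Pkg>>())
        .unwrap_or_default();
    while let Some(current) = stack.pop() {
        if !seen.insert(current) {
            continue; // already visited
        }
        if let Some(deps) = graph.get(&current) {
            // Only public edges propagate visibility further.
            stack.extend(deps.iter().filter(|d| d.public).map(|d| d.to));
        }
    }
    seen
}

/// Crate names that `pkg` would see in more than one version.
fn conflicts(graph: &Graph, pkg: Pkg) -> Vec<&'static str> {
    let mut versions: BTreeMap<&'static str, BTreeSet<&'static str>> = BTreeMap::new();
    for (name, version) in visible_from(graph, pkg) {
        versions.entry(name).or_default().insert(version);
    }
    versions
        .into_iter()
        .filter(|(_, seen)| seen.len() > 1)
        .map(|(name, _)| name)
        .collect()
}

/// `user` depends on `a` and on `b` 1.0.0; `a` depends on `b` 2.0.0.
fn example_graph(a_exposes_b: bool) -> Graph {
    let mut graph = Graph::new();
    graph.insert(
        ("user", "0.1.0"),
        vec![
            Dep { to: ("a", "1.0.0"), public: false },
            Dep { to: ("b", "1.0.0"), public: false },
        ],
    );
    graph.insert(
        ("a", "1.0.0"),
        vec![Dep { to: ("b", "2.0.0"), public: a_exposes_b }],
    );
    graph
}

fn main() {
    let root = ("user", "0.1.0");
    // `a` exposes `b` publicly: `user` sees two versions of `b`.
    println!("public:  {:?}", conflicts(&example_graph(true), root));
    // `a` keeps `b` private: the duplication is tolerated.
    println!("private: {:?}", conflicts(&example_graph(false), root));
}
```

A real resolver would use such a check to backtrack and try other versions rather than only report the conflict.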

How this will work:

Changes to Cargo Publish: Warnings

When a new crate version is published, Cargo will warn about types and traits that the compiler determined to be public but did not come from a public dependency. For now, it should be possible to publish anyway, but at some point in the future it will be necessary to explicitly mark all public dependencies as such or explicitly mark them with #[allow(external_private_dependency)].

Changes to Cargo Publish: Lowest Version Resolution

A very common situation today is that people write the initial version of a dependency in their Cargo.toml, but never bother to update it as they take advantage of new features in newer versions. This works out okay because (1) Cargo will generally use the largest version it can find, compatible with constraints, and (2) upper bounds on constraints (at least within a particular minor version) are relatively rare. That means, in particular, that Cargo.toml is not a fully accurate picture of version dependency information; in general it's a lower bound at best. There can be "invisible" dependencies that don't cause resolution failures but can create compilation errors as APIs evolve.

Public dependencies exacerbate the above problem, because you can end up relying on features of a "new API" from a crate you didn't even know you depended on! For example:

This example and others like it rely on a common ingredient: a crate somewhere using an API that only is available in a newer version of a crate than the version listed in Cargo.toml.

To attempt to surface this problem earlier, cargo publish will attempt to resolve the graph while picking the smallest versions compatible with constraints. If the crate fails to build with this resolution graph, the publish will fail.
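The difference between the two resolution strategies can be sketched as below. This is a simplified illustration, not Cargo's resolver: versions are plain (major, minor, patch) tuples, caret matching is reduced to "same major version and not older than the requirement" (which only holds for major versions of 1 or more), and the list of available versions is made up.

```rust
/// A (major, minor, patch) triple; tuple ordering matches version ordering.
type Version = (u64, u64, u64);

/// Simplified caret matching, assuming a requirement with major >= 1:
/// same major version, and not older than the requirement.
fn caret_matches(req: Version, candidate: Version) -> bool {
    candidate.0 == req.0 && candidate >= req
}

/// What everyday resolution generally does: the largest compatible version.
fn pick_highest(req: Version, available: &[Version]) -> Option<Version> {
    available.iter().copied().filter(|v| caret_matches(req, *v)).max()
}

/// What the proposed publish-time check does: the smallest compatible version.
fn pick_lowest(req: Version, available: &[Version]) -> Option<Version> {
    available.iter().copied().filter(|v| caret_matches(req, *v)).min()
}

fn main() {
    // Hypothetical set of published versions of some dependency.
    let available = [(1, 0, 0), (1, 4, 0), (1, 5, 1), (2, 0, 0)];
    let req = (1, 0, 0); // corresponds to `dep = "1.0.0"` in Cargo.toml

    println!("everyday resolution: {:?}", pick_highest(req, &available));
    println!("publish-time check:  {:?}", pick_lowest(req, &available));
}
```

If the crate only builds against the version chosen by pick_highest, the publish-time build against the pick_lowest result fails, and raising the requirement in Cargo.toml makes the two agree on a version that works.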

How We Teach This

From the user's perspective, the initial scope of the RFC will be quite transparent, but it will definitely show up for users as a question of what the new restrictions mean. In particular, a common way to leak out types from APIs that most crates do is error handling. Quite frequently it happens that users wrap errors from other libraries in their own types. It might make sense to identify common cases of where type leakage happens and provide hints in the lint about how to deal with it.

Cases that I anticipate that should be explained separately:

The feature will be called public_private_dependencies and it comes with one lint flag called external_private_dependency. For all intents and purposes, this should be the extent of the new terms introduced in the beginning. This RFC, however, lays the groundwork for later providing aliasing so that a private dependency could be forcefully re-exported as the crate's own types. As such, it might make sense to consider how to refer to this.

It is assumed that this feature will eventually become quite popular due to patterns that already exist in the crate ecosystem. It's likely that it will evoke some negative opinions initially. As such, it would be a good idea to make a run with cargobomb/crater to see what the actual impact of the new linter warnings is and how far away we are from making them errors.

Crates.io should be updated to render public and private dependencies separately.

End user experience

Author of a crate with one dependency

Assume today that an author of a library crate onedep has a dependency on the url crate and the url::Origin type is exposed in onedep's public API.

onedep's Cargo.toml:

[package]
name = "onedep"
version = "0.1.0"

[dependencies]
url = "1.0.0"

onedep's src/lib.rs:

extern crate url;
use url::Origin;

use std::collections::HashMap;

#[derive(Debug, Default)]
pub struct OriginTracker {
    origin_counts: HashMap<Origin, usize>,
}

impl OriginTracker {
    pub fn log_origin(&mut self, origin: Origin) {
        let counter = self.origin_counts.entry(origin).or_insert(0);
        *counter += 1;
    }
}

When the author of onedep upgrades Rust/Cargo to a version where this RFC is completely implemented, the author will notice two changes:

  1. When they run cargo build, the build will succeed but they will get a warning that a private dependency (the url crate specifically) is used in their public API (the url::Origin type in the pub fn log_origin function specifically) and that they should consider adding public = true to their Cargo.toml. Ideally the warning would say something like:

        consider changing dependency:
    
        ```
        url = "1.0.0"
        ```
    
        to:
    
        ```
        url = { version = "1.0.0", public = true }
        ```
    

The warning could also encourage the author to then bump their crate's major version since adding public dependencies is a breaking change.

  2. When they run cargo publish, the build check that happens after packaging will fail and the publish will fail. This is because deriving Hash on url::Origin wasn't added until v1.5.1 of the url crate. The author of onedep has been running cargo update periodically, and their Cargo.lock has url 1.5.1, but they never updated Cargo.toml to indicate that they have a new lower bound. Since cargo publish will try to resolve dependencies to the lowest possible versions, it will choose version 1.0.0 of the url crate, which doesn't implement Hash on Origin.

There should be a clear error message for this case that indicates Cargo has resolved crates to their lowest possible versions, that this might be the cause of the compilation failure, and that the author should investigate the versions of their dependencies in Cargo.toml to see if they should be updated. This command should change the Cargo.lock so that running cargo build will reproduce the error for the author to fix.

Author of a crate with multiple dependencies

twodep's Cargo.toml:

[package]
name = "twodep"
version = "0.1.0"

[dependencies]
# this is the version of onedep above using a public dep on url 1.5.1
onedep = "1.0.0"
url = "1.0.0"

twodep's src/main.rs:

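extern crate url;
use url::Url;

extern crate onedep;

use std::io::{self, BufRead};

fn main() {
    let mut origin_tracker = onedep::OriginTracker::default();

    println!("Please enter URLs, one per line (end of input finishes):");
    let stdin = io::stdin();
    for line in stdin.lock().lines() {
        let line = line.unwrap();
        // Parse the input and record its origin in the tracker from onedep.
        let url = Url::parse(line.trim()).unwrap();
        origin_tracker.log_origin(url.origin());
        // other stuff
    }
    // Printing with {:#?} requires OriginTracker to implement Debug.
    println!("Here are all the origins you mentioned: {:#?}", origin_tracker);
}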

Before upgrading Rust/Cargo to a version where this RFC has been implemented, this code might have failed to compile if Cargo had resolved the direct dependency on the url crate to a different version than the one onedep's url dependency resolved to. Or it might have resolved and compiled fine if both had resolved to the same version.

After upgrading Rust/Cargo, if this code had a compilation error, it would now have a version resolution problem that Cargo would either resolve automatically or prompt the user to fix by changing version constraints or running cargo update. If the code was compiling before, that must mean the previous resolution graph was good, so nothing will change on upgrading.

This crate is a binary and doesn't have a public API, so it won't get any warnings about crates not being marked public.

If the author publishes to crates.io after upgrading Rust/Cargo, since onedep's public dependency on url now has a lower bound of 1.5.1, the only valid graphs that Cargo will generate will be with url 1.5.1 or greater, which is also compatible with the url 1.0.0 direct dependency. Publish will work without any errors or further changes.

Drawbacks

I believe that there are no drawbacks if implemented well (this assumes good linters and error messages).

Alternatives

For me, the biggest alternative to this RFC would be a variation of it where type and trait aliasing becomes immediately part of it. This would mean that a crate can have a private dependency and re-export it as its own type, hiding where it came from originally. This would most likely be easier to teach users and can get rid of a few "cul-de-sac" situations users can end up in where their only way out is to introduce a public dependency for now. The assumption is that if trait and type aliasing is available, the external_private_dependency lint would not need to exist.

Unresolved questions

There are a few open questions about how to best hook into the compiler and Cargo infrastructure: