RFC 3028: Cargo: binary dependencies

tools (cargo)

Summary

Allow Cargo packages to depend on bin, cdylib, and staticlib crates, and use the artifacts built by those crates.

Motivation

There are many different possible use cases.

Guide-level explanation

Cargo allows you to depend on binary or C ABI artifacts of another package; this is known as a "binary dependency" or "artifact dependency". For example, you can depend on the cmake binary in your build.rs like this:

[build-dependencies]
cmake = { version = "1.0", artifact = "bin" }

Cargo will build the cmake binary, then make it available to your build.rs through an environment variable:

// build.rs
use std::{env, process::Command};

fn main() {
    let cmake_path = env::var_os("CARGO_BIN_FILE_CMAKE_cmake").expect("cmake binary");
    // Bind the Command first: `arg` returns `&mut Command`, which cannot outlive a temporary.
    let mut cmake = Command::new(cmake_path);
    cmake.arg("--version");
    assert!(cmake.status().expect("cmake --version failed").success());
}

If you need to depend on multiple types of artifacts from a crate, such as both a binary and a cdylib, you can supply an array of strings for artifact: artifact = ["bin", "cdylib"].

You can optionally depend on specific binary artifacts from a crate using bin:name:

[build-dependencies]
somedep = { version = "1.0", artifact = ["bin:somebinary", "bin:anotherbinary"] }

If no binaries are specified, all the binaries in the package will be built and made available.

You can obtain the directory containing all binaries built by the cmake crate from the CARGO_BIN_DIR_CMAKE environment variable, for example to add it to $PATH before invoking another build system or a script.
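As a minimal sketch of the $PATH use case, a build.rs could prepend the artifact directory to the existing search path before spawning another tool. The helper name prepend_to_path is hypothetical and not part of this proposal; it uses only the standard library:

```rust
use std::env;
use std::ffi::{OsStr, OsString};
use std::path::PathBuf;

/// Hypothetical helper: returns a PATH-style value with `dir` placed
/// ahead of the entries already present in `current` (if any).
fn prepend_to_path(
    dir: &OsStr,
    current: Option<&OsStr>,
) -> Result<OsString, env::JoinPathsError> {
    let mut entries = vec![PathBuf::from(dir)];
    if let Some(current) = current {
        // split_paths handles the platform-specific separator (`:` or `;`).
        entries.extend(env::split_paths(current));
    }
    env::join_paths(entries)
}
```

In a build.rs, the first argument would come from env::var_os("CARGO_BIN_DIR_CMAKE") and the second from env::var_os("PATH"); the result can then be passed to Command::env("PATH", ...) for the child process.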

Cargo also allows depending on cdylib or staticlib artifacts. For example, you can embed a dynamic library in your binary:

[dependencies]
mypreload = { version = "1.2.3", artifact = "cdylib" }

// main.rs
const MY_PRELOAD_LIB: &[u8] = include_bytes!(env!("CARGO_CDYLIB_FILE_MYPRELOAD"));

Note that cargo only supplies these dependencies when building your crate. If your program or library requires artifacts at runtime, you will still need to handle that yourself by some other means. Runtime requirements for installed crates are out of scope for this change.
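One way a program might handle the runtime side itself is to embed the library at build time (as with include_bytes! above) and write it out to disk when it runs. The following is only a sketch of that idea; extract_embedded and its parameters are hypothetical, and the choice of directory and file name is left to the program:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: writes embedded library bytes to `dir/file_name`
/// and returns the path of the file that was written.
fn extract_embedded(bytes: &[u8], dir: &Path, file_name: &str) -> io::Result<PathBuf> {
    fs::create_dir_all(dir)?;
    let path = dir.join(file_name);
    fs::write(&path, bytes)?;
    Ok(path)
}
```

A program embedding a cdylib could call this with its include_bytes! constant at startup and then hand the returned path to whatever mechanism loads the library.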

By default, a dependency with artifact specified will serve only as an artifact dependency, and will not serve as a normal Rust dependency, even if the dependency normally supplies a Rust library. If you need to depend on artifacts from a crate, and also express a normal Rust dependency on the same crate, you can add lib = true to the dependency; for instance: cratename = { version = "1.2.3", lib = true, artifact = "bin" }. (This applies to Rust lib, rlib, or proc-macro crates, all of which use the same lib = true option.)

Reference-level explanation

There are three valid values for artifact:

  1. "bin", a compiled binary, corresponding to a [[bin]] section in the dependency's manifest.
  2. "cdylib", a C-compatible dynamic library, corresponding to a [lib] section with crate-type = "cdylib" in the dependency's manifest.
  3. "staticlib", a C-compatible static library, corresponding to a [lib] section with crate-type = "staticlib" in the dependency's manifest.
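To make the accepted values concrete, here is a sketch of how a single artifact string, including the bin:name form from the guide-level explanation, could be classified. The ArtifactSpec type and parse_artifact function are illustrative names only and do not describe Cargo's actual implementation:

```rust
/// Illustrative model of one entry of the `artifact` field.
#[derive(Debug, PartialEq, Eq)]
enum ArtifactSpec {
    /// "bin": all binaries in the package.
    AllBins,
    /// "bin:name": one specific binary.
    Bin(String),
    /// "cdylib": a C-compatible dynamic library.
    Cdylib,
    /// "staticlib": a C-compatible static library.
    Staticlib,
}

/// Classifies one artifact string, rejecting anything else.
fn parse_artifact(value: &str) -> Result<ArtifactSpec, String> {
    match value {
        "bin" => Ok(ArtifactSpec::AllBins),
        "cdylib" => Ok(ArtifactSpec::Cdylib),
        "staticlib" => Ok(ArtifactSpec::Staticlib),
        other => match other.strip_prefix("bin:") {
            Some(name) if !name.is_empty() => Ok(ArtifactSpec::Bin(name.to_string())),
            _ => Err(format!("unsupported artifact value: {other:?}")),
        },
    }
}
```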

"lib" corresponds to all crates that can be depended on currently, including lib, rlib, and proc-macro libraries. See linkage for more information.

Artifact dependencies can appear in any of the three dependency sections, [dependencies], [build-dependencies], and [dev-dependencies] (or in target-specific versions of these sections).

By default, build-dependencies are built for the host, while dependencies and dev-dependencies are built for the target. You can specify the target attribute to build for a specific target, such as target = "wasm32-wasi"; a literal target = "target" will build for the target even if specifying a build dependency. (If the target is not available, this will result in an error at build time, just as if building the specified crate with a --target option for an unavailable target.)
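The defaulting rules in the preceding paragraph can be summarized in a small sketch. DepKind and resolve_target are hypothetical names used for illustration; the function simply encodes the rules as stated, not Cargo's internal logic:

```rust
/// The dependency section an artifact dependency appears in.
enum DepKind {
    Normal,
    Dev,
    Build,
}

/// Picks the triple an artifact dependency is built for, given the optional
/// `target` attribute and the host and target triples of the current build.
fn resolve_target(kind: DepKind, target_attr: Option<&str>, host: &str, target: &str) -> String {
    match target_attr {
        // A literal "target" always means the build's target triple.
        Some("target") => target.to_string(),
        // Any other value names a specific triple.
        Some(explicit) => explicit.to_string(),
        // No attribute: build-dependencies use the host, others the target.
        None => match kind {
            DepKind::Build => host.to_string(),
            DepKind::Normal | DepKind::Dev => target.to_string(),
        },
    }
}
```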

Cargo provides the following environment variables to the crate being built:

For each kind of dependency, these variables are supplied to the same part of the build process that has access to that kind of dependency:

(See the "Future possibilities" section for a note about the use of env!.)
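The variable names used in the examples in this document (CARGO_BIN_FILE_CMAKE_cmake, CARGO_BIN_DIR_CMAKE, CARGO_CDYLIB_FILE_MYPRELOAD) follow a visible pattern, which the sketch below reproduces. The helper names are hypothetical, and the handling of hyphens in dependency names (mapped to underscores here) is an assumption rather than something stated in the text above:

```rust
/// Upper-cases a dependency name for use in a variable name.
/// Assumption: `-` becomes `_`, since `-` is awkward in variable names.
fn normalize_dep(dep: &str) -> String {
    dep.to_uppercase().replace('-', "_")
}

/// Name of the variable holding the directory of artifacts of one type.
fn artifact_dir_var(artifact_type: &str, dep: &str) -> String {
    format!("CARGO_{}_DIR_{}", artifact_type.to_uppercase(), normalize_dep(dep))
}

/// Name of the variable holding the path to one artifact. The artifact
/// name, when present, is appended without changing its case.
fn artifact_file_var(artifact_type: &str, dep: &str, name: Option<&str>) -> String {
    let base = format!("CARGO_{}_FILE_{}", artifact_type.to_uppercase(), normalize_dep(dep));
    match name {
        Some(name) => format!("{base}_{name}"),
        None => base,
    }
}
```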

Similar to features, if other crates in your dependencies also depend on the same binary crate, and request different binaries, Cargo will build the union of all binaries requested.
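The union behavior described above amounts to a set union over every request for the same crate. A minimal sketch, with binaries_to_build as a hypothetical name:

```rust
use std::collections::BTreeSet;

/// Each inner slice lists the binaries one dependent crate requested
/// from the same binary crate; the result is the set that gets built.
fn binaries_to_build<'a>(requests: &[&[&'a str]]) -> BTreeSet<&'a str> {
    requests
        .iter()
        .flat_map(|request| request.iter().copied())
        .collect()
}
```

A binary requested by several crates appears once in the result, just as a feature enabled by several dependents is enabled once.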

Cargo will unify features and versions across all kinds of dependencies, including artifact dependencies, just as it does for multiple dependencies on the same crate throughout a dependency tree.

artifact may be a string, or a list of strings; in the latter case, this specifies a dependency on the crate with each of those artifact types, and is equivalent to specifying multiple dependencies with different artifact values. For instance, you may specify a build dependency on both a binary and a cdylib from the same crate. You may also specify separate dependencies with different artifact values, as well as dependencies on the same crate without artifact specified; for instance, you may have a build dependency on the binary of a crate and a normal dependency on the Rust library of the same crate.

Cargo does not take the specified artifact values into account when resolving a crate's version; it will resolve the version as normal, and then produce an error if that version does not support all the specified artifact values. Similarly, Cargo will produce an error if that version does not build all the binary artifacts required by "bin:name" values. Removing a crate type or an artifact is a semver-incompatible change. (Any further semver requirements on the interface provided by a binary or library depend on the nature of the binary or library in question.)
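The check performed after version resolution could look roughly like the following sketch. ResolvedCrate and check_artifacts are hypothetical; the real resolver data structures are not described in this RFC:

```rust
/// Illustrative view of what an already-resolved crate version provides.
struct ResolvedCrate<'a> {
    /// Artifact types the crate can produce, e.g. ["bin", "cdylib"].
    crate_types: &'a [&'a str],
    /// Names of the binaries the crate defines.
    binaries: &'a [&'a str],
}

/// Runs after the version has been chosen: errors if any requested
/// artifact value is not supported by that version.
fn check_artifacts(krate: &ResolvedCrate<'_>, requested: &[&str]) -> Result<(), String> {
    for &value in requested {
        match value.strip_prefix("bin:") {
            Some(name) => {
                if !krate.binaries.contains(&name) {
                    return Err(format!("no binary named {name:?}"));
                }
            }
            None => {
                if !krate.crate_types.contains(&value) {
                    return Err(format!("crate does not provide artifact type {value:?}"));
                }
            }
        }
    }
    Ok(())
}
```

Note that the requested values play no part in choosing the version; they are only validated against whatever version was selected.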

As with other kinds of dependencies, you can specify profile settings used to build artifact dependencies using overrides. If not overridden, artifact dependencies in build-dependencies compiled for the host will build using the build-override settings, and all other artifact dependencies will inherit the same profile settings being used to build the crate depending on them.

Until this feature is stabilized, it will require specifying the nightly-only option -Z bindeps to cargo. If cargo encounters an artifact dependency and does not have this option specified, it will emit an error and immediately stop building.

The placement of artifact directories is an implementation detail of Cargo, and subject to change. The proposed implementation will place the artifact directory for each crate in target/<TARGET>/artifact/<CRATE_NAME>-<METADATA_HASH>/<ARTIFACT_TYPE>, where <TARGET> is the target triple the artifact dependency is built for (which may be the target triple of the host), <CRATE_NAME> is the name of the crate, <METADATA_HASH> is the usual hash that Cargo appends to crate-related file and directory names to ensure that changing properties (such as features) that affect the build of the crate will build into different paths, and <ARTIFACT_TYPE> is the artifact type (bin, cdylib, or staticlib).
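For illustration only, the proposed layout can be expressed as a path-building function. The function name and the example hash value are hypothetical, and since the layout is an implementation detail, crates should rely on the environment variables rather than on code like this:

```rust
use std::path::{Path, PathBuf};

/// Builds target/<TARGET>/artifact/<CRATE_NAME>-<METADATA_HASH>/<ARTIFACT_TYPE>
/// relative to the given target root directory.
fn artifact_dir(
    target_root: &Path,
    triple: &str,
    crate_name: &str,
    metadata_hash: &str,
    artifact_type: &str,
) -> PathBuf {
    target_root
        .join(triple)
        .join("artifact")
        .join(format!("{crate_name}-{metadata_hash}"))
        .join(artifact_type)
}
```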

If Cargo needs to build a crate for multiple targets, and that crate has an artifact dependency with target = "target", Cargo will build the artifact dependency for each target and supply it to the corresponding build of the depending crate.

Drawbacks

Some of the motivating use cases have alternative solutions, such as extracting a library from a tool written in Rust, and making the tool a thin wrapper around the library. Making this change may potentially reduce the motivation to extract such libraries. However, many of the other use cases do not currently have any solutions available (other than using an alternative build system, per the alternatives section), and extracted libraries have additional value even after this feature becomes available, so we don't see this as a reason to avoid introducing this feature.

Adding this feature will make Cargo usable for many more use cases, which may motivate people to use Cargo in more places and stretch it even further; this may, in turn, generate more support requests and more feature requests.

Rationale and alternatives

This RFC teaches Cargo to understand artifact dependencies. As an alternative, people writing crates with artifact dependencies could invoke cargo from build.rs, or could wrap the entire build in a separate build system that invokes Cargo multiple times. This would have many drawbacks, including:

This RFC proposes supplying both the root directory and the path to each specific artifact. The path to specific artifacts is useful for accessing that specific artifact, and avoids needing target-specific knowledge about the names of executables (.exe) or libraries (lib*.so, *.dll, ...). The root directory is useful for $PATH, $LD_LIBRARY_PATH, and similar. Going from one to the other requires making assumptions. We believe there's value in supplying both.

We could support a target = "host" value to build for the host even in [dependencies] or [dev-dependencies], which would normally default to building for the target. If any use case arises for such a dependency, we can easily add that.

We could make information about artifact dependencies in [dependencies] available to the build.rs script, which would allow running arbitrary Rust code to work with such dependencies at build time (rather than being limited to env!, proc macros, and constant evaluation). However, we can achieve the same effect with an entry in [build-dependencies] that has target = "target", and that model seems simpler to explain and to work with.

We could install all binaries into a common binary directory with a well-known path under $OUT_DIR, and expect crates to use that directory, rather than passing in paths via environment variables. npm takes an approach like this. However, this would not allow dependencies on multiple distinct binaries with the same name, either provided by different crates or provided by the same crate built for different targets. Hardcoded paths would also reduce the flexibility of Cargo to change these paths in the future, such as to accommodate new features or extensions.

This RFC does not preclude future support in Cargo for more "native" handling of cdylib/staticlib dependencies, if Cargo can provide a reasonable default; such a dependency could use a different syntax (e.g. somedep = { version = "...", link = ["cdylib-name"] }).

In place of lib = true, we could rename artifact and have a "lib" or similar value for that field. This would provide simpler syntax (with a single list of dependency types), but could potentially conflate different dependency types (since a "lib" dependency type would express a normal dependency on a Rust library, while "bin" would express an artifact dependency).

Instead of artifact = ["bin:binary-name", "bin:another-binary"] to specify dependencies on specific binaries, we could use a separate field bins = ["binary-name", "another-binary"]. This seems unnecessarily verbose, and separates the indication of an artifact dependency from the list of binaries.

As another alternative to specify dependencies on specific binaries, we could use table-based structures, such as: artifact = [{bin = ["binary-name", "another-binary"]}, "cdylib"]. This would avoid parsing values like bin:binary-name, but it seems excessively complex and excessively nested. Other variations on this theme seem similarly complex. The proposed syntax feels like the right balance.

Prior art

Unresolved questions

How easily can Cargo handle a dependency with a different target specified? How will that interact with dependency resolution? Cargo already has to handle dependencies for both host and target (for cross-compilation), so those cases should already work.

Future possibilities

Currently, there's no mechanism to obtain an environment variable's value at compile time if that value is not valid UTF-8. In the future, we may want macros like env_os! or env_path!, which return a &'static OsStr or &'static Path respectively, rather than a &'static str. This is already an issue for existing environment variables supplied to the build that contain file paths.

In some cases, a crate may want to depend on a binary without unifying features or dependency versions with that binary. A future extension to this mechanism could allow cargo to build a binary crate in isolation, without attempting to do any unification.

Just as a -sys crate can supply additional artifacts other than the built binary, this mechanism could potentially expand in the future to allow building artifacts other than the built binary, such as C-compatible include files, various types of interface definition or protocol definition files, or arbitrary data files.

If a dependency has a specific target (other than the host or target), and the target is not available, cargo can only emit an error at build time that tells the user to install the target. Some projects may wish to use rustup's support for rust-toolchain TOML files to specify targets they or their dependencies require. However, in the future, Cargo could have more native support for targets, either by downloading precompiled targets as rustup does, or by building support for those targets using build-std or equivalent. Integrating such support into Cargo would improve support for cross-compiled artifact dependencies.