RFC 1422: pub-restricted

lang (paths | privacy)

Summary

Expand the current pub/non-pub categorization of items with the ability to say "make this item visible solely to a (named) module tree."

The current crate is one such tree, and would be expressed via: pub(crate) item. Other trees can be denoted via a path employed in a use statement, e.g. pub(a::b) item, or pub(super) item.

Motivation

Right now, if you have a definition for an item X that you want to use in many places in a module tree, you can either (1.) define X at the root of the tree as a non-pub item, or (2.) define X as a pub item in some submodule (and import it into the root of the module tree via use).

But: Sometimes neither of these options is really what you want.

There are scenarios where developers would like an item to be visible to a particular module subtree (or a whole crate in its entirety), but it is not possible to move the item's (non-pub) definition to the root of that subtree (which would be the usual way to expose an item to a subtree without making it pub).

If the definition of X itself needs access to other private items within a submodule of the tree, then X cannot be put at the root of the module tree. Illustration:

// Intent: `a` exports `I`, `bar`, and `foo`, but nothing else.
pub mod a {
    pub const I: i32 = 3;

    // `semisecret` will be used "many" places within `a`, but
    // is not meant to be exposed outside of `a`.
    fn semisecret(x: i32) -> i32  { use self::b::c::J; x + J }

    pub fn bar(z: i32) -> i32 { semisecret(I) * z }
    pub fn foo(y: i32) -> i32 { semisecret(I) + y }

    mod b {
        mod c {
            const J: i32 = 4; // J is meant to be hidden from the outside world.
        }
    }
}

(Note: the pub mod a is meant to be at the root of some crate.)

The above code fails to compile, due to the privacy violation where the body of fn semisecret attempts to access a::b::c::J, which is not visible in the context of a.

A standard way to deal with this today is to use the second approach described above (labelled "(2.)"): move fn semisecret down into the place where it can access J, marking fn semisecret as pub so that it can still be accessed within the items of a, and then re-exporting semisecret as necessary up the module tree.

// Intent: `a` exports `I`, `bar`, and `foo`, but nothing else.
pub mod a {
    pub const I: i32 = 3;

    // `semisecret` will be used "many" places within `a`, but
    // is not meant to be exposed outside of `a`.
    // (If we put `pub use` here, then *anyone* could access it.)
    use self::b::semisecret;

    pub fn bar(z: i32) -> i32 { semisecret(I) * z }
    pub fn foo(y: i32) -> i32 { semisecret(I) + y }

    mod b {
        pub use self::c::semisecret;
        mod c {
            const J: i32 = 4; // J is meant to be hidden from the outside world.
            pub fn semisecret(x: i32) -> i32  { x + J }
        }
    }
}

This works, but there is a serious issue with it: One cannot easily tell exactly how "public" fn semisecret is. In particular, understanding who can access semisecret requires reasoning about (1.) all of the pub use's (aka re-exports) of semisecret, and (2.) the pub-ness of every module in a path leading to fn semisecret or one of its re-exports.

This RFC seeks to remedy the above problem via two main changes.

  1. Give the user a way to explicitly restrict the intended scope of where a pub-licized item can be used.

  2. Modify the privacy rules so that pub-restricted items can be neither used nor re-exported outside of their respective restricted areas.

Impact

This difficulty in reasoning about the "publicness" of a name is not just a problem for users; it also complicates efforts within the compiler to verify that a surface API for a type does not itself use or expose any private names.

There are a number of bugs filed against privacy checking; some are simply implementation issues, but the comment threads in the issues make it clear that in some cases, different people have very different mental models about how privacy interacts with aliases (e.g. type declarations) and re-exports.

In theory, we can add the changes of this RFC without breaking any old code. (That is, in principle the only affected code is that for item definitions that use pub(restriction). This limited addition would still provide value to users in their reasoning about the visibility of such items.)

In practice, I expect that as part of the implementation of this RFC, we will probably fix pre-existing bugs in the parts of privacy checking that verify that surface APIs do not use or expose private names.

Important: No such fixes to such pre-existing bugs are being concretely proposed by this RFC; I am merely musing that by adding a more expressive privacy system, we will open the door to fix bugs whose exploits, under the old system, were the only way to express certain patterns of interest to developers.

Detailed design

The main problem identified in the motivation section is this:

From a module-internal definition like

pub mod a { [...] mod b { [...] pub fn semisecret(x: i32) -> i32  { x + J } [...] } }

one cannot readily tell exactly how "public" the fn semisecret is meant to be.

As already stated, this RFC seeks to remedy the above problem via two main changes.

  1. Give the user a way to explicitly restrict the intended scope of where a pub-licized item can be used.

  2. Modify the privacy rules so that pub-restricted items can be neither used nor re-exported outside of their respective restricted areas.

Syntax

The new feature is to restrict the scope by adding the module subtree (which acts as the restricted area) in parentheses after the pub keyword, like so:

pub(a::b::c) item;

The path in the restriction is resolved just like a use statement: it is resolved absolutely, from the crate root.

Just like use statements, one can also write relative paths, by starting them with self or a sequence of super's.

pub(super::super) item;
// or
pub(self) item; // (semantically equiv to no `pub`; see below)

In addition to the forms analogous to use, there is one new form:

pub(crate) item;

In other words, the grammar is changed like so:

old:

VISIBILITY ::= <empty> | `pub`

new:

VISIBILITY ::= <empty> | `pub` | `pub` `(` USE_PATH `)` | `pub` `(` `crate` `)`

One can use these pub(restriction) forms anywhere that one can currently use pub. In particular, one can use them on item definitions, methods in an impl, the fields of a struct definition, and on pub use re-exports.
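The sketch below is one hedged illustration of those four positions: an item, a field, an inherent method, and a re-export. It uses the spelling accepted by stable Rust compilers today, where a general path is written `pub(in path)` rather than the bare `pub(path)` proposed in this RFC; `pub(crate)`, `pub(super)` and `pub(self)` are spelled as above. The names `outer`, `inner`, `Counter`, `counted`, and `describe` are hypothetical and exist only for this example.

```rust
mod outer {
    // A re-export restricted to the current crate.
    pub(crate) use self::inner::Counter;

    pub mod inner {
        // An item restricted to the current crate.
        pub(crate) struct Counter {
            // A field restricted to the parent module, `outer`.
            pub(super) hits: u32,
            // `pub(self)` behaves like writing no `pub` at all.
            pub(self) label: &'static str,
        }

        impl Counter {
            // Inherent methods restricted through explicit relative paths:
            // from `inner`, `super` is `outer` and `super::super` is the
            // crate root.
            pub(in super) fn new(label: &'static str) -> Counter {
                Counter { hits: 0, label }
            }

            pub(in super::super) fn label(&self) -> &'static str {
                self.label
            }
        }
    }

    // `hits` and `Counter::new` are usable here because `outer` is their
    // restricted area.
    pub(crate) fn counted(label: &'static str) -> (Counter, u32) {
        let mut c = Counter::new(label);
        c.hits += 1;
        let hits = c.hits;
        (c, hits)
    }
}

// At the crate root only the crate-wide names are usable; touching
// `counter.hits` or calling `Counter::new` here would be rejected.
fn describe() -> String {
    let (counter, hits) = outer::counted("demo");
    format!("{}:{}", counter.label(), hits)
}
```

Calling `describe()` from anywhere in the crate goes through `counted`, which is the only place outside `inner` that touches the `hits` field.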

Semantics

The meaning of pub(restriction) is as follows: The definition of every item, method, field, or name (e.g. a re-export) is associated with a restriction.

A restriction is either: the universe of all crates (aka "unrestricted"), the current crate, or an absolute path to a module sub-hierarchy in the current crate. A restricted thing cannot be directly "used" in source code outside of its restricted area. (The term "used" here is meant to cover both direct reference in the source, and also implicit reference as the inferred type of an expression or pattern.)

As noted above, the definition means that pub(self) item is the same as if one had written just item.
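A small sketch of that equivalence, with hypothetical names (`m`, `hidden`, `also_hidden`, `child`): both functions are restricted to `m`, and since a restricted area is a module subtree, a child module of `m` can use them too.

```rust
mod m {
    // Explicitly restricted to `m` itself.
    pub(self) fn hidden() -> i32 { 1 }

    // No visibility modifier: also restricted to `m` itself.
    fn also_hidden() -> i32 { 2 }

    pub fn sum() -> i32 { hidden() + also_hidden() }

    pub mod child {
        // The restricted area covers the whole subtree rooted at `m`,
        // so `child` can reach both functions.
        pub fn peek() -> i32 { super::hidden() * 10 + super::also_hidden() }
    }
}
```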

NOTE: even if the restriction of an item or name indicates that it is accessible in some context, it may still be impossible to reference it. In particular, we will still keep our existing rules regarding pub items defined in non-pub modules; such items would have no restriction, but still may be inaccessible if they are not re-exported in some manner.
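To make the NOTE concrete, here is a minimal sketch (the names `api`, `detail`, `helper`, `internal_only`, and `from_root` are made up for illustration). Both functions in `detail` are declared `pub` with no restriction, yet only the one that is re-exported can be referenced from outside `api`.

```rust
pub mod api {
    mod detail {
        // Unrestricted as far as its own declaration goes ...
        pub fn helper() -> i32 { 41 }

        // ... and so is this one, but nothing re-exports it.
        pub fn internal_only() -> i32 { 1 }
    }

    // `helper` becomes reachable from outside `api` only through this
    // re-export.
    pub use self::detail::helper;

    pub fn answer() -> i32 { helper() + detail::internal_only() }
}

fn from_root() -> i32 {
    // `api::detail::internal_only()` would be rejected here, because the
    // module `detail` is private to `api`.
    api::helper() + api::answer()
}
```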

Revised Example

In the running example, one could instead write:

// Intent: `a` exports `I`, `bar`, and `foo`, but nothing else.
pub mod a {
    pub const I: i32 = 3;

    // `semisecret` will be used "many" places within `a`, but
    // is not meant to be exposed outside of `a`.
    // (`pub use` would be *rejected*; see Note 1 below)
    use self::b::semisecret;

    pub fn bar(z: i32) -> i32 { semisecret(I) * z }
    pub fn foo(y: i32) -> i32 { semisecret(I) + y }

    mod b {
        pub(a) use self::c::semisecret;
        mod c {
            const J: i32 = 4; // J is meant to be hidden from the outside world.

            // `pub(a)` means "usable within hierarchy of `mod a`, but not
            // elsewhere."
            pub(a) fn semisecret(x: i32) -> i32  { x + J }
        }
    }
}

Note 1: The compiler would reject the variation of the above written as:

pub mod a { [...] pub use self::b::semisecret; [...] }

because pub(a) fn semisecret says that it cannot be used outside of a, and therefore it would be incorrect (or at least useless) to re-export semisecret outside of a.

Note 2: The most direct interpretation of the rules here leads me to conclude that b's re-export of semisecret needs to be restricted to a as well. However, it may be possible to loosen things so that the re-export could just stay as pub with no extra restriction; see discussion of "IRS:PUNPM" in Unresolved Questions.
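For readers who want to compile the revised example, the following is a sketch that assumes a current stable compiler, which spells a path restriction as `pub(in path)`. Relative `super` paths stand in for the `pub(a)` written above: from `c`, `super::super` is `a`, and from `b`, `super` is `a`. Nothing else about the example is changed.

```rust
pub mod a {
    pub const I: i32 = 3;

    // A plain (non-`pub`) import: `semisecret` stays inside `a`.
    use self::b::semisecret;

    pub fn bar(z: i32) -> i32 { semisecret(I) * z }
    pub fn foo(y: i32) -> i32 { semisecret(I) + y }

    mod b {
        // Re-export restricted to `a`, the parent of `b`.
        pub(super) use self::c::semisecret;

        mod c {
            const J: i32 = 4; // J is meant to be hidden from the outside world.

            // Usable within the hierarchy of `mod a`, but not elsewhere.
            pub(in super::super) fn semisecret(x: i32) -> i32 { x + J }
        }
    }
}
```

With `I = 3` and `J = 4`, `semisecret(I)` is 7, so `a::bar(2)` evaluates to 14 and `a::foo(2)` to 9.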

This richer notion of privacy does offer us some other ways to re-write the running example; instead of defining fn semisecret within c so that it can access J, we might instead expose J to mod b and then put fn semisecret there, like so:

pub mod a {
    [...]
    mod b {
        use self::c::J;
        pub(a) fn semisecret(x: i32) -> i32  { x + J }
        mod c {
            pub(b) const J: i32 = 4;
        }
    }
}

(This RFC takes no position on which of the above two structures is "better"; a toy example like this does not provide enough context to judge.)

Restrictions

Let's discuss what the restrictions actually mean.

Some basic definitions: An item is just as it is declared in the Rust reference manual: a component of a crate, located at a fixed path (potentially at the "outermost" anonymous module) within the module tree of the crate.

Every item can be thought of as having some hidden implementation component(s) along with an exposed surface API.

So, for example, in pub fn foo(x: Input) -> Output { Body }, the surface of foo includes Input and Output, while the Body is hidden.

The pre-existing privacy rules (both prior to and after this RFC) try to enforce two things: (1.) when an item references a path, all of the names on that path need to be visible (in terms of privacy) in the referencing context and, (2.) private items should not be exposed in the surface of public APIs.

This RFC is expanding the scope of (2.) above, so that the rules are now:

  1. when an item references a path (in its implementation or in its signature), all of the names on that path must be visible in the referencing context.

  2. items restricted to an area R should not be exposed in the surface API of names or items that can themselves be exported beyond R. (Privacy is now a special case of this more general notion.)

    For convenience, it is legal to declare a field (or inherent method) with a strictly larger area of restriction than its self. See discussion in the examples.

In principle, validating (1.) can be done via the pre-existing privacy code. (However, it may make sense to do it by mapping each name to its associated restriction; I don't think that will change the outcome, but it might make the checking code simpler. But I am not an expert on the current state of the privacy checking code.)

Validating (2.) requires traversing the surface API for each item and comparing the restriction for every reference to the restriction of the item itself.
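One way to picture that comparison is the toy model below. It is only a sketch under stated assumptions, not a description of the compiler's privacy pass: the type `Restriction`, the helper `module`, and the functions `within` and `surface_ok` are all hypothetical, and module paths are modelled as plain lists of segments.

```rust
// Hypothetical model of the three kinds of restriction described in this RFC.
enum Restriction {
    Universe,                  // plain `pub`
    Crate,                     // `pub(crate)`
    Module(Vec<&'static str>), // absolute module path inside the current crate
}

// Helper (an assumption of this sketch): build a `Module` restriction from
// a non-empty path such as "a::b::c".
fn module(path: &'static str) -> Restriction {
    Restriction::Module(path.split("::").collect())
}

impl Restriction {
    // True when the area described by `self` lies entirely inside `other`.
    fn within(&self, other: &Restriction) -> bool {
        match (self, other) {
            (_, Restriction::Universe) => true,
            (Restriction::Universe, _) => false,
            (_, Restriction::Crate) => true,
            (Restriction::Crate, Restriction::Module(_)) => false,
            (Restriction::Module(inner), Restriction::Module(outer)) => {
                // `a::b::c` lies inside `a` because `a` is a prefix of it.
                inner.starts_with(outer)
            }
        }
    }
}

// Rule (2.): an item with restriction `item` may only mention, in its
// surface, things whose own restricted area is at least as large.
fn surface_ok(item: &Restriction, surface: &[Restriction]) -> bool {
    surface.iter().all(|referenced| item.within(referenced))
}
```

Under this model an unrestricted function whose signature mentions a crate-restricted type fails the check, while a crate-restricted function mentioning the same type passes.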

Trait methods

Currently, trait associated item syntax carries no pub modifier.

A question arises when trying to apply the terminology of this RFC: are trait associated items implicitly pub, in the sense that they are unrestricted?

The simple answer is: No, associated items are not implicitly pub; at least, not in general. (They are not in general implicitly pub today either, as discussed in RFC 136.) (If they were implicitly pub, things would be difficult; further discussion in attached appendix.)

However, since this RFC is introducing multiple kinds of pub, we should address the topic of what is the pub-ness of associated items.

More examples!

These examples are meant to explore the syntax a bit. They are not meant to provide motivation for the feature (i.e. I am not claiming that the feature is making this code cleaner or easier to reason about).

Impl item example

pub struct S(i32);

mod a {
    pub fn call_foo(s: &super::S) { s.foo(); }

    mod b {
        fn some_method_private_to_b() {
            println!("inside some_method_private_to_b");
        }

        impl super::super::S {
            pub(a) fn foo(&self) {
                some_method_private_to_b();
                println!("only callable within `a`: {}", self.0);
            }
        }
    }
}

fn rejected(s: &S) {
    s.foo(); //~ ERROR: `S::foo` not visible outside of module `a`
}

(You may be wondering: "Could we move that impl S out to the top-level, out of mod a?" Well ... see discussion in the unresolved questions.)

Restricting fields example

mod a {
    #[derive(Default)]
    struct Priv(i32);

    pub mod b {
        use a::Priv as Priv_a;

        #[derive(Default)]
        pub struct F {
            pub    x: i32,
                   y: Priv_a,
            pub(a) z: Priv_a,
        }

        #[derive(Default)]
        pub struct G(pub i32, Priv_a, pub(a) Priv_a);

        // ... accesses to F.{x,y,z} ...
        // ... accesses to G.{0,1,2} ...
    }
    // ... accesses to F.{x,z} ...
    // ... accesses to G.{0,2} ...
}

mod k {
    use a::b::{F, G};
    // ... accesses to F and F.x ...
    // ... accesses to G and G.0 ...
}
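The accesses that the comments above only gesture at could look like the following sketch. It covers the struct `F` only, writes the restriction as `pub(super)` (which from `b` names `a`), and adds hypothetical helpers `make`, `sum_all`, `sum_visible`, and `demo` that are not part of the original example.

```rust
mod a {
    struct Priv(i32);

    pub mod b {
        use super::Priv;

        pub struct F {
            pub x: i32,
            y: Priv,            // private to `b`
            pub(super) z: Priv, // restricted to `a`
        }

        pub fn make(x: i32, y: i32, z: i32) -> F {
            F { x, y: Priv(y), z: Priv(z) }
        }

        // Inside `b`, all three fields are accessible.
        pub fn sum_all(f: &F) -> i32 { f.x + f.y.0 + f.z.0 }
    }

    // Inside `a` but outside `b`, only `x` and `z` are accessible;
    // `f.y` would be rejected here.
    pub fn sum_visible(f: &b::F) -> i32 { f.x + f.z.0 }
}

// Outside `a`, only `x` is accessible; `f.z` would be rejected here.
fn demo() -> (i32, i32, i32) {
    let f = a::b::make(1, 2, 3);
    (a::b::sum_all(&f), a::sum_visible(&f), f.x)
}
```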

Fields and inherent methods more public than self

In Rust today, one can write

mod a { struct X { pub y: i32, } }

This RFC was crafted to say that fields and inherent methods can have an associated restriction that is larger than the restriction of its self. This was both to keep from breaking the above code, and also because it would be annoying to be forced to write:

mod a { struct X { pub(a) y: i32, } }

(This RFC is not an attempt to resolve things like Rust Issue 30079; the decision of how to handle that issue can be dealt with orthogonally, in my opinion.)

So, under this RFC, the following is legal:

mod a {
    pub use self::b::stuff_with_x;
    mod b {
        struct X { pub y: i32, pub(a) z: i32 }
        mod c {
            impl super::X {
                pub(c) fn only_in_c(&mut self) { self.y += 1; }

                pub fn callanywhere(&mut self) {
                    self.only_in_c();
                    println!("X.y is now: {}", self.y);
                }
            }
        }
        pub fn stuff_with_x() {
            let mut x = X { y: 10, z: 20};
            x.callanywhere();
        }
    }
}

In particular:

Re-exports

Here is an example of a pub use re-export using the new feature, including both correct and invalid uses of the extended form.

mod a {
    mod b {
        pub(a) struct X { pub y: i32, pub(a) z: i32 } // restricted to `mod a` tree
        mod c {
            pub mod d {
                pub(super) use a::b::X as P; // ok: a::b::c is submodule of `a`
            }

            fn swap_ok(x: d::P) -> d::P { // ok: `P` accessible here
                X { z: x.y, y: x.z }
            }
        }

        fn swap_bad(x: c::d::P) -> c::d::P { //~ ERROR: `c::d::P` not visible outside `a::b::c`
            X { z: x.y, y: x.z }
        }

        mod bad {
            pub use super::X; //~ ERROR: `X` cannot be reexported outside of `a`
        }
    }

    fn swap_ok2(x: X) -> X { // ok: `X` accessible from `mod a`.
        X { z: x.y, y: x.z }
    }
}
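As a hedged companion to the example above, this sketch keeps only the accepted uses and spells every path out so that it compiles with a current stable compiler. Relative `pub(super)` restrictions replace `pub(a)`, and the helpers `swap` and `demo` are hypothetical additions that let the crate root exercise the code.

```rust
mod a {
    mod b {
        // Restricted to the `a` subtree (`super` of `b` is `a`).
        pub(super) struct X { pub y: i32, pub(super) z: i32 }

        mod c {
            pub mod d {
                // Re-export restricted to `a::b::c` (`super` of `d` is `c`).
                pub(super) use super::super::X as P;
            }

            // ok: `P` is nameable here, inside `a::b::c`.
            pub(super) fn swap_ok(x: d::P) -> d::P {
                d::P { z: x.y, y: x.z }
            }
        }

        // Inside `b` the alias `c::d::P` is out of reach, but `X` itself
        // is still visible.
        pub(super) fn swap(x: X) -> X { c::swap_ok(x) }
    }

    // ok: `X` is accessible from `mod a`, via its path `b::X`.
    pub fn demo() -> (i32, i32) {
        let x = b::swap(b::X { y: 1, z: 2 });
        (x.y, x.z)
    }
}
```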

Crate restricted visibility

This is a concrete illustration of how one might use the pub(crate) item form (which is perhaps quite similar to Java's default "package visibility").

Crate c1:

pub mod a {
    struct Priv(i32);

    pub(crate) struct R { pub y: i32, z: Priv } // ok: field allowed to be more public
    pub        struct S { pub y: i32, z: Priv }

    pub fn to_r_bad(s: S) -> R { ... } //~ ERROR: `R` restricted solely to this crate

    pub(crate) fn to_r(s: S) -> R { R { y: s.y, z: s.z } } // ok: restricted to crate
}

use a::{R, S}; // ok: `a::R` and `a::S` are both visible

pub use a::R as ReexportAttempt; //~ ERROR: `a::R` restricted solely to this crate

Crate c2:

extern crate c1;

use c1::a::S; // ok: `S` is unrestricted

use c1::a::R; //~ ERROR: `c1::a::R` not visible outside of its crate
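The cross-crate errors cannot be shown in a single file, but the accepted half of crate c1 can. The sketch below is a single-crate approximation; the constructor `S::new`, the accessor `R::z`, and the function `demo` are hypothetical additions so that the values can be built and inspected.

```rust
pub mod a {
    struct Priv(i32);

    // Visible throughout this crate, but not to other crates.
    pub(crate) struct R { pub y: i32, z: Priv }
    pub struct S { pub y: i32, z: Priv }

    impl S {
        pub fn new(y: i32, z: i32) -> S { S { y, z: Priv(z) } }
    }

    impl R {
        pub(crate) fn z(&self) -> i32 { self.z.0 }
    }

    // ok: the function is no more visible than `R`, which its signature
    // mentions.
    pub(crate) fn to_r(s: S) -> R { R { y: s.y, z: s.z } }
}

// Anywhere inside the crate, both `a::R` and `a::S` can be named.
fn demo() -> (i32, i32) {
    let r: a::R = a::to_r(a::S::new(1, 2));
    (r.y, r.z())
}
```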

Precedent

When I started on this I was not sure if this form of delimited access to a particular module subtree had a precedent; the closest thing I could think of was C++ friend modifiers (but friend is far more ad-hoc and free-form than what is being proposed here).

Scala

It has since been pointed out to me that Scala has scoped access modifiers protected[Y] and private[Y], which specify that access is provided up to Y (where Y can be a package, class or singleton object).

The feature proposed by this RFC appears to be similar in intent to Scala's scoped access modifiers.

Having said that, I will admit that I am not clear on what distinction, if any, Scala draws between protected[Y] and private[Y] when Y is a package, which is the main analogy for our purposes, or if they just allow both forms as synonyms for convenience.

(I can imagine a hypothetical distinction in Scala when Y is a class, but my skimming online has not provided insight as to what the actual distinction is.)

Even if there is some distinction drawn between the two forms in Scala, I suspect Rust does not need an analogous distinction in its pub(restricted) form.

Drawbacks

Obviously, pub(restriction) item complicates the surface syntax of the language.

Developers may misuse this form and make it hard to access the tasty innards of other modules.

Alternatives

Do not extend the language!

In addition, these two alternatives do not address the main point being made in the motivation section: one cannot tell exactly how "public" a pub item is, without working backwards through the module tree for all of its re-exports.

Curb your ambitions!

Be more ambitious!

This feature could be extended in various ways.

For example:

Unresolved questions

Can definition site fall outside restriction?

For example, is it illegal to do the following:

mod a {
  mod child { }
  mod b { pub(super::child) const J: i32 = 3; }
}

Or does it just mean that J, despite being defined in mod b, is itself not accessible in mod b?

pnkfelix is personally inclined to make this sort of thing illegal, mainly because he finds it totally unintuitive, but is interested in hearing counter-arguments.

Implicit Restriction Satisfaction (IRS:PUNPM)

If a re-export occurs within a non-pub module, can we treat it as implicitly satisfying a restriction to super imposed by the item it is re-exporting?

In particular, the revised example included:

// Intent: `a` exports `I`, `bar`, and `foo`, but nothing else.
pub mod a {
    [...]
    mod b {
        pub(a) use self::c::semisecret;
        mod c { pub(a) fn semisecret(x: i32) -> i32  { x + J } }
    }
}

However, since b is non-pub, its pub items and re-exports are solely accessible via the subhierarchy of its module parent (i.e., mod a), as long as no entity attempts to re-export them to a broader scope.

In other words, in some sense mod b { pub use item; } could implicitly satisfy a restriction to super imposed by item (if we chose to allow it).

Note: If it were pub mod b or pub(restrict) mod b, then the above reasoning would not hold. Therefore, this discussion is limited to re-exports from non-pub modules.

If we do not allow such implicit restriction satisfaction for pub use re-exports from non-pub modules (IRS:PUNPM), then:

pub mod a {
    [...]
    mod b {
        pub use self::c::semisecret;
        mod c { pub(a) fn semisecret(x: i32) -> i32  { x + J } }
    }
}

would be rejected, and one would be expected to write either:

        pub(super) use self::c::semisecret;

or

        pub(a) use self::c::semisecret;

(Side note: I am not saying that under IRS:PUNPM, the two forms pub use item and pub(super) use item would be considered synonymous, even in the context of a non-pub module like mod b. In particular, pub(super) use item may be imposing a new restriction on the re-exported name that was not part of its original definition.)

Interaction with Globs

Glob re-exports currently only re-export pub (as in pub(universe)) items.

What should glob re-exports do with respect to pub(restricted)?

Here is an illustrating example pointed out by petrochenkov in the comment thread:

mod m {
    /*priv*/ pub(m) struct S1;
    pub(super) struct S2;
    pub(foo::bar) struct S3;
    pub struct S4;

    mod n {

        // What is reexported here?
        // Just `S4`?
        // Anything in `m` visible
        //  to `n` (which is not consistent with the current treatment of
        //  `pub` by globs).

        pub use m::*;
    }
}

// What is reexported here?
pub use m::*;
pub(baz::qux) use m::*;

This remains an unresolved question, but my personal inclination, at least for the initial implementation, is to make globs only import purely pub items; no non-pub, and no pub(restricted).

After we get more experience with pub(restricted) (and perhaps make other changes that may come in future RFCs), we will be in a better position to evaluate what to do here.

Appendices

Associated Items Digression

If associated items were implicitly pub, in the sense that they are unrestricted, that would conflict with the rules imposed by this RFC. The surface API of a non-pub trait is composed of its associated items, so under that reading this code would be rejected:

mod a {
    struct S(String);
    trait Trait {
        fn mk_s(&self) -> S; // is this implicitly `pub` and unrestricted?
    }
    impl Trait for () { fn mk_s(&self) -> S { S(format!("():()")) } }
    impl Trait for i32 { fn mk_s(&self) -> S { S(format!("{}:i32", self)) } }
    pub fn foo(x:i32) -> String { format!("silly{}{}", ().mk_s().0, x.mk_s().0) }
}

If associated items were implicitly pub and unrestricted, then the above code would be rejected under direct interpretation of the rules of this RFC (because fn mk_s is implicitly unrestricted, but the surface of fn mk_s references S, a non-pub item). This would be backwards-incompatible (and just darn inconvenient too).

So, to be clear, this RFC is not suggesting that associated items be implicitly pub and unrestricted.