RFC 2963: Rustdoc: json output

tools (rustdoc)

Summary

This RFC describes the design of a JSON output for the tool rustdoc, to allow tools to lean on its data collection and refinement but provide a different front-end.

Motivation

The current HTML output of rustdoc is often lauded as a key selling point of Rust. It's a ubiquitous tool that you can use to easily find nearly anything you need to know about a crate. However, despite its versatility, its output format has some drawbacks:

In addition, rustdoc had JSON output in the past, but it failed to keep up with the changing language and was taken out in 2016. With rustdoc in a more stable position, it's possible to re-introduce this feature and ensure its stability. This was brought up in 2018 with a positive response and there are several recent discussions indicating that it would be a useful feature.

In the draft RFC from 2018 there was some discussion of utilizing save-analysis to provide this information, but with RLS being replaced by rust-analyzer it's possible that the feature will eventually be removed from the compiler. In addition, save-analysis output is just as unstable as the current HTML output of rustdoc, so a separate format is preferable.

Guide-level explanation

(Upon successful implementation/stabilization, this documentation should live in The Rustdoc Book.)

In addition to generating the regular HTML, rustdoc can create a JSON file based on your crate. This file can be used by other tools to take information about your crate and convert it into other output formats, insert it into centralized documentation systems, create language bindings, etc.

To get this output, pass the --output-format json flag to rustdoc:

$ rustdoc lib.rs --output-format json

This will output a JSON file in the current directory (by default). For example, say you have the following crate:

//! Here are some crate-level docs!

/// Here are some docs for `some_fn`!
pub fn some_fn() {}

/// Here are some docs for `SomeStruct`!
pub struct SomeStruct;

After running the above command, you should get a lib.json file like the following:

{
  "root": "0:0",
  "version": null,
  "includes_private": false,
  "index": {
    "0:3": {
      "crate_id": 0,
      "name": "some_fn",
      "source": {
        "filename": "lib.rs",
        "begin": [4, 0],
        "end": [4, 19]
      },
      "visibility": "public",
      "docs": "Here are some docs for `some_fn`!",
      "attrs": [],
      "kind": "function",
      "inner": {
        "decl": {
          "inputs": [],
          "output": null,
          "c_variadic": false
        },
        "generics": {...},
        "header": "",
        "abi": "\"Rust\""
      }
    },
    "0:4": {
      "crate_id": 0,
      "name": "SomeStruct",
      "source": {
        "filename": "lib.rs",
        "begin": [7, 0],
        "end": [7, 22]
      },
      "visibility": "public",
      "docs": "Here are some docs for `SomeStruct`!",
      "attrs": [],
      "kind": "struct",
      "inner": {
        "struct_type": "unit",
        "generics": {...},
        "fields_stripped": false,
        "fields": [],
        "impls": [...]
      }
    },
    "0:0": {
      "crate_id": 0,
      "name": "lib",
      "source": {
        "filename": "lib.rs",
        "begin": [1, 0],
        "end": [7, 22]
      },
      "visibility": "public",
      "docs": "Here are some crate-level docs!",
      "attrs": [],
      "kind": "module",
      "inner": {
        "is_crate": true,
        "items": [
          "0:4",
          "0:3"
        ]
      }
    }
  },
  "paths": {
    "0:3": {
      "crate_id": 0,
      "path": ["lib", "some_fn"],
      "kind": "function"
    },
    "0:4": {
      "crate_id": 0,
      "path": ["lib", "SomeStruct"],
      "kind": "struct"
    },
    ...
  },
  "extern_crates": {
    "9": {
      "name": "backtrace",
      "html_root_url": "https://docs.rs/backtrace/"
    },
    "2": {
      "name": "core",
      "html_root_url": "https://doc.rust-lang.org/nightly/"
    },
    "1": {
      "name": "std",
      "html_root_url": "https://doc.rust-lang.org/nightly/"
    },
    ...
  }
}
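
As a rough illustration of how a tool might consume this file, the sketch below walks from root into index and lists the items of the root module. The Item and Crate structs here are hypothetical, pared-down stand-ins that use only the standard library, and sample_crate builds the example data by hand instead of parsing lib.json; a real consumer would deserialize the file with a JSON library.

```rust
use std::collections::HashMap;

/// Hypothetical, pared-down stand-in for one entry of `index`.
#[derive(Debug, Clone)]
pub struct Item {
    pub name: String,
    pub kind: String,
    /// IDs of child items; only meaningful when `kind == "module"`.
    pub items: Vec<String>,
}

/// Hypothetical stand-in for the top-level JSON object.
#[derive(Debug, Clone)]
pub struct Crate {
    pub root: String,
    pub index: HashMap<String, Item>,
}

/// Returns `(kind, name)` for each item directly inside the root module,
/// in the order given by the module's `items` list. IDs that are missing
/// from `index` are skipped.
pub fn root_items(krate: &Crate) -> Vec<(String, String)> {
    match krate.index.get(&krate.root) {
        Some(root) => root
            .items
            .iter()
            .filter_map(|id| krate.index.get(id))
            .map(|item| (item.kind.clone(), item.name.clone()))
            .collect(),
        None => Vec::new(),
    }
}

/// Hand-built copy of the relevant parts of the `lib.json` example.
pub fn sample_crate() -> Crate {
    let leaf = |name: &str, kind: &str| Item {
        name: name.to_string(),
        kind: kind.to_string(),
        items: Vec::new(),
    };
    let mut index = HashMap::new();
    index.insert("0:3".to_string(), leaf("some_fn", "function"));
    index.insert("0:4".to_string(), leaf("SomeStruct", "struct"));
    index.insert(
        "0:0".to_string(),
        Item {
            name: "lib".to_string(),
            kind: "module".to_string(),
            items: vec!["0:4".to_string(), "0:3".to_string()],
        },
    );
    Crate { root: "0:0".to_string(), index }
}
```

Calling root_items(&sample_crate()) yields the struct first and the function second, matching the order of the items list in the example.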

Reference-level explanation

(Upon successful implementation/stabilization, this documentation should live in The Rustdoc Book and/or an external crate's Rustdoc.)

(Given that the JSON output will be implemented as a set of Rust types with serde serialization, the most useful docs for them would be the 40 or so types themselves. By writing docs on those types the Rustdoc page for that module would become a good reference. It may be helpful to provide some sort of schema for use with other languages.)

When you request JSON output from rustdoc, you're getting a version of the Rust abstract syntax tree (AST), so you could see anything that you could export from a valid Rust crate. The following types can appear in the output:

ID

To provide various maps/references to items, the JSON output uses unique strings as IDs for each item. They happen to be the compiler internal DefId for that item, but in the JSON blob they should be treated as opaque, since they aren't guaranteed to be stable across compiler invocations. IDs are only valid/consistent within a single JSON blob. They cannot be used to resolve references between the JSON output of different crates (see the Resolving IDs section).

Crate

A Crate is the root of the outputted JSON blob. It contains all doc-relevant information about the local crate, as well as some information about external items that are referred to locally.

Name | Type | Description
name | String | The name of the crate. If --crate-name is not given, the filename is used.
version | String | (Optional) The version string given to --crate-version, if any.
includes_private | bool | Whether or not the output includes private items.
root | ID | The ID of the root module Item.
index | Map<ID, Item> | A collection of all Items in the crate (see Resolving IDs).
paths | Map<ID, ItemSummary> | Maps all IDs (even external ones, see Resolving IDs) to a brief description including their name, crate of origin, and kind.
extern_crates | Map<int, ExternalCrate> | A map of "crate numbers" to metadata about that crate.
format_version | int | The version of the structure of this blob. The structure described by this RFC will be version 1, and it will be changed if incompatible changes are ever made.
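
Because format_version changes only on incompatible changes, a consumer can reject blobs it was not written for before reading anything else. The following is a minimal sketch; the constant and function names are hypothetical and not part of rustdoc.

```rust
/// The format version this hypothetical consumer was written against.
pub const SUPPORTED_FORMAT_VERSION: u32 = 1;

/// Accepts only blobs whose `format_version` matches exactly, since a
/// different number signals an incompatible structure.
pub fn check_format_version(found: u32) -> Result<(), String> {
    if found == SUPPORTED_FORMAT_VERSION {
        Ok(())
    } else {
        Err(format!(
            "unsupported format_version {} (expected {})",
            found, SUPPORTED_FORMAT_VERSION
        ))
    }
}
```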

Resolving IDs

The crate's index contains mostly local items, which includes impls of external traits on local types or local traits on external types. The exception to this is that external trait definitions and their associated items are also included in the index because this information is useful when generating the comprehensive list of methods for a type.

This means that many IDs aren't included in the index (any reference to a struct, macro, etc. from a different crate). In these cases the fallback is to look up the ID in the crate's paths. That gives enough information about the item to create cross references or simply provide a name without copying all of the information about external items into the local crate's JSON output.
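
The lookup order described above (index first, then paths) can be sketched as follows. The struct and enum names are hypothetical stand-ins that carry only the fields needed here, and the maps are assumed to have been loaded from the JSON already.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a full entry in `index`.
pub struct Item {
    pub name: Option<String>,
    pub kind: String,
}

/// Hypothetical stand-in for an entry in `paths`.
pub struct ItemSummary {
    pub crate_id: u32,
    pub path: Vec<String>,
    pub kind: String,
}

/// What an ID resolved to.
pub enum Resolved<'a> {
    /// Found in `index`: full information is available.
    Full(&'a Item),
    /// Only found in `paths`: name, crate of origin, and kind are available.
    Summary(&'a ItemSummary),
    /// Present in neither map.
    Unknown,
}

/// Looks an ID up in `index` and falls back to `paths`.
pub fn resolve<'a>(
    index: &'a HashMap<String, Item>,
    paths: &'a HashMap<String, ItemSummary>,
    id: &str,
) -> Resolved<'a> {
    if let Some(item) = index.get(id) {
        return Resolved::Full(item);
    }
    match paths.get(id) {
        Some(summary) => Resolved::Summary(summary),
        None => Resolved::Unknown,
    }
}

/// A printable name: the item's own name, or the joined path for summaries.
pub fn display_name(resolved: &Resolved<'_>) -> Option<String> {
    match resolved {
        Resolved::Full(item) => item.name.clone(),
        Resolved::Summary(summary) => Some(summary.path.join("::")),
        Resolved::Unknown => None,
    }
}
```

Returning an enum instead of an Option keeps the "summary only" case visible to callers, which matters when generating cross references to other crates.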

ExternalCrate

Name | Type | Description
name | String | The name of the crate.
html_root_url | String | (Optional) The html_root_url for that crate if it specifies one.

ItemSummary

Name | Type | Description
crate_id | int | A number corresponding to the crate this Item is from. Used as a key to the extern_crates map in Crate. A value of zero represents an Item from the local crate, any other number means that this Item is external.
path | [String] | The fully qualified path (e.g. ["std", "io", "lazy", "Lazy"] for std::io::lazy::Lazy) of this Item.
kind | String | What type of Item this is (see Item).

Item

An Item represents anything that can hold documentation - modules, structs, enums, functions, traits, type aliases, and more. The Item data type holds fields that can apply to any of these, and leaves kind-specific details (like function args or enum variants) to the inner field.

Name | Type | Description
crate_id | int | A number corresponding to the crate this Item is from. Used as a key to the extern_crates map in Crate. A value of zero represents an Item from the local crate, any other number means that this Item is external.
name | String | The name of the Item, if present. Some Items, like impl blocks, do not have names.
span | Span | (Optional) The source location of this Item.
visibility | String | "default", "public", or "crate" (see Restricted visibility for the pub(in path) case).
docs | String | The extracted documentation text from the Item.
links | Map<String, ID> | A map of intra-doc link names to the IDs of the items they resolve to. For example if the docs string contained "see [HashMap][std::collections::HashMap] for more details" then links would have "std::collections::HashMap": "<some id>".
attrs | [String] | The unstable stringified attributes (other than doc comments) on the Item (e.g. ["#[inline]", "#[test]"]).
deprecation | Deprecation | (Optional) Information about the Item's deprecation, if present.
kind | String | The kind of Item this is. Determines what fields are in inner.
inner | Object | The type-specific fields describing this Item. Check the kind field to determine what's available.

Restricted visibility

When using --document-private-items, pub(in path) items can appear in the output, in which case the visibility field will be an Object instead of a string. It will contain the single key "restricted" with the following values:

Name | Type | Description
parent | ID | The ID of the module that this item's visibility is restricted to.
path | String | How that module path was referenced in the code (like "super::super", or "crate::foo").
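
Since visibility is either a plain string or a "restricted" object, a consumer will probably want a single type covering both shapes. The enum below is one hypothetical way to model it, and visibility_prefix shows how each case might be turned back into Rust source syntax; neither name comes from rustdoc.

```rust
/// Hypothetical model of the `visibility` field.
#[derive(Debug, Clone, PartialEq)]
pub enum Visibility {
    /// The string "default": no visibility keyword was written.
    Default,
    /// The string "public".
    Public,
    /// The string "crate".
    Crate,
    /// The object form: `{"restricted": {"parent": ..., "path": ...}}`.
    Restricted { parent: String, path: String },
}

/// Renders the visibility as it could be written in source code.
pub fn visibility_prefix(vis: &Visibility) -> String {
    match vis {
        Visibility::Default => String::new(),
        Visibility::Public => "pub".to_string(),
        Visibility::Crate => "pub(crate)".to_string(),
        // `path` is kept exactly as written in the code, e.g. "super::super".
        Visibility::Restricted { path, .. } => format!("pub(in {})", path),
    }
}
```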

kind == "module"

Name | Type | Description
items | [ID] | The list of Items contained within this module. The order of definitions is preserved.

kind == "function"

Name | Type | Description
decl | FnDecl | Information about the function signature, or declaration.
generics | Generics | Information about the function's type parameters and where clauses.
header | String | "const", "async", "unsafe", or a space separated combination of those modifiers.
abi | String | The ABI string on the function. Non-extern functions have a "Rust" ABI, whereas extern functions without an explicit ABI are "C". See the reference for more details.
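
The header field packs several modifiers into one space separated string. A small sketch of splitting it into flags follows; the Header struct and parse_header function are hypothetical helpers, and treating any other word as an error is an assumption of this sketch.

```rust
/// Hypothetical flag set decoded from the `header` string.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Header {
    pub is_const: bool,
    pub is_async: bool,
    pub is_unsafe: bool,
}

/// Splits `header` on whitespace and sets one flag per modifier.
/// An empty string (as in the `some_fn` example) yields all flags false.
pub fn parse_header(header: &str) -> Result<Header, String> {
    let mut out = Header::default();
    for word in header.split_whitespace() {
        match word {
            "const" => out.is_const = true,
            "async" => out.is_async = true,
            "unsafe" => out.is_unsafe = true,
            other => return Err(format!("unknown modifier: {}", other)),
        }
    }
    Ok(out)
}
```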

kind == "struct" || "union"

Name | Type | Description
struct_type | String | Either "plain" for braced structs, "tuple" for tuple structs, or "unit" for unit structs.
generics | Generics | Information about the struct's type parameters and where clauses.
fields_stripped | bool | Whether any fields have been removed from the result, due to being private or hidden.
fields | [ID] | The list of fields in the struct. All of the corresponding Items have kind == "struct_field".
impls | [ID] | All impls (both trait and inherent) for this type. All of the corresponding Items have kind == "impl".

kind == "struct_field"

Name | Type | Description
type | Type | The type of this field.

kind == "enum"

Name | Type | Description
generics | Generics | Information about the enum's type parameters and where clauses.
fields | [ID] | The list of variants in the enum. All of the corresponding Items have kind == "variant".
fields_stripped | bool | Whether any variants have been removed from the result, due to being private or hidden.
impls | [ID] | All impls (both trait and inherent) for this type. All of the corresponding Items have kind == "impl".

kind == "variant"

Has a variant_kind field with 3 possible values and a variant_inner field with more info if necessary:

kind == "trait"

Name | Type | Description
is_auto | bool | Whether this trait is an autotrait like Sync.
is_unsafe | bool | Whether this is an unsafe trait such as GlobalAlloc.
items | [ID] | The list of associated items contained in this trait definition.
generics | Generics | Information about the trait's type parameters and where clauses.
bounds | [GenericBound] | Trait bounds for this trait definition (e.g. trait Foo: Bar<T> + Clone).

kind == "trait_alias"

An unstable feature which allows writing aliases like trait Foo = std::fmt::Debug + Send and then using Foo in bounds rather than writing out the individual traits.

Name | Type | Description
generics | Generics | Any type parameters that the trait alias takes.
bounds | [GenericBound] | The list of traits after the equals.

kind == "method"

Name | Type | Description
decl | FnDecl | Information about the method signature, or declaration.
generics | Generics | Information about the method's type parameters and where clauses.
header | String | "const", "async", "unsafe", or a space separated combination of those modifiers.
has_body | bool | Whether this is just a method signature (in a trait definition) or a method with an actual body.

kind == "assoc_const"

These items only show up in trait definitions. When looking at a trait impl item, the item where the associated constant is defined is a "constant" item.

Name | Type | Description
type | Type | The type of this associated const.
default | String | (Optional) The stringified expression for the default value, if provided.

kind == "assoc_type"

These items only show up in trait definitions. When looking at a trait impl item, the item where the associated type is defined is a "typedef" item.

Name | Type | Description
bounds | [GenericBound] | The bounds for this associated type.
default | Type | (Optional) The default for this type, if provided.

kind == "impl"

Name | Type | Description
is_unsafe | bool | Whether this impl is for an unsafe trait.
generics | Generics | Information about the impl's type parameters and where clauses.
provided_trait_methods | [String] | The list of names for all provided methods in this impl block. This is provided for ease of access if you don't need more information from the items field.
trait | Type | (Optional) The trait being implemented or null if the impl is "inherent", which means impl Struct {} as opposed to impl Trait for Struct {}.
for | Type | The type that the impl block is for.
items | [ID] | The list of associated items contained in this impl block.
negative | bool | Whether this is a negative impl (e.g. !Sized or !Send).
synthetic | bool | Whether this is an impl that's implied by the compiler (for autotraits, e.g. Send or Sync).
blanket_impl | String | (Optional) The name of the generic parameter used for the blanket impl, if this impl was produced by one. For example impl<T, U> Into<U> for T would result in blanket_impl == "T".
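
Documentation front-ends often group impls into inherent, trait, auto-trait, and blanket sections. The sketch below derives such a grouping from the trait, synthetic, and blanket_impl fields. The type names are hypothetical, and the order in which the fields are checked is an assumption of this sketch, not something the format prescribes.

```rust
/// Hypothetical grouping a front-end might use for impl blocks.
#[derive(Debug, PartialEq, Eq)]
pub enum ImplKind {
    Inherent,
    AutoTrait,
    Blanket,
    Trait,
}

/// Hypothetical subset of the fields of an "impl" item.
pub struct ImplInfo {
    /// False when the `trait` field is null.
    pub has_trait: bool,
    pub synthetic: bool,
    pub blanket_impl: Option<String>,
}

/// Classifies an impl by checking `trait`, then `synthetic`, then `blanket_impl`.
pub fn classify(info: &ImplInfo) -> ImplKind {
    if !info.has_trait {
        ImplKind::Inherent
    } else if info.synthetic {
        ImplKind::AutoTrait
    } else if info.blanket_impl.is_some() {
        ImplKind::Blanket
    } else {
        ImplKind::Trait
    }
}
```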

kind == "constant"

Name | Type | Description
type | Type | The type of this constant.
expr | String | The unstable stringified expression of this constant.
value | String | (Optional) The value of the evaluated expression for this constant, which is only computed for numeric types.
is_literal | bool | Whether this constant is a bool, numeric, string, or char literal.

kind == "static"

Name | Type | Description
type | Type | The type of this static.
expr | String | The unstable stringified expression that this static is assigned to.
mutable | bool | Whether this static is mutable.

kind == "typedef"

Name | Type | Description
type | Type | The type on the right hand side of this definition.
generics | Generics | Any generic parameters on the left hand side of this definition.

kind == "opaque_ty"

Represents trait aliases of the form:

type Foo<T> = Clone + std::fmt::Debug + Into<T>;

Name | Type | Description
bounds | [GenericBound] | The trait bounds on the right hand side.
generics | Generics | Any generic parameters on the type itself.

kind == "foreign_type"

inner contains no fields. This item represents a type declaration in an extern block:

extern {
    type Foo;
}

kind == "extern_crate"

Name | Type | Description
name | String | The name of the extern crate.
rename | String | (Optional) The renaming of this crate with extern crate foo as bar.

kind == "import"

Name | Type | Description
source | String | The full path being imported (e.g. "super::some_mod::other_mod::Struct").
name | String | The name of the imported item (may be different from the last segment of source due to import renaming: use source as name).
id | ID | (Optional) The ID of the item being imported.
glob | bool | Whether this import ends in a glob: use source::*.

kind == "macro"

A macro_rules! declarative macro. Contains a single string with the source representation of the macro with the patterns stripped, for example:

macro_rules! vec {
    () => { ... };
    ($elem:expr; $n:expr) => { ... };
    ($($x:expr),+ $(,)?) => { ... };
}

TODO: proc macros

Span

Name | Type | Description
filename | String | The path to the source file for this span relative to the crate root.
begin | (int, int) | The zero indexed line and column of the first character in this span.
end | (int, int) | The zero indexed line and column of the last character in this span.

Deprecation

Name | Type | Description
since | String | (Optional) Usually a version number when this Item first became deprecated.
note | String | (Optional) The reason for deprecation and/or what alternatives to use.

FnDecl

Name | Type | Description
inputs | [(String, Type)] | A list of parameter names and their types. The names are unstable because arbitrary patterns can be used as parameters, in which case the name is a pretty printed version of it. For example fn foo((_, x): (u32, u32)){…} would have a parameter with the name "(_, x)" and fn foo(MyStruct {some_field: u32, ..}: MyStruct){…} would have one called "MyStruct {some_field, ..}".
output | Type | (Optional) Output type.
c_variadic | bool | Whether this function uses an unstable feature for variadic FFI functions.

Generics

Name | Type | Description
params | [GenericParamDef] | A list of generic parameter definitions (e.g. <T: Clone + Hash, U: Copy>).
where_predicates | [WherePredicate] | A list of where predicates (e.g. where T: Iterator, T::Item: Copy).

Examples

Here are a few full examples of the Generics fields for different Rust code:

Lifetime bounds

pub fn foo<'a, 'b, 'c>(a: &'a str, b: &'b str, c: &'c str)
where
    'a: 'b + 'c, {…}
"generics": {
  "params": [
    {
      "name": "'a",
      "kind": "lifetime"
    },
    {
      "name": "'b",
      "kind": "lifetime"
    },
    {
      "name": "'c",
      "kind": "lifetime"
    }
  ],
  "where_predicates": [
    {
      "region_predicate": {
        "lifetime": "'a",
        "bounds": [
          {
            "outlives": "'b"
          },
          {
            "outlives": "'c"
          }
        ]
      }
    }
  ]
}
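
To show how the region_predicate object above maps back to source syntax, here is a minimal sketch that renders it as a where-clause fragment. RegionPredicate is a hypothetical struct that flattens each {"outlives": ...} entry of bounds into a plain string.

```rust
/// Hypothetical stand-in for the `region_predicate` object.
pub struct RegionPredicate {
    pub lifetime: String,
    /// The lifetime named by each `{"outlives": ...}` entry in `bounds`.
    pub outlives: Vec<String>,
}

/// Renders the predicate as it would appear after `where`.
pub fn render_region_predicate(pred: &RegionPredicate) -> String {
    if pred.outlives.is_empty() {
        // No bounds: emit just the lifetime and colon.
        return format!("{}:", pred.lifetime);
    }
    format!("{}: {}", pred.lifetime, pred.outlives.join(" + "))
}
```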

Trait bounds

pub fn bar<T, U: Clone>(a: T, b: U)
where
    T: Iterator,
    T::Item: Copy,
    U: Iterator<Item=u32>, {…}
"generics": {
  "params": [
    {
      "name": "T",
      "kind": {
        "type": {
          "bounds": [],
          "synthetic": false
        }
      }
    },
    {
      "name": "U",
      "kind": {
        "type": {
          "bounds": [
            {
              "trait_bound": {
                "trait": {/* `Type` representation for `Clone`*/},
                "generic_params": [],
                "modifier": "none"
              }
            }
          ],
          "synthetic": false
        }
      }
    }
  ],
  "where_predicates": [
    {
      "bound_predicate": {
        "ty": {
          "generic": "T"
        },
        "bounds": [
          {
            "trait_bound": {
              "trait": {/* `Type` representation for `Iterator`*/},
              "generic_params": [],
              "modifier": "none"
            }
          }
        ]
      }
    },
    {
      "bound_predicate": {
        "ty": {/* `Type` representation for `Iterator::Item`*/},
        "bounds": [
          {
            "trait_bound": {
              "trait": {/* `Type` representation for `Copy`*/},
              "generic_params": [],
              "modifier": "none"
            }
          }
        ]
      }
    },
    {
      "bound_predicate": {
        "ty": {
          "generic": "U"
        },
        "bounds": [
          {
            "trait_bound": {
              "trait": {/* `Type` representation for `Iterator<Item=u32>`*/},
              "generic_params": [],
              "modifier": "none"
            }
          }
        ]
      }
    }
  ]
}

GenericParamDef

Name | Type | Description
name | String | The name of the type variable of a generic parameter (e.g. T or 'static)
kind | Object | Either "lifetime", "const": Type, or "type": Object with the following fields:

Name | Type | Description
bounds | [GenericBound] | The bounds on this parameter.
default | Type | (Optional) The default type for this parameter (e.g. PartialEq<Rhs = Self>).

WherePredicate

Can be one of the 3 following objects:

GenericBound

Can be either "trait_bound" with the following fields:

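Name | Type | Description
trait | Type | The trait for this bound.
modifier | String | Either "none", "maybe", or "maybe_const"
generic_params | [GenericParamDef] | for<> parameters used for HRTBs

or "outlives", whose value is the name of a lifetime (as in the Lifetime bounds example above).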

Type

Rustdoc's representation of types is fairly involved. Like Items, they are represented by a "kind" field and an "inner" field with the related information. Here are the possible contents of that inner Object:

kind = "resolved_path"

This is the main kind that represents all user defined types.

Name | Type | Description
name | String | The path of this type as written in the code ("std::iter::Iterator", "::module::Struct", etc.).
args | GenericArgs | (Optional) Any arguments on this type such as Vec<i32> or SomeStruct<'a, 5, u8, B: Copy, C = 'static str>.
id | ID | The ID of the trait/struct/enum/etc. that this type refers to.
param_names | GenericBound | If this type is of the form dyn Foo + Bar + ... then this field contains those trait bounds.

GenericArgs

Can be either "angle_bracketed" with the following fields:

Name | Type | Description
args | [GenericArg] | The list of each argument on this type.
bindings | [TypeBinding] | Associated type or constant bindings (e.g. Item=i32 or Item: Clone) for this type.

or "parenthesized" (for Fn(A, B) -> C arg syntax) with the following fields:

Name | Type | Description
inputs | [Type] | The Fn's parameter types for this argument.
output | Type | (Optional) The return type of this argument.

GenericArg

Can be one of the 3 following objects:

TypeBinding

Name | Type | Description
name | String | The name of the associated type or constant being bound (e.g. "Item").
binding | Object | Either "equality": Type or "constraint": [GenericBound]

kind = "generic"

"inner" is a String which is simply the name of a type parameter.

kind = "tuple"

"inner" is a single list with the Types of each tuple item.

kind = "slice"

"inner" is the Type of the elements in the slice.

kind = "array"

Name | Type | Description
type | Type | The Type of the elements in the array
len | String | The length of the array as an unstable stringified expression.

kind = "impl_trait"

"inner" is a single list of the GenericBounds for this type.

kind = "never"

Used to represent the ! type, has no fields.

kind = "infer"

Used to represent _ in type parameters, has no fields.

kind = "function_pointer"

Name | Type | Description
is_unsafe | bool | Whether this is an unsafe fn.
decl | FnDecl | Information about the function signature, or declaration.
params | [GenericParamDef] | A list of generic parameter definitions (e.g. <T: Clone + Hash, U: Copy>).
abi | String | The ABI string on the function.

kind = "raw_pointer"

Name | Type | Description
mutable | bool | Whether this is a *mut or a *const.
type | Type | The Type that this pointer points at.

kind = "borrowed_ref"

Name | Type | Description
lifetime | String | (Optional) The name of the lifetime parameter on this reference, if any.
mutable | bool | Whether this is a &mut or just a &.
type | Type | The Type that this reference references.

kind = "qualified_path"

Used when a type is qualified by a trait (<Type as Trait>::Name) or associated type (T::Item where T: Iterator).

Name | Type | Description
name | String | The name at the end of the path ("Name" and "Item" in the examples above).
self_type | Type | The type being used as a trait (Type and T in the examples above).
trait | Type | The trait that the path is on (Trait and Iterator in the examples above).

Examples

Here are some function signatures with various types and their respective JSON representations:

Primitives

pub fn primitives(a: u32, b: (u32, u32), c: [u32], d: [u32; 5]) -> *mut u32 {}
"decl": {
  "inputs": [
    [
      "a",
      {
        "kind": "primitive",
        "inner": "u32"
      }
    ],
    [
      "b",
      {
        "kind": "tuple",
        "inner": [
          {
            "kind": "primitive",
            "inner": "u32"
          },
          {
            "kind": "primitive",
            "inner": "u32"
          }
        ]
      }
    ],
    [
      "c",
      {
        "kind": "slice",
        "inner": {
          "kind": "primitive",
          "inner": "u32"
        }
      }
    ],
    [
      "d",
      {
        "kind": "array",
        "inner": {
          "type": {
            "kind": "primitive",
            "inner": "u32"
          },
          "len": "5"
        }
      }
    ]
  ],
  "output": {
    "kind": "raw_pointer",
    "inner": {
      "mutable": true,
      "type": {
        "kind": "primitive",
        "inner": "u32"
      }
    }
  }
}

References

pub fn references<'a>(a: &'a mut str) -> &'static String {}
"decl": {
  "inputs": [
    [
      "a",
      {
        "kind": "borrowed_ref",
        "inner": {
          "lifetime": "'a",
          "mutable": true,
          "type": {
            "kind": "primitive",
            "inner": "str"
          }
        }
      }
    ]
  ],
  "output": {
    "kind": "borrowed_ref",
    "inner": {
      "lifetime": "'static",
      "mutable": false,
      "type": {
        "kind": "resolved_path",
        "inner": {
          "name": "String",
          "id": "5:4936",
          "args": {
            "angle_bracketed": {
              "args": [],
              "bindings": []
            }
          },
          "param_names": null
        }
      }
    }
  }
}
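
The kind/inner pairs used in the Primitives and References examples can be modeled as a recursive enum and rendered back into Rust type syntax. The sketch below covers only a subset of the kinds (it leaves out resolved_path, impl_trait, function_pointer, and qualified_path); the Type enum and render function are hypothetical and are not the types rustdoc itself uses.

```rust
/// Hypothetical model of a subset of the `Type` kinds.
pub enum Type {
    Primitive(String),
    Generic(String),
    Tuple(Vec<Type>),
    Slice(Box<Type>),
    Array { ty: Box<Type>, len: String },
    RawPointer { mutable: bool, ty: Box<Type> },
    BorrowedRef { lifetime: Option<String>, mutable: bool, ty: Box<Type> },
    Never,
    Infer,
}

/// Renders a `Type` using ordinary Rust type syntax.
pub fn render(ty: &Type) -> String {
    match ty {
        Type::Primitive(name) | Type::Generic(name) => name.clone(),
        Type::Tuple(items) => {
            let parts: Vec<String> = items.iter().map(render).collect();
            if parts.len() == 1 {
                // A one-element tuple needs a trailing comma: `(T,)`.
                format!("({},)", parts[0])
            } else {
                format!("({})", parts.join(", "))
            }
        }
        Type::Slice(inner) => format!("[{}]", render(inner)),
        Type::Array { ty, len } => format!("[{}; {}]", render(ty), len),
        Type::RawPointer { mutable, ty } => {
            let kw = if *mutable { "mut" } else { "const" };
            format!("*{} {}", kw, render(ty))
        }
        Type::BorrowedRef { lifetime, mutable, ty } => {
            let lt = lifetime
                .as_ref()
                .map(|l| format!("{} ", l))
                .unwrap_or_default();
            let m = if *mutable { "mut " } else { "" };
            format!("&{}{}{}", lt, m, render(ty))
        }
        Type::Never => "!".to_string(),
        Type::Infer => "_".to_string(),
    }
}
```

Boxing the nested types keeps the enum a fixed size while allowing arbitrary nesting, such as a reference to a slice of tuples.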

Generics

pub fn generics<T>(a: T, b: impl Iterator<Item = bool>) -> ! {}
"decl": {
  "inputs": [
    [
      "a",
      {
        "kind": "generic",
        "inner": "T"
      }
    ],
    [
      "b",
      {
        "kind": "impl_trait",
        "inner": [
          {
            "trait_bound": {
              "trait": {
                "kind": "resolved_path",
                "inner": {
                  "name": "Iterator",
                  "id": "2:5000",
                  "args": {
                    "angle_bracketed": {
                      "args": [],
                      "bindings": [
                        {
                          "name": "Item",
                          "binding": {
                            "equality": {
                              "kind": "primitive",
                              "inner": "bool"
                            }
                          }
                        }
                      ]
                    }
                  },
                  "param_names": null
                }
              },
              "generic_params": [],
              "modifier": "none"
            }
          }
        ]
      }
    ]
  ],
  "output": {
    "kind": "never"
  }
}

Generic Args

pub trait MyTrait<'a, T> {
    type Item;
    type Other;
}

pub fn generic_args<'a>(x: impl MyTrait<'a, i32, Item = u8, Other = f32>) {
    unimplemented!()
}
"decl": {
  "inputs": [
    [
      "x",
      {
        "kind": "impl_trait",
        "inner": [
          {
            "trait_bound": {
              "trait": {
                "kind": "resolved_path",
                "inner": {
                  "name": "MyTrait",
                  "id": "0:11",
                  "args": {
                    "angle_bracketed": {
                      "args": [
                        {
                          "lifetime": "'a"
                        },
                        {
                          "type": {
                            "kind": "primitive",
                            "inner": "i32"
                          }
                        }
                      ],
                      "bindings": [
                        {
                          "name": "Item",
                          "binding": {
                            "equality": {
                              "kind": "primitive",
                              "inner": "u8"
                            }
                          }
                        },
                        {
                          "name": "Other",
                          "binding": {
                            "equality": {
                              "kind": "primitive",
                              "inner": "f32"
                            }
                          }
                        }
                      ]
                    }
                  },
                  "param_names": null
                }
              },
              "generic_params": [],
              "modifier": "none"
            }
          }
        ]
      }
    ]
  ],
  "output": null
}

Unstable

Fields marked as unstable have contents that are subject to change. They can be displayed to users, but tools shouldn't rely on being able to parse their output or they will be broken by internal compiler changes.

Drawbacks

Alternatives

Prior art

A handful of other languages and systems have documentation tools that output an intermediate representation separate from the human-readable outputs:

Unresolved questions

Output structure questions

These aren't essential and could be deferred to a later RFC. The current implementation does include spans, but doesn't do any of the other things mentioned here.