rustdoc: use JS to inline target type impl docs into alias
Preview docs:
- https://notriddle.com/rustdoc-html-demo-5/js-trait-alias/std/io/type.Result.html
- https://notriddle.com/rustdoc-html-demo-5/js-trait-alias-compiler/rustc_middle/ty/type.PolyTraitRef.html
This pull request also includes a bug fix for trait alias inlining across crates. This means more documentation is generated, and is why ripgrep runs slower (it's a thin wrapper on top of the `grep` crate, so 5% of its docs are now the Result type).
- Before, built with rustdoc 1.75.0-nightly (aa1a71e9e 2023-10-26), Result type alias method docs are missing: http://notriddle.com/rustdoc-html-demo-5/ripgrep-js-nightly/rg/type.Result.html
- After, built with this branch, all the methods on Result are shown: http://notriddle.com/rustdoc-html-demo-5/ripgrep-js-trait-alias/rg/type.Result.html
*Review note: This is mostly just reverting https://github.com/rust-lang/rust/pull/115201. The last commit has the new work in it.*
Fixes#115718
This is an attempt to balance three problems, each of which would
be violated by a simpler implementation:
- A type alias should show all the `impl` blocks for the target
type, and vice versa, if they're applicable. If nothing was
done, and rustdoc continues to match them up in HIR, this
would not work.
- Copying the target type's docs into its aliases' HTML pages
directly causes far too much redundant HTML text to be generated
when a crate has large numbers of methods and large numbers
of type aliases.
- Using JavaScript exclusively for type alias impl docs would
be a functional regression, and could make some docs very hard
to find for non-JS readers.
- Making sure that only applicable docs are show in the
resulting page requires a type checkers. Do not reimplement
the type checker in JavaScript.
So, to make it work, rustdoc stashes these type-alias-inlined docs
in a JSONP "database-lite". The file is generated in `write_shared.rs`,
included in a `<script>` tag added in `print_item.rs`, and `main.js`
takes care of patching the additional docs into the DOM.
The format of `trait.impl` and `type.impl` JS files are superficially
similar. Each line, except the JSONP wrapper itself, belongs to a crate,
and they are otherwise separate (rustdoc should be idempotent). The
"meat" of the file is HTML strings, so the frontend code is very simple.
Links are relative to the doc root, though, so the frontend needs to fix
that up, and inlined docs can reuse these files.
However, there are a few differences, caused by the sophisticated
features that type aliases have. Consider this crate graph:
```text
---------------------------------
| crate A: struct Foo<T> |
| type Bar = Foo<i32> |
| impl X for Foo<i8> |
| impl Y for Foo<i32> |
---------------------------------
|
----------------------------------
| crate B: type Baz = A::Foo<i8> |
| type Xyy = A::Foo<i8> |
| impl Z for Xyy |
----------------------------------
```
The type.impl/A/struct.Foo.js JS file has a structure kinda like this:
```js
JSONP({
"A": [["impl Y for Foo<i32>", "Y", "A::Bar"]],
"B": [["impl X for Foo<i8>", "X", "B::Baz", "B::Xyy"], ["impl Z for Xyy", "Z", "B::Baz"]],
});
```
When the type.impl file is loaded, only the current crate's docs are
actually used. The main reason to bundle them together is that there's
enough duplication in them for DEFLATE to remove the redundancy.
The contents of a crate are a list of impl blocks, themselves
represented as lists. The first item in the sublist is the HTML block,
the second item is the name of the trait (which goes in the sidebar),
and all others are the names of type aliases that successfully match.
This way:
- There's no need to generate these files for types that have no aliases
in the current crate. If a dependent crate makes a type alias, it'll
take care of generating its own docs.
- There's no need to reimplement parts of the type checker in
JavaScript. The Rust backend does the checking, and includes its
results in the file.
- Docs defined directly on the type alias are dropped directly in the
HTML by `render_assoc_items`, and are accessible without JavaScript.
The JSONP file will not list impl items that are known to be part
of the main HTML file already.
[JSONP]: https://en.wikipedia.org/wiki/JSONP
This is an attempt to balance three problems, each of which would
be violated by a simpler implementation:
- A type alias should show all the `impl` blocks for the target
type, and vice versa, if they're applicable. If nothing was
done, and rustdoc continues to match them up in HIR, this
would not work.
- Copying the target type's docs into its aliases' HTML pages
directly causes far too much redundant HTML text to be generated
when a crate has large numbers of methods and large numbers
of type aliases.
- Using JavaScript exclusively for type alias impl docs would
be a functional regression, and could make some docs very hard
to find for non-JS readers.
- Making sure that only applicable docs are show in the
resulting page requires a type checkers. Do not reimplement
the type checker in JavaScript.
So, to make it work, rustdoc stashes these type-alias-inlined docs
in a JSONP "database-lite". The file is generated in `write_shared.rs`,
included in a `<script>` tag added in `print_item.rs`, and `main.js`
takes care of patching the additional docs into the DOM.
The format of `trait.impl` and `type.impl` JS files are superficially
similar. Each line, except the JSONP wrapper itself, belongs to a crate,
and they are otherwise separate (rustdoc should be idempotent). The
"meat" of the file is HTML strings, so the frontend code is very simple.
Links are relative to the doc root, though, so the frontend needs to fix
that up, and inlined docs can reuse these files.
However, there are a few differences, caused by the sophisticated
features that type aliases have. Consider this crate graph:
```text
---------------------------------
| crate A: struct Foo<T> |
| type Bar = Foo<i32> |
| impl X for Foo<i8> |
| impl Y for Foo<i32> |
---------------------------------
|
----------------------------------
| crate B: type Baz = A::Foo<i8> |
| type Xyy = A::Foo<i8> |
| impl Z for Xyy |
----------------------------------
```
The type.impl/A/struct.Foo.js JS file has a structure kinda like this:
```js
JSONP({
"A": [["impl Y for Foo<i32>", "Y", "A::Bar"]],
"B": [["impl X for Foo<i8>", "X", "B::Baz", "B::Xyy"], ["impl Z for Xyy", "Z", "B::Baz"]],
});
```
When the type.impl file is loaded, only the current crate's docs are
actually used. The main reason to bundle them together is that there's
enough duplication in them for DEFLATE to remove the redundancy.
The contents of a crate are a list of impl blocks, themselves
represented as lists. The first item in the sublist is the HTML block,
the second item is the name of the trait (which goes in the sidebar),
and all others are the names of type aliases that successfully match.
This way:
- There's no need to generate these files for types that have no aliases
in the current crate. If a dependent crate makes a type alias, it'll
take care of generating its own docs.
- There's no need to reimplement parts of the type checker in
JavaScript. The Rust backend does the checking, and includes its
results in the file.
- Docs defined directly on the type alias are dropped directly in the
HTML by `render_assoc_items`, and are accessible without JavaScript.
The JSONP file will not list impl items that are known to be part
of the main HTML file already.
[JSONP]: https://en.wikipedia.org/wiki/JSONP
This is shorter, avoids potential conflicts with a crate
named `implementors`[^1], and will be less confusing when JS
include files are added for type aliases.
[^1]: AFAIK, this couldn't actually cause any problems right now,
but it's simpler just to make it impossible than relying on never
having a file named `trait.Foo.js` in the crate data area.
rustdoc: hide `#[repr(transparent)]` if it isn't part of the public ABI
Fixes#90435.
This hides `#[repr(transparent)]` when the non-1-ZST field the struct is "transparent" over is private.
CC `@RalfJung`
Tentatively nominating it for the release notes, feel free to remove the nomination.
`@rustbot` label needs-fcp relnotes A-rustdoc-ui
rustdoc-search: add impl disambiguator to duplicate assoc items
Preview (to see the difference, click the link and pay attention to the specific function that comes up):
| Before | After |
|--|--|
| [`simd<i64>, simd<i64> -> simd<i64>`](https://doc.rust-lang.org/nightly/std/?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) | [`simd<i64>, simd<i64> -> simd<i64>`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) |
| [`cow, vec -> bool`](https://doc.rust-lang.org/nightly/std/?search=cow%2C%20vec%20-%3E%20bool) | [`cow, vec -> bool`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=cow%2C%20vec%20-%3E%20bool)
Helps with #90929
This changes the search results, specifically, when there's more than one impl with an associated item with the same name. For example, the search queries `simd<i8> -> simd<i8>` and `simd<i64> -> simd<i64>` don't link to the same function, but most of the functions have the same names.
This change should probably be FCP-ed, especially since it adds a new anchor link format for `main.js` to handle, so that URLs like `struct.Vec.html#impl-AsMut<[T]>-for-Vec<T,+A>/method.as_mut` redirect to `struct.Vec.html#method.as_mut-2`. It's a strange design, but there are a few reasons for it:
* I'd like to avoid making the HTML bigger. Obviously, fixing this bug is going to add at least a little more data to the search index, but adding more HTML penalises viewers for the benefit of searchers.
* Breaking `struct.Vec.html#method.len` would also be a disappointment.
On the other hand:
* The path-style anchors might be less prone to link rot than the numbered anchors. It's definitely less likely to have URLs that appear to "work", but silently point at the wrong thing.
* This commit arranges the path-style anchor to redirect to the numbered anchor. Nothing stops rustdoc from doing the opposite, making path-style anchors the default and redirecting the "legacy" numbered ones.
### The bug
On the "Before" links, this example search calls for `i64`:
![image](https://github.com/rust-lang/rust/assets/1593513/9431d89d-41dc-4f68-bbb1-3e2704a973d2)
But if I click any of the results, I get `f64` instead.
![image](https://github.com/rust-lang/rust/assets/1593513/6d89c692-1847-421a-84d9-22e359d9cf82)
The PR fixes this problem by adding enough information to the search result `href` to disambiguate methods with different types but the same name.
More detailed description of the problem at:
https://github.com/rust-lang/rust/pull/109422#issuecomment-1491089293
> When a struct/enum/union has multiple impls with different type parameters, it can have multiple methods that have the same name, but which are on different impls. Besides Simd, [Any](https://doc.rust-lang.org/nightly/std/any/trait.Any.html?search=any%3A%3Adowncast) also demonstrates this pattern. It has three methods named `downcast`, on three different impls.
>
> When that happens, it presents a challenge in linking to the method. Normally we link like `#method.foo`. When there are multiple `foo`, we number them like `#method.foo`, `#method.foo-1`, `#method.foo-2`, etc.
>
> It also presents a challenge for our search code. Currently we store all the variants in the index, but don’t have any way to generate unambiguous URLs in the results page, or to distinguish them in the SERP.
>
> To fix this, we need three things:
>
> 1. A fragment format that fully specifies the impl type parameters when needed to disambiguate (`#impl-SimdOrd-for-Simd<i64,+LANES>/method.simd_max`)
> 2. A search index that stores methods with enough information to disambiguate the impl they were on.
> 3. A search results interface that can display multiple methods on the same type with the same name, when appropriate OR a disambiguation landing section on item pages?
>
> For reviewers: it can be hard to see the new fragment format in action since it immediately gets rewritten to the numbered form.
[rustdoc] Show enum discrimant if it is a C-like variant
Fixes https://github.com/rust-lang/rust/issues/101337.
We currently display values for associated constant items in traits:
![image](https://github.com/rust-lang/rust/assets/3050060/03e566ec-c670-47b4-8ca2-b982baa7a0f4)
And we also display constant values like [here](file:///home/imperio/rust/rust/build/x86_64-unknown-linux-gnu/doc/std/f32/consts/constant.E.html).
I think that for coherency, we should display values of C-like enum variants.
With this change, it looks like this:
![image](https://github.com/rust-lang/rust/assets/3050060/b53fbbe0-bdb1-4289-8537-f2dd4988e9ac)
As for the display of the constant value itself, I used what we already have to keep coherency.
We display the C-like variants value in the following scenario:
1. It is a C-like variant with a value set => all the time
2. It is a C-like variant without a value set: All other variants are C-like variants and at least one them has its value set.
Here is the result in code:
```rust
// Ax and Bx value will be displayed.
enum A {
Ax = 12,
Bx,
}
// Ax and Bx value will not be displayed
enum B {
Ax,
Bx,
}
// Bx value will not be displayed
enum C {
Ax(u32),
Bx,
}
// Bx value will not be displayed, Cx value will be displayed.
#[repr(u32)]
enum D {
Ax(u32),
Bx,
Cx = 12,
}
```
r? `@notriddle`
This should result in a layout for the actual standard library,
when built on CI, that looks like this:
_____
/ \ std
| R | 1.74.0-nightly
\_____/
(203c57dbe 2023-09-17)
Having the whole version as one string caused it to flex wrap,
because the sidebar isn't wide enough to fit the whole thing.
This commit changes the layout to something a bit less "look at my logo!!!111"
gigantic, and makes it clearer where clicking the logo will actually take you.
It also means the crate name is persistently at the top of the sidebar, even
when in a sub-item page, and clicking that name takes you back to the root.
| | Short crate name | Long crate name |
|---------|------------------|-----------------|
| Root | ![short-root] | ![long-root]
| Subpage | ![short-subpage] | ![long-subpage]
[short-root]: https://github.com/rust-lang/rust/assets/1593513/fe2ce102-d4b8-44e6-9f7b-68636a907f56
[short-subpage]: https://github.com/rust-lang/rust/assets/1593513/29501663-56c0-4151-b7de-d2637e167125
[long-root]: https://github.com/rust-lang/rust/assets/1593513/f6a385c0-b4c5-4a9c-954b-21b38de4192f
[long-subpage]: https://github.com/rust-lang/rust/assets/1593513/97ec47b4-61bf-4ebe-b461-0d2187b8c6cahttps://notriddle.com/rustdoc-html-demo-4/logo-lockup/image/index.htmlhttps://notriddle.com/rustdoc-html-demo-4/logo-lockup/crossbeam_channel/index.htmlhttps://notriddle.com/rustdoc-html-demo-4/logo-lockup/adler/struct.Adler32.htmlhttps://notriddle.com/rustdoc-html-demo-4/logo-lockup/crossbeam_channel/struct.Sender.html
This improves visual information density (the construct with the logo and
crate name is *shorter* than the logo on its own, because it's not
square) and navigation clarity (we can now see what clicking the Rust logo
does, specifically).
Compare this with the layout at [Phoenix's Hexdocs] (which is what this
proposal is closely based on), the old proposal on [Internals Discourse]
(which always says "Rust standard library" in the sidebar, but doesn't do the
side-by-side layout).
[Phoenix's Hexdocs]: https://hexdocs.pm/phoenix/1.7.7/overview.html
[Internals Discourse]: https://internals.rust-lang.org/t/poc-of-a-new-design-for-the-generated-rustdoc/11018
In newer versions of rustdoc, the crate name and version are always shown in
the sidebar, even in subpages. Clicking the crate name does the same thing
clicking the logo always did: return you to the crate root.
While this actually takes up less screen real estate than the old layout on
desktop, it takes up more HTML. It's also a bit more visually complex.
I could do what the Internals POC did and keep the vertically stacked layout
all the time, instead of doing a horizontal stack where possible. It would
take up more screen real estate, though.
This design is lifted almost verbatim from Hexdocs. It seems to work for them.
[`opentelemetry_process_propagator`], for example, has a long application name.
[`opentelemetry_process_propagator`]: https://hexdocs.pm/opentelemetry_process_propagator/OpentelemetryProcessPropagator.html
Has anyone written the rationale on why the Rust logo shows up on projects that
aren't the standard library? If we turned it off on non-standard crates by
default, it would line wrap crate names a lot less often.
Or maybe we should encourage crate authors to include their own logo more
often? It certainly helps give people a better sense of "place."
I'm not sure of anything that directly follows up this one. Plenty of other
changes could be made to improve the layout, like
* coming up with a less cluttered way to do disclosure (there's a lot of `[-]`
on the page)
* doing a better job of separating lateral navigation (vec::Vec links to
vec::IntoIter) and the table of contents (vec::Vec links to vec::Vec::new)
* giving readers more control of how much rustdoc hows them, and giving doc
authors more control of how much it generates
* better search that reduces the need to browse
But those are mostly orthogonal, not future possibilities unlocked by this change.
rustdoc: fix & clean up handling of cross-crate higher-ranked parameters
Preparatory work for the refactoring planned in #113015 (for correctness & maintainability).
---
1. Render the higher-ranked parameters of cross-crate function pointer types **(*)**.
2. Replace occurrences of `collect_referenced_late_bound_regions()` (CRLBR) with `bound_vars()`.
The former is quite problematic and the use of the latter allows us to yank a lot of hacky code **(†)**
as you can tell from the diff! :)
3. Add support for cross-crate higher-ranked types (`#![feature(non_lifetime_binders)]`).
We were previously ICE'ing on them (see `inline_cross/non_lifetime_binders.rs`).
---
**(*)**: Extracted from test `inline_cross/fn-type.rs`:
```diff
- fn(_: &'z fn(_: &'b str), _: &'a ()) -> &'a ()
+ for<'z, 'a, '_unused> fn(_: &'z for<'b> fn(_: &'b str), _: &'a ()) -> &'a ()
```
**(†)**: It returns an `FxHashSet` which isn't *predictable* or *stable* wrt. source code (`.rmeta`) changes. To elaborate, the ordering of late-bound regions doesn't necessarily reflect the ordering found in the source code. It does seem to be stable across compilations but modifying the source code of the to-be-documented crates (like adding or renaming items) may result in a different order:
<details><summary>Example</summary>
Let's assume that we're documenting the cross-crate re-export of `produce` from the code below. On `master`, rustdoc would render the list of binders as `for<'x, 'y, 'z>`. However, once you add back the functions `a`–`l`, it would be rendered as `for<'z, 'y, 'x>` (reverse order)! Results may vary. `bound_vars()` fixes this as it returns them in source order.
```rs
// pub fn a() {}
// pub fn b() {}
// pub fn c() {}
// pub fn d() {}
// pub fn e() {}
// pub fn f() {}
// pub fn g() {}
// pub fn h() {}
// pub fn i() {}
// pub fn j() {}
// pub fn k() {}
// pub fn l() {}
pub fn produce() -> impl for<'x, 'y, 'z> Trait<'z, 'y, 'x> {}
pub trait Trait<'a, 'b, 'c> {}
impl Trait<'_, '_, '_> for () {}
```
</details>
Further, as the name suggests, CRLBR only collects *referenced* regions and thus we drop unused binders. `bound_vars()` contains unused binders on the other hand. Let's stay closer to the source where possible and keep unused binders.
Lastly, using `bound_vars()` allows us to get rid of
* the deduplication and alphabetical sorting hack in `simplify.rs`
* the weird field `bound_params` on `EqPredicate`
both of which were introduced by me in #102707 back when I didn't know better.
To illustrate, let's look at the cross-crate bound `T: for<'a, 'b> Trait<A<'a> = (), B<'b> = ()>`.
* With CRLBR + `EqPredicate.bound_params`, *before* bounds simplification we would have the bounds `T: Trait`, `for<'a> <T as Trait>::A<'a> == ()` and `for<'b> <T as Trait>::B<'b> == ()` which required us to merge `for<>`, `for<'a>` and `for<'b>` into `for<'a, 'b>` in a deterministic manner and without introducing duplicate binders.
* With `bound_vars()`, we now have the bounds `for<'a, b> T: Trait`, `<T as Trait>::A<'a> == ()` and `<T as Trait>::B<'b> == ()` before bound simplification similar to rustc itself. This obviously no longer requires any funny merging of `for<>`s. On top of that `for<'a, 'b>` is guaranteed to be in source order.
This whole thing changes it so that the JS and the UI both use
rustc's own path printing to handle the impl IDs. This results in
the format changing a little bit; full paths are used in spots
where they aren't strictly necessary, and the path sometimes uses
generics where the old system used the trait's own name, but it
shouldn't matter since the orphan rules will prevent it anyway.
Helps with #90929
This changes the search results, specifically, when there's more than
one impl with an associated item with the same name. For example,
the search queries `simd<i8> -> simd<i8>` and `simd<i64> -> simd<i64>`
don't link to the same function, but most of the functions have the
same names.
This change should probably be FCP-ed, especially since it adds a new
anchor link format for `main.js` to handle, so that URLs like
`struct.Vec.html#impl-AsMut<[T]>-for-Vec<T,+A>/method.as_mut` redirect
to `struct.Vec.html#method.as_mut-2`. It's a strange design, but there
are a few reasons for it:
* I'd like to avoid making the HTML bigger. Obviously, fixing this bug
is going to add at least a little more data to the search index, but
adding more HTML penalises viewers for the benefit of searchers.
* Breaking `struct.Vec.html#method.len` would also be a disappointment.
On the other hand:
* The path-style anchors might be less prone to link rot than the numbered
anchors. It's definitely less likely to have URLs that appear to "work",
but silently point at the wrong thing.
* This commit arranges the path-style anchor to redirect to the numbered
anchor. Nothing stops rustdoc from doing the opposite, making path-style
anchors the default and redirecting the "legacy" numbered ones.