Accept additional user-defined syntax classes in fenced code blocks
Part of #79483.
This is a re-opening of https://github.com/rust-lang/rust/pull/79454 after a big update/cleanup. I also converted the syntax to pandoc as suggested by `@notriddle:` the idea is to be as compatible as possible with the existing instead of having our own syntax.
## Motivation
From the original issue: https://github.com/rust-lang/rust/issues/78917
> The technique used by `inline-c-rs` can be ported to other languages. It's just super fun to see C code inside Rust documentation that is also tested by `cargo doc`. I'm sure this technique can be used by other languages in the future.
Having custom CSS classes for syntax highlighting will allow tools like `highlight.js` to be used in order to provide highlighting for languages other than Rust while not increasing technical burden on rustdoc.
## What is the feature about?
In short, this PR changes two things, both related to codeblocks in doc comments in Rust documentation:
* Allow to disable generation of `language-*` CSS classes with the `custom` attribute.
* Add your own CSS classes to a code block so that you can use other tools to highlight them.
#### The `custom` attribute
Let's start with the new `custom` attribute: it will disable the generation of the `language-*` CSS class on the generated HTML code block. For example:
```rust
/// ```custom,c
/// int main(void) {
/// return 0;
/// }
/// ```
```
The generated HTML code block will not have `class="language-c"` because the `custom` attribute has been set. The `custom` attribute becomes especially useful with the other thing added by this feature: adding your own CSS classes.
#### Adding your own CSS classes
The second part of this feature is to allow users to add CSS classes themselves so that they can then add a JS library which will do it (like `highlight.js` or `prism.js`), allowing to support highlighting for other languages than Rust without increasing burden on rustdoc. To disable the automatic `language-*` CSS class generation, you need to use the `custom` attribute as well.
This allow users to write the following:
```rust
/// Some code block with `{class=language-c}` as the language string.
///
/// ```custom,{class=language-c}
/// int main(void) {
/// return 0;
/// }
/// ```
fn main() {}
```
This will notably produce the following HTML:
```html
<pre class="language-c">
int main(void) {
return 0;
}</pre>
```
Instead of:
```html
<pre class="rust rust-example-rendered">
<span class="ident">int</span> <span class="ident">main</span>(<span class="ident">void</span>) {
<span class="kw">return</span> <span class="number">0</span>;
}
</pre>
```
To be noted, we could have written `{.language-c}` to achieve the same result. `.` and `class=` have the same effect.
One last syntax point: content between parens (`(like this)`) is now considered as comment and is not taken into account at all.
In addition to this, I added an `unknown` field into `LangString` (the parsed code block "attribute") because of cases like this:
```rust
/// ```custom,class:language-c
/// main;
/// ```
pub fn foo() {}
```
Without this `unknown` field, it would generate in the DOM: `<pre class="language-class:language-c language-c">`, which is quite bad. So instead, it now stores all unknown tags into the `unknown` field and use the first one as "language". So in this case, since there is no unknown tag, it'll simply generate `<pre class="language-c">`. I added tests to cover this.
Finally, I added a parser for the codeblock attributes to make it much easier to maintain. It'll be pretty easy to extend.
As to why this syntax for adding attributes was picked: it's [Pandoc's syntax](https://pandoc.org/MANUAL.html#extension-fenced_code_attributes). Even if it seems clunkier in some cases, it's extensible, and most third-party Markdown renderers are smart enough to ignore Pandoc's brace-delimited attributes (from [this comment](https://github.com/rust-lang/rust/pull/110800#issuecomment-1522044456)).
## Raised concerns
#### It's not obvious when the `language-*` attribute generation will be added or not.
It is added by default. If you want to disable it, you will need to use the `custom` attribute.
#### Why not using HTML in markdown directly then?
Code examples in most languages are likely to contain `<`, `>`, `&` and `"` characters. These characters [require escaping](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/pre) when written inside the `<pre>` element. Using the \`\`\` code blocks allows rustdoc to take care of escaping, which means doc authors can paste code samples directly without manually converting them to HTML.
cc `@poliorcetics`
r? `@notriddle`
Rollup of 2 pull requests
Successful merges:
- #115607 (clarify that unsafe code must not rely on our safe traits)
- #115866 (make interpreter and TyAndLayout type Debug impl independent of Ty debug impl)
Failed merges:
- #115873 (Make `TyKind::Adt`'s `Debug` impl be more pretty)
- #115884 (make ty::Const debug printing less verbose)
r? `@ghost`
`@rustbot` modify labels: rollup
clarify that unsafe code must not rely on our safe traits
This adds a disclaimer to PartialEq, Eq, PartialOrd, Ord, Hash, Deref, DerefMut.
We already have a similar disclaimer in ExactSizeIterator (worded a bit differently):
```
/// Note that this trait is a safe trait and as such does *not* and *cannot*
/// guarantee that the returned length is correct. This means that `unsafe`
/// code **must not** rely on the correctness of [`Iterator::size_hint`]. The
/// unstable and unsafe [`TrustedLen`](super::marker::TrustedLen) trait gives
/// this additional guarantee.
```
If there are any other traits that should carry such a disclaimer, please let me know.
Fixes https://github.com/rust-lang/rust/issues/73682
closure field capturing: don't depend on alignment of packed fields
This fixes the closure field capture part of https://github.com/rust-lang/rust/issues/115305: field capturing always stops at projections into packed structs, no matter the alignment of the field. This means changing a private field type from `u8` to `u64` can never change how closures capture fields, which is probably what we want.
Here's an example where, before this PR, changing the type of a private field in a repr(Rust) struct can change the output of a program:
```rust
#![allow(dead_code)]
mod m {
// before patch
#[derive(Default)]
pub struct S1(u8);
// after patch
#[derive(Default)]
pub struct S2(u64);
}
struct NoisyDrop;
impl Drop for NoisyDrop {
fn drop(&mut self) {
eprintln!("dropped!");
}
}
#[repr(packed)]
struct MyType {
field: m::S1, // output changes when this becomes S2
other_field: NoisyDrop,
third_field: Vec<()>,
}
fn test(r: MyType) {
let c = || {
let _val = std::ptr::addr_of!(r.field);
let _val = r.third_field;
};
drop(c);
eprintln!("before dropping");
}
fn main() {
test(MyType {
field: Default::default(),
other_field: NoisyDrop,
third_field: Vec::new(),
});
}
```
Of course this is a breaking change for the same reason that doing field capturing in the first place was a breaking change. Packed fields are relatively rare and depending on drop order is relatively rare, so I don't expect this to have much impact, but it's hard to be sure and even a crater run will only tell us so much.
Also see the [nomination comment](https://github.com/rust-lang/rust/pull/115315#issuecomment-1702807825).
Cc `@rust-lang/wg-rfc-2229` `@ehuss`
Make useless_ptr_null_checks smarter about some std functions
This teaches the `useless_ptr_null_checks` lint that some std functions can't ever return null pointers, because they need to point to valid data, get references as input, etc.
This is achieved by introducing an `#[rustc_never_returns_null_ptr]` attribute and adding it to these std functions (gated behind bootstrap `cfg_attr`).
Later on, the attribute could maybe be used to tell LLVM that the returned pointer is never null. I don't expect much impact of that though, as the functions are pretty shallow and usually the input data is already never null.
Follow-up of PR #113657Fixes#114442