Fix unsound behaviour with null characters in thread names (issue #32475)
Previously, the thread name (&str) was converted to a CString in the
new thread, but outside unwind::try, causing a panic to continue into FFI.
This patch changes that behaviour, so that the panic instead happens
in the parent thread (where panic infrastructure is properly set up),
not the new thread.
This could potentially be a breaking change for architectures who don't
support thread names.
Hardcode accepting 0 as a valid str char boundary
If we check explicitly for index == 0, that removes the need to read the
byte at index 0, so it avoids a trip to the string's memory, and it
optimizes out the slicing index' bounds check whenever it is (a constant) zero.
Remove ungrammatical dots from the error index.
They were probably meant as a shorthand for omitted code.
Part of #32446 but there should be a separate fix for the issue.
Restrict constants in patterns
This implements [RFC 1445](https://github.com/rust-lang/rfcs/blob/master/text/1445-restrict-constants-in-patterns.md). The primary change is to limit the types of constants used in patterns to those that *derive* `Eq` (note that implementing `Eq` is not sufficient). This has two main effects:
1. Floating point constants are linted, and will eventually be disallowed. This is because floating point constants do not implement `Eq` but only `PartialEq`. This check replaces the existing special case code that aimed to detect the use of `NaN`.
2. Structs and enums must derive `Eq` to be usable within a match.
This is a [breaking-change]: if you encounter a problem, you are most likely using a constant in an expression where the type of the constant is some struct that does not currently implement
`Eq`. Something like the following:
```rust
struct SomeType { ... }
const SOME_CONST: SomeType = ...;
match foo {
SOME_CONST => ...
}
```
The easiest and most future compatible fix is to annotate the type in question with `#[derive(Eq)]` (note that merely *implementing* `Eq` is not enough, it must be *derived*):
```rust
struct SomeType { ... }
const SOME_CONST: SomeType = ...;
match foo {
SOME_CONST => ...
}
```
Another good option is to rewrite the match arm to use an `if` condition (this is also particularly good for floating point types, which implement `PartialEq` but not `Eq`):
```rust
match foo {
c if c == SOME_CONST => ...
}
```
Finally, a third alternative is to tag the type with `#[structural_match]`; but this is not recommended, as the attribute is never expected to be stabilized. Please see RFC #1445 for more details.
cc https://github.com/rust-lang/rust/issues/31434
r? @pnkfelix
resolve: Minimize hacks in name resolution of primitive types
When resolving the first unqualified segment in a path with `n` segments and `n - 1` associated item segments, e.g. (`a` or `a::assoc` or `a::assoc::assoc` etc) try to resolve `a` without considering primitive types first. If the "normal" lookup fails or results in a module, then try to resolve `a` as a primitive type as a fallback.
This way backward compatibility is respected, but the restriction from E0317 can be lifted, i.e. primitive names mostly can be shadowed like any other names.
Furthermore, if names of primitive types are [put into prelude](https://github.com/petrochenkov/rust/tree/prim2) (now it's possible to do), then most of names will be resolved in conventional way and amount of code relying on this fallback will be greatly reduced. Although, it's not entirely convenient to put them into prelude right now due to temporary conflicts like `use prelude::v1::*; use usize;` in libcore/libstd, I'd better wait for proper glob shadowing before doing it.
I wish the `no_prelude` attribute were unstable as intended :(
cc @jseyfried @arielb1
r? @eddyb
Revamp symbol names for impls (and make them deterministic, etc)
This builds on @michaelwoerister's epic PR #31539 (note that his PR never landed, so I just incorporated it into this one). The main change here is that we remove the "name" from `DefPathData` for impls, since that name is synthetic and not sufficiently predictable for incr comp. However, just doing that would cause bad symbol names since those are based on the `DefPath`. Therefore, I introduce a new mechanism for getting symbol names (and also paths for user display) called `item_path`. This is kind of simplistic for now (based on strings) but I expect to expand it later to support richer types, hopefully generating C++-mangled names that gdb etc can understand. Along the way I cleaned up how we track the path that leads to an extern crate.
There is still some cleanup left undone here. Notably, I didn't remove the impl names altogether -- that would probably make sense. I also didn't try to remove the `item_symbols` vector. Mostly I want to unblock my other incr. comp. work. =)
r? @eddyb
cc @eddyb @alexcrichton @michaelwoerister
resolve: Refactor how the prelude is handled
This PR refactors how the prelude is handled in `resolve`.
Instead of importing names from the prelude into each module's `resolutions`, this PR adds a new field `prelude: RefCell<Option<Module>>` to `ModuleS` that is set during import resolution but used only when resolving in a lexical scope (i.e. the scope of an initial segment of a relative path).
r? @nikomatsakis
As discussed in
https://github.com/rust-lang/rust/pull/32293#issuecomment-200597130,
adding link guards are a heuristic that is causing undue complications:
- the link guards inject extra public symbols, which is not always OK.
- link guards as implemented could be a non-trivial performance hit,
because no attempt is made to "de-duplicate" the dependency graph,
so at worst you have O(N!) calls to the link guard functions.
Nonetheless, link guards are very helpful in detecting errors, so it may
be worth adding them back in some modified form in the future.
Link guards cause problems in some specific scenarios on windows because
they force libcore to be instantiated, since we do not GC functions
effectively on windows.
The changes here are two:
1. disable core for rsbegin/rsend
2. make panic_fmt an extern fn for smallest-hello-world so that it
is not marked as "internal" for LLVM
Full binary reproducible builds are not possible on all platforms
because linker injects a certain amount of randomness, apparently. Or,
at minimum, they don't work reliably yet.
This change has a few parts. We introduce a new `item_path` module for
constructing item paths. The job of this module is basically to make
nice, user-readable paths -- but these paths are not necessarily 100%
unique. They meant to help a *human* find code, but not necessarily a
compute. These paths are used to drive `item_path_str` but also symbol
names.
Because the paths are not unique, we also modify the symbol name hash to
include the full `DefPath`, whereas before it included only those
aspects of the def-path that were not included in the "informative"
symbol name.
Eventually, I'd like to make the item-path infrastructure a bit more
declarative. Right now it's based purely on strings. In particular, for
impls, we should supply the raw types to the `ItemPathBuffer`, so that
symbol names can be encoded using the C++ encoding scheme for better
integration with tooling.
We used to track, for each crate, a path that led to the extern-crate
that imported it. Instead of that, track the def-id of the extern crate,
along with a bit more information, and derive the path on the fly.
In particular, remove the name from the Impl, since that name is
synthesized and is not predictable (it tends to break incr. comp.).
Also rename the variants to be a bit more uniform and remove some
distinctions that we were not really taking advantage of anywhere.
We want to prevent compiling something against one version
of a dynamic library and then, at runtime accidentally
using a different version of the dynamic library. With the
old symbol-naming scheme this could not happen because every
symbol had the SVH in it and you'd get an error by the
dynamic linker when using the wrong version of a dylib. With
the new naming scheme this isn't the case any more, so this
patch adds the "link-guard" to prevent this error case.
This is implemented as follows:
- In every crate that we compile, we emit a function called
"__rustc_link_guard_<crate-name>_<crate-svh>"
- The body of this function contains calls to the
"__rustc_link_guard" functions of all dependencies.
- An executable contains a call to it's own
"__rustc_link_guard" function.
As a consequence the "__rustc_link_guard" function call graph
mirrors the crate graph and the dynamic linker will fail if a
wrong dylib is loaded somewhere because its
"__rustc_link_guard" function will contain a different SVH in
its name.