new rules for merging expected and supplied types in closure signatures
As uncovered in #38714, we currently have some pretty bogus code for combining the "expected signature" of a closure with the "supplied signature". To set the scene, consider a case like this:
```rust
fn foo<F>(f: F)
where
F: for<'a> FnOnce(&'a u32) -> &'a u32
// ^ *expected* signature comes from this where-clause
{
...
}
fn main() {
foo(|x: &u32| -> &u32 { .. }
// ^^^^^^^^^^^^^^^^^ supplied signature
// comes from here
}
```
In this case, the supplied signature (a) includes all the parts and (b) is the same as the expected signature, modulo the names used for the regions. But often people supply only *some* parts of the signature. For example, one might write `foo(|x| ..)`, leaving *everything* to be inferred, or perhaps `foo(|x: &u32| ...)`, which leaves the return type to be inferred.
In the current code, we use the expected type to supply the types that are not given, but otherwise use the type the user gave, except for one case: if the user writes `fn foo(|x: _| ..)` (i.e., an underscore at the outermost level), then we will take the expected type (rather than instantiating a fresh type variable). This can result in nonsensical situations, particularly with bound regions that link the types of parameters to one another or to the return type. Consider `foo(|x: &u32| ...)` -- if we *literally* splice the expected return type of `&'a u32` together with what the user gave, we wind up with a signature like `for<'a> fn(&u32) -> &'a u32`. This is not even permitted as a type, because bound regions like `'a` must appear also in the arguments somewhere, which is why #38714 leads to an ICE.
This PR institutes some new rules. These are not meant to be the *final* set of rules, but they are a kind of "lower bar" for what kind of code we accept (i.e., we can extend these rules in the future to be smarter in some cases, but -- as we will see -- these rules do accept some things that we then would not be able to back off from).
These rules are derived from a few premises:
- First and foremost, anonymous regions in closure annotation are mostly requests for the code to "figure out the right lifetime" and shouldn't be read too closely. So for example when people write a closure signature like `|x: &u32|`, they are really intended for us to "figure out" the right region for `x`.
- In contrast, the current code treats this supplied type as being more definitive. In particular, writing `|x: &u32|` would always result in the region of `x` being bound in the closure type. In other words, the signature would be something like `for<'a> fn(&'a u32)` -- this is derived from the fact that `fn(&u32)` expands to a type where the region is bound in the fn type.
- This PR takes a different approach. The "binding level" for reference types appearing in closure signatures can be informed in some cases by the expected signature. So, for example, if the expected signature is something like `(&'f u32)`, where the region of the first argument appears free, then for `|x: &u32|`, the new code would infer `x` to also have the free region `'f`.
- This inference has some limits. We don't do this for bindings that appear within the selected types themselves. So e.g. `|x: fn(&u32)|`, when combined with an expected type of `fn(fn(&'f u32))`, would still result in a closure that expects `for<'a> fn(&'a u32)`. Such an annotation will ultimately result in an error, as it happens, since `foo` is supplying a `fn(&'f u32)` to the closure, but the closure signature demands a `for<'a> fn(&'a u32)`. But still we choose to trust it and have the user change it.
- I wanted to preserve the rough intuition that one can copy-and-paste a type out of the fn signature and into the fn body without dramatically changing its meaning. Interestingly, if one has `|x: &u32|`, then regardless of whether the region of `x` is bound or free in the closure signature, it is also free in the region body, and that is also true when one writes `let x: &u32`, so that intuition holds here. But the same would not be true for `fn(&u32)`, hence the different behavior.
- Second, we must take either **all** the references to bound regions from the expected type or **none**. The current code, as we saw, will happily take a bound region in the return type but drop the other place where it is used, in the parameters. Since bound regions are all about linking multiple things together, I think it's important not to do that. (That said, we could conceivably be a bit less strict here, since the subtyping rules will get our back, but we definitely don't want any bound regions that appear only in the return type.)
- Finally, we cannot take the bound region names from the supplied types and "intermix" them with the names from the expected types.
- We *could* potentially do some alpha renaming, but I didn't do that.
- Ultimately, if the types the user supplied do not match expectations in some way that we cannot recover from, we fallback to deriving the closure signature solely from those expected types.
- For example, if the expected type is `u32` but the user wrote `i32`.
- Or, more subtle, if the user wrote e.g. `&'x u32` for some named lifetime `'x`, but the expected type includes a bound lifetime (`for<'a> (&'a u32)`). In that case, preferring the type that the user explicitly wrote would hide an appearance of a bound name from the expected type, and we try to never do that.
The detailed rules that I came up with are found in the code, but for ease of reading I've also [excerpted them into a gist](https://gist.github.com/nikomatsakis/e69252a2b57e6d97d044c2f254c177f1). I am not convinced they are correct and would welcome feedback for alternative approaches.
(As an aside, the way I think I would ultimately *prefer* to think about this is that the conversion from HIR types to internal types could be parameterized by an "expected type" that it uses to guide itself. However, since that would be a pain, I opted *in the code* to first instantiate the supplied types as `Ty<'tcx>` and then "merge" those types with the `Ty<'tcx>` from the expected signature.)
I think we should probably FCP this before landing.
cc @rust-lang/lang
r? @arielb1
Copy all `AsciiExt` methods to the primitive types directly in order to deprecate it later
**EDIT:** [this PR is ready now](https://github.com/rust-lang/rust/pull/44042#issuecomment-333883548). I edited this post to reflect the current status of discussion, which is (apart from code review) pretty much settled.
---
This is my current progress in order to prepare stabilization of #39658. As discussed there (and in #39659), the idea is to deprecated `AsciiExt` and copy all methods to the type directly. Apparently there isn't really a reason to have those methods in an extension trait¹.
~~This is **work in progress**: copy&pasting code while slightly modifying the documentation isn't the most exciting thing to do. Therefore I wanted to already open this WIP PR after doing basically 1/4 of the job (copying methods to `&[u8]`, `char` and `&str` is still missing) to get some feedback before I continue. Some questions possibly worth discussing:~~
1. ~~Does everyone agree that deprecating `AsciiExt` is a good idea? Does everyone agree with the goal of this PR?~~ => apparently yes
2. ~~Are my changes OK so far? Did I do something wrong?~~
3. ~~The issue of the unstable-attribute is currently set to 0. I would wait until you say "Ok" to the whole thing, then create a tracking issue and then insert the correct issue id. Is that ok?~~
4. ~~I tweaked `eq_ignore_ascii_case()`: it now takes the argument `other: u8` instead of `other: &u8`. The latter was enforced by the trait. Since we're not bound to a trait anymore, we can drop the reference, ok?~~ => I reverted this, because the interface has to match the `AsciiExt` interface exactly.
¹ ~~Could it be that we can't write `impl [u8] {}`? This might be the reason for `AsciiExt`. If that is the case: is there a good reason we can't write such an impl block? What can we do instead?~~ => we couldn't at the time this PR was opened, but Simon made it possible.
/cc @SimonSapin @zackw
Otherwise changes to the compiler are unable to introduce new
warnings: some crates tested by cargotest deny all warnings and
thus, the CI build fails.
Thanks SimonSapin for the patch!
Fix#18604: next_power_of_two should panic on overflow
Fixes https://github.com/rust-lang/rust/issues/18604
Is it possible to write a test for this? My experiments showed `x.py test` running in release mode, so my attempt at a `#[should_panic]` didn't work.
Shorten paths to auxiliary files created by tests
I'm hitting issues with long file paths to object files created by the test suite, similar to https://github.com/rust-lang/rust/issues/45103#issuecomment-335622075.
If we look at the object file path in https://github.com/rust-lang/rust/issues/45103 we can see that the patch contains of few components:
```
specialization-cross-crate-defaults.stage2-x86_64-pc-windows-gnu.run-pass.libaux\specialization_cross_crate_defaults.specialization_cross_crate_defaults0.rust-cgu.o
```
=>
1. specialization-cross-crate-defaults // test name, required
2. stage2 // stage disambiguator, required
3. x86_64-pc-windows-gnu // target disambiguator, required
4. run-pass // mode disambiguator, rarely required
5. libaux // suffix, can be shortened
6. specialization_cross_crate_defaults // required, there may be several libraries in the directory
7. specialization_cross_crate_defaults0 // codegen unit name, can be shortened?
8. rust-cgu // suffix, can be shortened?
9. o // object file extension
This patch addresses items `4`, `5` and `8`.
`libaux` is shortened to `aux`, `rust-cgu` is shortened to `rcgu`, mode disambiguator is omitted unless it's necessary (for pretty-printing and debuginfo tests, see 38d26d811a)
I haven't touched names of codegen units though (`specialization_cross_crate_defaults0`).
Is it useful for them to have descriptive names including the crate name, as opposed to just `0` or `cgu0` or something?
rustc: Handle some libstd symbole exports better
Right now symbol exports, particularly in a cdylib, are handled by
assuming that `pub extern` combined with `#[no_mangle]` means "export
this". This isn't actually what we want for some symbols that the
standard library uses to implement itself, for example symbols related
to allocation. Additionally other special symbols like
`rust_eh_personallity` have no need to be exported from cdylib crate
types (only needed in dylib crate types).
This commit updates how rustc handles these special symbols by adding to
the hardcoded logic of symbols like `rust_eh_personallity` but also
adding a new attribute, `#[rustc_std_internal_symbol]`, which forces the
export level to be considered the same as all other Rust functions
instead of looking like a C function.
The eventual goal here is to prevent functions like `__rdl_alloc` from
showing up as part of a Rust cdylib as it's just an internal
implementation detail. This then further allows such symbols to get gc'd
by the linker when creating a cdylib.
Right now symbol exports, particularly in a cdylib, are handled by
assuming that `pub extern` combined with `#[no_mangle]` means "export
this". This isn't actually what we want for some symbols that the
standard library uses to implement itself, for example symbols related
to allocation. Additionally other special symbols like
`rust_eh_personallity` have no need to be exported from cdylib crate
types (only needed in dylib crate types).
This commit updates how rustc handles these special symbols by adding to
the hardcoded logic of symbols like `rust_eh_personallity` but also
adding a new attribute, `#[rustc_std_internal_symbol]`, which forces the
export level to be considered the same as all other Rust functions
instead of looking like a C function.
The eventual goal here is to prevent functions like `__rdl_alloc` from
showing up as part of a Rust cdylib as it's just an internal
implementation detail. This then further allows such symbols to get gc'd
by the linker when creating a cdylib.
RFC 2008: Future-proofing enums/structs with #[non_exhaustive] attribute
This work-in-progress pull request contains my changes to implement [RFC 2008](https://github.com/rust-lang/rfcs/pull/2008). The related tracking issue is #44109.
As of writing, enum-related functionality is not included and there are some issues related to tuple/unit structs. Enum related tests are currently ignored.
WIP PR requested by @nikomatsakis [in Gitter](https://gitter.im/rust-impl-period/WG-compiler-middle?at=59e90e6297cedeb0482ade3e).
Add derive and doc comment capabilities to newtype_index macro
This moves `RustcDecodable` and `RustcEncodable` out of the macro definition and into the macro uses. They were conflicting with `CrateNum`'s impls of `serialize::UseSpecializedEncodable` and `serialize::UseSpecializedDecodable`, and now it's not :). `CrateNum` is now defined with the `newtype_index` macro. I also added support for doc comments on constant definitions and allowed a type to remove the pub specification on the tuple param (otherwise a LOT of code would refuse to compile for `CrateNum`). I was getting dozens of errors like this if `CrateNum` was defined as `pub struct CrateNum(pub u32)`:
```
error[E0530]: match bindings cannot shadow tuple structs
--> src/librustc/dep_graph/dep_node.rs:624:25
|
63 | use hir::def_id::{CrateNum, DefId, DefIndex, CRATE_DEF_INDEX};
| -------- a tuple struct `CrateNum` is imported here
...
624 | [] MissingLangItems(CrateNum),
| ^^^^^^^^ cannot be named the same as a tuple struct
```
I also cleaned up the formatting of the macro bodies as they were getting impossibly long. Should I go back and fix the matching rules to this style too?
I also want to see what the test results look like because `CrateNum` used to just derive `Debug`, but the `newtype_index` macro has a custom implementation. This might require further pushes.
Feel free to bikeshed on the macro language, I have no preference here.
Fix libstd compile error for windows-gnu targets without `backtrace`
This is basically an addition to #44979. Compiling `libstd` still fails when targeting `windows-gnu` with `panic = "abort"` because the items in the `...c::gnu` module are not used. They are only referenced from `backtrace_gnu.rs`, which is indirectly feature gated behind `backtrace` [here](9f3b09116b/src/libstd/sys/windows/mod.rs (L23)).
Add a nicer error message for missing in for loop, fixes#40782.
As suggested by @estebank in issue #40782, this works in the same way as #42578: if the in keyword is missing, we continue parsing the expression and if this works correctly an adapted error message is produced. Otherwise we return the old error.
A specific test case has also been added.
This is my first PR on rust-lang/rust so any feedback is very welcome.
add TerminatorKind::FalseEdges and use it in matches
impl #45184 and fixes#45043 right way.
False edges unexpectedly affects uninitialized variables analysis in MIR borrowck.
We don't want to stabilize them now already. The goal of this set of
commits is just to add inherent methods to the four types. Stabilizing
all of those methods can be done later.
Many AsciiExt imports have become useless thanks to the inherent ascii
methods added in the last commits. These were removed. In some places, I
fully specified the ascii method being called to enforce usage of the
AsciiExt trait. Note that some imports are not removed but tagged with
a `#[cfg(stage0)]` attribute. This is necessary, because certain ascii
methods are not yet available in stage0. All those imports will be
removed later.
Additionally, failing tests were fixed. The test suite should exit
successfully now.
This is done in order to deprecate AsciiExt eventually. Note that
this commit contains a bunch of `cfg(stage0)` statements. This is
due to a new compiler feature this commit depends on: the
`slice_u8` lang item. Once this lang item is available in the
stage0 compiler, all those cfg flags (and more) can be removed.
This is done in order to deprecate AsciiExt eventually. Note that
this commit contains a bunch of `cfg(stage0)` statements. This is
due to a new compiler feature I am using: the `slice_u8` lang item.
Once this lang item is available in the stage0 compiler, all those
cfg flags (and more) can be removed.
The doc comments were incorrect before: since the inherent ascii methods
shadow the `AsciiExt` methods, the examples didn't use the `AsciiExt` at
all. Since the trait will be deprecated soon anyway, the easiest solution
was to remove the examples and already mention that the methods will be
deprecated in the near future.
Since the methods on u8 directly will shadow the AsciiExt methods,
we cannot change the signature without breaking everything. It
would have been nice to take `u8` as argument instead of `&u8`, but
we cannot break stuff! So this commit reverts it to the original
`&u8` version.
Those methods will shadow the methods of `AsciiExt`, so if we don't
make them insta-stable, everyone will hitting stability errors. It
is fine adding those as stable, because they are just being moved
around [according to sfackler][1].
OPEN QUESTION: this commit also stabilizes the `AsciiExt` methods
that were previously feature gated by the `ascii_ctype` feature.
Maybe we don't want to stablilize those yet.
[1]: https://github.com/rust-lang/rust/pull/44042#issuecomment-329939279
This is the first step in order to deprecate AsciiExt. Since
this is a WIP commit, there is still some code duplication (notably
the static arrays) that will be removed later.