Add `CodeExtent::Remainder` variant; pre-req for new scoping/drop rules.
This new enum variant introduces finer-grain code extents, i.e. we now track that a binding lives only for a suffix of a block, and (importantly) will be dropped when it goes out of scope *before* the bindings that occurred earlier in the block.
Both of these notions are neatly captured by marking the block (and each suffix) as an enclosing scope of the next suffix beneath it.
This is work that is part of the foundation for issue #8861.
(It actually has been seen in earlier posted pull requests, in particular #21022; I have just factored it out into its own PR to ease my own near-future rebasing, and also get people used to the new rules.)
----
These finer grained scopes do mean that some code is newly rejected by `rustc`; for example:
```rust
let mut map : HashMap<u8, &u8> = HashMap::new();
let tmp = Box::new(2);
map.insert(43, &*tmp);
```
This will now fail to compile with a message that `*tmp` does not live long enough, because the scope of `tmp` is now strictly smaller than
that of `map`, and the use of `&u8` in map's type requires that the borrowed references are all to data that live at least as long as the map.
The usual fix for a case like this is to move the binding for `tmp` up above that of `map`; note that you can still leave the initialization in the original spot, like so:
```rust
let tmp;
let mut map : HashMap<u8, &u8> = HashMap::new();
tmp = box 2;
map.insert(43, &*tmp);
```
Similarly, one can encounter an analogous situation with `Vec`: one would need to rewrite:
```rust
let mut vec = Vec::new();
let tmp = 'c';
vec.push(&tmp);
```
as:
```rust
let tmp;
let mut vec = Vec::new();
tmp = 'c';
vec.push(&tmp);
```
----
In some corner cases, it does not suffice to reorder the bindings; in particular, when the types for both bindings need to reflect exactly the *same* code extent, and a parent/child relationship between them does not work.
In pnkfelix's experience this has arisen most often when mixing uses of cyclic data structures while also allowing a lifetime parameter `'a` to flow into a type parameter context where the type is *invariant* with respect to the type parameter. An important instance of this is `arena::TypedArena<T>`, which is invariant with respect to `T`.
(The reason that variance is relevant is this: *if* `TypedArena` were covariant with respect to its type parameter, then we could assign it
the longer lifetime when it is initialized, and then convert it to a subtype (via covariance) with a shorter lifetime when necessary. But `TypedArena` is invariant with respect to its type parameter, and thus if `S` is a subtype of `T` (in particular, if `S` has a lifetime parameter that is shorter than that of `T`), then a `TypedArena<S>` is unrelated to `TypedArena<T>`.)
Concretely, consider code like this:
```rust
struct Node<'a> { sibling: Option<&'a Node<'a>> }
struct Context<'a> {
// because of this field, `Context<'a>` is invariant with respect to `'a`.
arena: &'a TypedArena<Node<'a>>,
...
}
fn new_ctxt<'a>(arena: &'a TypedArena<Node<'a>>) -> Context<'a> { ... }
fn use_ctxt<'a>(fcx: &'a Context<'a>) { ... }
let arena = TypedArena::new();
let ctxt = new_ctxt(&arena);
use_ctxt(&ctxt);
```
In these situations, if you try to introduce two bindings via two distinct `let` statements, each is (with this commit) assigned a distinct extent, and the region inference system cannot find a single region to assign to the lifetime `'a` that works for both of the bindings. So you get an error that `ctxt` does not live long enough; but moving its binding up above that of `arena` just shifts the error so now the compiler complains that `arena` does not live long enough.
* SO: What to do? The easiest fix in this case is to ensure that the two bindings *do* get assigned the same static extent, by stuffing both
bindings into the same let statement, like so:
```rust
let (arena, ctxt): (TypedArena, Context);
arena = TypedArena::new();
ctxt = new_ctxt(&arena);
use_ctxt(&ctxt);
```
----
Due to the new code restrictions outlined above, this is a ...
[breaking-change]
This new variant introduces finer-grain code extents, i.e. we now
track that a binding lives only for a suffix of a block, and
(importantly) will be dropped when it goes out of scope *before* the
bindings that occurred earlier in the block.
Both of these notions are neatly captured by marking the block (and
each suffix) as an enclosing scope of the next suffix beneath it.
This is work that is part of the foundation for issue #8861.
(It actually has been seen in earlier posted pull requests; I have
just factored it out into its own PR to ease my own rebasing.)
----
These finer grained scopes do mean that some code is newly rejected by
`rustc`; for example:
```rust
let mut map : HashMap<u8, &u8> = HashMap::new();
let tmp = Box::new(2);
map.insert(43, &*tmp);
```
This will now fail to compile with a message that `*tmp` does not live
long enough, because the scope of `tmp` is now strictly smaller than
that of `map`, and the use of `&u8` in map's type requires that the
borrowed references are all to data that live at least as long as the
map.
The usual fix for a case like this is to move the binding for `tmp`
up above that of `map`; note that you can still leave the initialization
in the original spot, like so:
```rust
let tmp;
let mut map : HashMap<u8, &u8> = HashMap::new();
tmp = box 2;
map.insert(43, &*tmp);
```
Similarly, one can encounter an analogous situation with `Vec`: one
would need to rewrite:
```rust
let mut vec = Vec::new();
let tmp = 'c';
vec.push(&tmp);
```
as:
```
let tmp;
let mut vec = Vec::new();
tmp = 'c';
vec.push(&tmp);
```
----
In some corner cases, it does not suffice to reorder the bindings; in
particular, when the types for both bindings need to reflect exactly
the *same* code extent, and a parent/child relationship between them
does not work.
In pnkfelix's experience this has arisen most often when mixing uses
of cyclic data structures while also allowing a lifetime parameter
`'a` to flow into a type parameter context where the type is
*invariant* with respect to the type parameter. An important instance
of this is `arena::TypedArena<T>`, which is invariant with respect
to `T`.
(The reason that variance is relevant is this: *if* `TypedArena` were
covariant with respect to its type parameter, then we could assign it
the longer lifetime when it is initialized, and then convert it to a
subtype (via covariance) with a shorter lifetime when necessary. But
`TypedArena` is invariant with respect to its type parameter, and thus
if `S` is a subtype of `T` (in particular, if `S` has a lifetime
parameter that is shorter than that of `T`), then a `TypedArena<S>` is
unrelated to `TypedArena<T>`.)
Concretely, consider code like this:
```rust
struct Node<'a> { sibling: Option<&'a Node<'a>> }
struct Context<'a> {
// because of this field, `Context<'a>` is invariant with respect to `'a`.
arena: &'a TypedArena<Node<'a>>,
...
}
fn new_ctxt<'a>(arena: &'a TypedArena<Node<'a>>) -> Context<'a> { ... }
fn use_ctxt<'a>(fcx: &'a Context<'a>) { ... }
let arena = TypedArena::new();
let ctxt = new_ctxt(&arena);
use_ctxt(&ctxt);
```
In these situations, if you try to introduce two bindings via two
distinct `let` statements, each is (with this commit) assigned a
distinct extent, and the region inference system cannot find a single
region to assign to the lifetime `'a` that works for both of the
bindings. So you get an error that `ctxt` does not live long enough;
but moving its binding up above that of `arena` just shifts the error
so now the compiler complains that `arena` does not live long enough.
SO: What to do? The easiest fix in this case is to ensure that the two
bindings *do* get assigned the same static extent, by stuffing both
bindings into the same let statement, like so:
```rust
let (arena, ctxt): (TypedArena, Context);
arena = TypedArena::new();
ctxt = new_ctxt(&arena);
use_ctxt(&ctxt);
```
Due to the new code rejections outlined above, this is a ...
[breaking-change]
Using `generic` as the target cpu limits the generated code to the bare basics for the arch, while we can probably assume that we'll actually be running on somewhat modern hardware. This updates the default target CPUs for the x86 and x86_64 archs to match clang's behaviour.
Refs #20777
In preparation for the I/O rejuvination of the standard library, this commit
renames the current `io` module to `old_io` in order to make room for the new
I/O modules. It is expected that the I/O RFCs will land incrementally over time
instead of all at once, and this provides a fresh clean path for new modules to
enter into as well as guaranteeing that all old infrastructure will remain in
place for some time.
As each `old_io` module is replaced it will be deprecated in-place for new
structures in `std::{io, fs, net}` (as appropriate).
This commit does *not* leave a reexport of `old_io as io` as the deprecation
lint does not currently warn on this form of use. This is quite a large breaking
change for all imports in existing code, but all functionality is retained
precisely as-is and path statements simply need to be renamed from `io` to
`old_io`.
[breaking-change]
In preparation for upcoming changes to the `Writer` trait (soon to be called
`Write`) this commit renames the current `write` method to `write_all` to match
the semantics of the upcoming `write_all` method. The `write` method will be
repurposed to return a `usize` indicating how much data was written which
differs from the current `write` semantics. In order to head off as much
unintended breakage as possible, the method is being deprecated now in favor of
a new name.
[breaking-change]
In preparation for the I/O rejuvination of the standard library, this commit
renames the current `io` module to `old_io` in order to make room for the new
I/O modules. It is expected that the I/O RFCs will land incrementally over time
instead of all at once, and this provides a fresh clean path for new modules to
enter into as well as guaranteeing that all old infrastructure will remain in
place for some time.
As each `old_io` module is replaced it will be deprecated in-place for new
structures in `std::{io, fs, net}` (as appropriate).
This commit does *not* leave a reexport of `old_io as io` as the deprecation
lint does not currently warn on this form of use. This is quite a large breaking
change for all imports in existing code, but all functionality is retained
precisely as-is and path statements simply need to be renamed from `io` to
`old_io`.
[breaking-change]
This ends up propagating all the way out to the output of dep-info which then
makes Cargo think that files are not existent (it thinks the files have quotes
in their name) when they in fact do.
Don't reallocate when capacity is already equal to length
`Vec::shrink_to_fit()` may be called on vectors that are already the
correct length. Calling out to `reallocate()` in this case is a bad idea
because there is no guarantee that `reallocate()` won't allocate a new
buffer anyway, and based on performance seen in external benchmarks, it
seems likely that it is in fact reallocating a new buffer.
Before:
test string::tests::bench_exact_size_shrink_to_fit ... bench: 45 ns/iter (+/- 2)
After:
test string::tests::bench_exact_size_shrink_to_fit ... bench: 26 ns/iter (+/- 1)
Limiting ourselves to a generic x86 instruction set doesn't seem useful.
Both users running those systems on original i386 hardware might as well
manually specify a target cpu ;-)
Clang uses the same default.
I'm beginning to suspect it's impossible to avoid accidentally writing
`#[deriving]` at least once in every program, and it results in
non-intuitive error messages: "Foo doesn't have any method in scope
`clone`" despite there being a `#[deriv...(Clone)]` attribute!
Also, lots of documentation around the internet uses `#[deriving]` so
providing this guidance is very helpful (lots of people ask in #rust
about this error).
As part of #20432, upvar checking is now moved out of regionck to its
own pass and before regionck. But regionck has some type resolution of
its own. Without them, now separated upvar checking may be tripped over
by residue `ty_infer`.
Closes#21306
Two minor improvements that have been on my TODO list for a while:
* Clang uses a size of `-1` for arrays of unknown size and we should do that too as it will tell LLVM to omit the size information in debuginfo.
* There was no LLDB test case for handling the optimized enum representation introduced by @luqmana. This PR finally adds one.
Before:
```
error: invalid operand for inline asm constraint 'i' at line 11
```
Note that 11 is not the line the inline assembly appears in.
After:
```
src/arch/x64/multiboot/bootstrap.rs:203:5: 209:9 error: invalid operand for inline asm constraint 'i'
src/arch/x64/multiboot/bootstrap.rs:203 asm! {
src/arch/x64/multiboot/bootstrap.rs:204 [multiboot => %ecx, mod attsyntax]
src/arch/x64/multiboot/bootstrap.rs:205
src/arch/x64/multiboot/bootstrap.rs:206 ljmp {size_of::<Descriptor>() => %i}, $bootstrap.64
src/arch/x64/multiboot/bootstrap.rs:207 }
src/arch/x64/multiboot/bootstrap.rs:208
...
error: aborting due to previous error
```
E.g. `fn foo() { foo() }`, or, more subtlely
impl Foo for Box<Foo+'static> {
fn bar(&self) {
self.bar();
}
}
The compiler will warn and point out the points where recursion occurs,
if it determines that the function cannot return without calling itself.
Closes#17899.
---
This is highly non-perfect, in particular, my wording above is quite precise, and I have some unresolved questions: This currently will warn about
```rust
fn foo() {
if bar { loop {} }
foo()
}
```
even though `foo` may never be called (i.e. our apparent "unconditional" recursion is actually conditional). I don't know if we should handle this case, and ones like it with `panic!()` instead of `loop` (or anything else that "returns" `!`).
However, strictly speaking, it seems to me that changing the above to not warn will require changing
```rust
fn foo() {
while bar {}
foo()
}
```
to also not warn since it could be that the `while` is an infinite loop and doesn't ever hit the `foo`.
I'm inclined to think we let these cases warn since true edge cases like the first one seem rare, and if they do occur they seem like they would usually be typos in the function call. (I could imagine someone accidentally having code like `fn foo() { assert!(bar()); foo() /* meant to be boo() */ }` which won't warn if the `loop` case is "fixed".)
(Part of the reason this is unresolved is wanting feedback, part of the reason is I couldn't devise a strategy that worked in all cases.)
---
The name `unconditional_self_calls` is kinda clunky; and this reconstructs the CFG for each function that is linted which may or may not be very expensive, I don't know.
This ends up propagating all the way out to the output of dep-info which then
makes Cargo think that files are not existent (it thinks the files have quotes
in their name) when they in fact do.
This commit relaxes the bound on `Result::unwrap` and `Result::unwrap_err` from
the `Display` trait to the `Debug` trait for generating an error message about
the unwrapping operation.
This commit is a breaking change and any breakage should be mitigated by
ensuring that `Debug` is implemented on the relevant type.
[breaking-change]
I'm beginning to suspect it's impossible to avoid accidentally writing
`#[deriving]` at least once in every program, and it results in
non-intuitive error messages: "Foo doesn't have any method in scope
`clone`" despite there being a `#[deriv...(Clone)]` attribute!
Also, lots of documentation around the internet uses `#[deriving]` so
providing this guidance is very helpful (lots of people ask in #rust
about this error).
Fixes#21166.
This adds a new lexer/parser combo for the entire Rust language can be generated with with flex and bison, taken from my project at https://github.com/bleibig/rust-grammar. There is also a testing script that runs the generated parser with all *.rs files in the repository (except for tests in compile-fail or ones that marked as "ignore-test" or "ignore-lexer-test"). If you have flex and bison installed, you can run these tests using the new "check-grammar" make target.
This does not depend on or interact with the existing testing code in the grammar, which only provides and tests a lexer specification.
OS X users should take note that the version of bison that comes with the Xcode toolchain (2.3) is too old to work with this grammar, they need to download and install version 3.0 or later.
The parser builds up an S-expression-based AST, which can be displayed by giving the "-v" argument to parser-lalr (normally it only gives output on error). It is only a rough approximation of what is parsed and doesn't capture every detail and nuance of the program.
Hopefully this should be sufficient for issue #2234, or at least a good starting point.
Per [RFC 517](https://github.com/rust-lang/rfcs/pull/575/), this commit introduces platform-native strings. The API is essentially as described in the RFC.
The WTF-8 implementation is adapted from @SimonSapin's [implementation](https://github.com/SimonSapin/rust-wtf8). To make this work, some encodign and decoding functionality in `libcore` is now exported in a "raw" fashion reusable for WTF-8. These exports are *not* reexported in `std`, nor are they stable.
Per [RFC 517](https://github.com/rust-lang/rfcs/pull/575/), this commit
introduces platform-native strings. The API is essentially as described
in the RFC.
The WTF-8 implementation is adapted from @SimonSapin's
[implementation](https://github.com/SimonSapin/rust-wtf8). To make this
work, some encodign and decoding functionality in `libcore` is now
exported in a "raw" fashion reusable for WTF-8. These exports are *not*
reexported in `std`, nor are they stable.