The implementation essentially desugars during type collection and AST
type conversion time into the parameter scheme we have now. Only fully
qualified names--e.g. `<T as Foo>::Bar`--are supported.
This patch does not make many functional changes, but does a lot of restructuring towards the goals of #5527. This is the biggest patch, basically, that should enable most of the other patches in a relatively straightforward way.
Major changes:
- Do not track impls through trans, instead recompute as needed.
- Isolate trait matching code into its own module, carefully structure to distinguish various phases (selection vs confirmation vs fulfillment)
- Consider where clauses in their more general form
- Integrate checking of builtin bounds into the trait matching process, rather than doing it separately in kind.rs (important for opt-in builtin bounds)
What is not included:
- Where clauses are still not generalized. This should be a straightforward follow-up patch.
- Caching. I did not include much caching. I have plans for various kinds of caching we can do. Should be straightforward. Preliminary perf measurements suggested that this branch keeps compilation times roughly what they are.
- Method resolution. The initial algorithm I proposed for #5527 does not work as well as I hoped. I have a revised plan which is much more similar to what we do today.
- Deref vs deref-mut. The initial fix I had worked great for autoderef, but not for explicit deref.
- Permitting blanket impls to overlap with specific impls. Initial plan to consider all nested obligations before considering an impl to match caused many compilation errors. We have a revised plan but it is not implemented here, should be a relatively straightforward extension.
I would like to map this information back to AST nodes, so that we can print remarks with spans, and so that remarks can be enabled on a per-function basis. Unfortunately, doing this would require a lot more code restructuring — for example, we currently throw away the AST map and lots of other information before LLVM optimizations run. So for the time being, we print the remarks with debug location strings from LLVM. There's a warning if you use `-C remark` without `--debuginfo`.
Fixes#17116.
`Box<[T]>` is created by allocating `Box<[T, ..n]>` and coercing it so
this code path is never used. It's also broken because it clamps the
capacity of the memory allocations to 4 elements and that's incompatible
with sized deallocation. This dates back to when `~[T]` was a growable
vector type implemented as:
*{ { tydesc, ref_count, prev, next }, { length, capacity, data[] } }
Since even empty vectors had to allocate, it started off the capacity of
all vectors at 4 as a heuristic. It's not possible to grow `Box<[T]>`
and there is no need for a memory allocation when it's empty, so it
would be a terrible heuristic today even if it worked.
`Box<[T]>` is created by allocating `Box<[T, ..n]>` and coercing it so
this code path is never used. It's also broken because it clamps the
capacity of the memory allocations to 4 elements and that's incompatible
with sized deallocation. This dates back to when `~[T]` was a growable
vector type implemented as:
*{ { tydesc, ref_count, prev, next }, { length, capacity, data[] } }
Since even empty vectors had to allocate, it started off the capacity of
all vectors at 4 as a heuristic. It's not possible to grow `Box<[T]>`
and there is no need for a memory allocation when it's empty, so it
would be a terrible heuristic today even if it worked.
The pointer in the slice must not be null, because enum representations
make that assumption. The `exchange_malloc` function returns a non-null
sentinel for the zero size case, and it must not be passed to the
`exchange_free` lang item.
Since the length is always equal to the true capacity, a branch on the
length is enough for most types. Slices of zero size types are
statically special cased to never attempt deallocation. This is the same
implementation as `Vec<T>`.
Closes#14395
This allows code to access the fields of tuples and tuple structs:
let x = (1i, 2i);
assert_eq!(x.1, 2);
struct Point(int, int);
let origin = Point(0, 0);
assert_eq!(origin.0, 0);
assert_eq!(origin.1, 0);
The pointer in the slice must not be null, because enum representations
make that assumption. The `exchange_malloc` function returns a non-null
sentinel for the zero size case, and it must not be passed to the
`exchange_free` lang item.
Since the length is always equal to the true capacity, a branch on the
length is enough for most types. Slices of zero size types are
statically special cased to never attempt deallocation. This is the same
implementation as `Vec<T>`.
Closes#14395
A match in callee.rs was recognizing some foreign fns as named tuple constructors. A reproducible test case for this is nearly impossible since it depends on the way NodeIds happen to be assigned in different crates.
Fixes#15913
Adjust the handling of `#[inline]` items so that they get translated into every
compilation unit that uses them. This is necessary to preserve the semantics
of `#[inline(always)]`.
Crate-local `#[inline]` functions and statics are blindly translated into every
compilation unit. Cross-crate inlined items and monomorphizations of
`#[inline]` functions are translated the first time a reference is seen in each
compilation unit. When using multiple compilation units, inlined items are
given `available_externally` linkage whenever possible to avoid duplicating
object code.
Add a post-processing pass to `trans` that converts symbols from external to
internal when possible. Translation with multiple compilation units initially
makes most symbols external, since it is not clear when translating a
definition whether that symbol will need to be accessed from another
compilation unit. This final pass internalizes symbols that are not reachable
from other crates and not referenced from other compilation units, so that LLVM
can perform more aggressive optimizations on those symbols.
Use a shared lookup table of previously-translated monomorphizations/glue
functions to avoid translating those functions in every compilation unit where
they're used. Instead, the function will be translated in whichever
compilation unit uses it first, and the remaining compilation units will link
against that original definition.
Rotate between compilation units while translating. The "worker threads"
commit added support for multiple compilation units, but only translated into
one, leaving the rest empty. With this commit, `trans` rotates between various
compilation units while translating, using a simple stragtegy: upon entering a
module, switch to translating into whichever compilation unit currently
contains the fewest LLVM instructions.
Most of the actual changes here involve getting symbol linkage right, so that
items translated into different compilation units will link together properly
at the end.
When inlining an item from another crate, use the original symbol from that
crate's metadata instead of generating a new symbol using the `ast::NodeId` of
the inlined copy. This requires exporting symbols in the crate metadata in a
few additional cases. Having predictable symbols for inlined items will be
useful later to avoid generating duplicate object code for inlined items.
Refactor the code in `llvm::back` that invokes LLVM optimization and codegen
passes so that it can be called from worker threads. (Previously, it used
`&Session` extensively, and `Session` is not `Share`.) The new code can handle
multiple compilation units, by compiling each unit to `crate.0.o`, `crate.1.o`,
etc., and linking together all the `crate.N.o` files into a single `crate.o`
using `ld -r`. The later linking steps can then be run unchanged.
The new code preserves the behavior of `--emit`/`-o` when building a single
compilation unit. With multiple compilation units, the `--emit=asm/ir/bc`
options produce multiple files, so combinations like `--emit=ir -o foo.ll` will
not actually produce `foo.ll` (they instead produce several `foo.N.ll` files).
The new code supports `-Z lto` only when using a single compilation unit.
Compiling with multiple compilation units and `-Z lto` will produce an error.
(I can't think of any good reason to do such a thing.) Linking with `-Z lto`
against a library that was built as multiple compilation units will also fail,
because the rlib does not contain a `crate.bytecode.deflate` file. This could
be supported in the future by linking together the `crate.N.bc` files produced
when compiling the library into a single `crate.bc`, or by making the LTO code
support multiple `crate.N.bytecode.deflate` files.
Break up `CrateContext` into `SharedCrateContext` and `LocalCrateContext`. The
local piece corresponds to a single compilation unit, and contains all
LLVM-related components. (LLVM data structures are tied to a specific
`LLVMContext`, and we will need separate `LLVMContext`s to safely run
multithreaded optimization.) The shared piece contains data structures that
need to be shared across all compilation units, such as the `ty::ctxt` and some
tables related to crate metadata.
They were only correct in the simplest case. Some of the optimisations
are certainly possible but should be introduced carefully and only
when the whole pattern codegen infrastructure is in a better shape.
Fixes#16648.