diff --git a/compiler/rustc_middle/src/mir/mod.rs b/compiler/rustc_middle/src/mir/mod.rs
index 578fcd82ad6..394bc12f015 100644
--- a/compiler/rustc_middle/src/mir/mod.rs
+++ b/compiler/rustc_middle/src/mir/mod.rs
@@ -1785,8 +1785,98 @@ pub struct CopyNonOverlapping<'tcx> {
 ///////////////////////////////////////////////////////////////////////////
 // Places
 
-/// A path to a value; something that can be evaluated without
-/// changing or disturbing program state.
+/// Places roughly correspond to a "location in memory." Places in MIR are the same mathematical
+/// object as places in Rust. This of course means that what exactly they are is undecided and part
+/// of the Rust memory model. However, they will likely contain at least the following three pieces
+/// of information in some form:
+///
+/// 1. The part of memory that is referred to (see discussion below for details).
+/// 2. The type of the place and an optional variant index. See [`PlaceTy`][tcx::PlaceTy].
+/// 3. The provenance with which the place is being accessed.
+///
+/// We'll give a description below of how the first two of these three properties are computed for a
+/// place. We cannot give a description of the provenance, because that is part of the undecided
+/// aliasing model - we only include it here at all to acknowledge its existence.
+///
+/// For a place that has no projections, i.e. `Place { local, projection: [] }`, the part of memory is
+/// the local's full allocation and the type is the type of the local. For any other place, we
+/// define the values as a function of the parent place, that is, the place with its last
+/// [`ProjectionElem`] stripped. The way this is computed of course depends on the kind of that last
+/// projection element:
+///
+/// - [`Downcast`](ProjectionElem::Downcast): This projection sets the place's variant index to the
+///   given one, and makes no other changes. A `Downcast` projection on a place with its variant
+///   index already set is not well-formed.
+/// - [`Field`](ProjectionElem::Field): `Field` projections take their parent place and create a
+///   place referring to one of the fields of the type. The referred to place in memory is where
+///   the layout places the field. The type becomes the type of the field.
+///
+///   These projections are only legal for tuples, ADTs, closures, and generators. If the ADT or
+///   generator has more than one variant, the parent place's variant index must be set, indicating
+///   which variant is being used. If it has just one variant, the variant index may or may not be
+///   included - the single possible variant is inferred if it is not included.
+/// - [`ConstantIndex`](ProjectionElem::ConstantIndex): Computes an offset in units of `T` into the
+///   place as described in the documentation for the `ProjectionElem`. The resulting part of
+///   memory is the location of that element of the array/slice, and the type is `T`. This is only
+///   legal if the parent place has type `[T; N]` or `[T]` (*not* `&[T]`).
+/// - [`Subslice`](ProjectionElem::Subslice): Much like `ConstantIndex`. It is also only legal on
+///   `[T; N]` and `[T]`. However, this yields a `Place` of type `[T]`, and may refer to more than
+///   one element in the parent place.
+/// - [`Index`](ProjectionElem::Index): Like `ConstantIndex`, only legal on `[T; N]` or `[T]`.
+///   However, `Index` additionally takes a local from which the value of the index is computed at
+///   runtime. Computing the value of the index involves interpreting the `Local` as a
+///   `Place { local, projection: [] }`, and then computing its value as if done via
+///   [`Operand::Copy`]. The array/slice is then indexed with the resulting value. The local must
+///   have type `usize`.
+/// - [`Deref`](ProjectionElem::Deref): Derefs are the last type of projection, and the most
+///   complicated. They are only legal on parent places that are references, pointers, or `Box`. A
+///   `Deref` projection begins by creating a value from the parent place, as if by
+///   [`Operand::Copy`]. It then dereferences the resulting pointer, creating a place of the
+///   pointed to type.
+///
+/// **Needs clarification**: What about metadata resulting from dereferencing wide pointers (and
+/// possibly from accessing unsized locals - not sure how those work)? That probably deserves to go
+/// on the list above and be discussed too. It is also probably necessary for making the indexing
+/// stuff less hand-wavy.
+///
+/// **Needs clarification**: When it says "part of memory" what does that mean precisely, and how
+/// does it interact with the metadata?
+///
+/// One possible model that I believe makes sense is that "part of memory" is actually just the
+/// address of the beginning of the referred to range of bytes. For sized types, the size of the
+/// range is then stored in the type, and for unsized types it's stored (possibly indirectly,
+/// through a vtable) in the metadata.
+///
+/// Alternatively, the "part of memory" could be a whole range of bytes. This initially seemed more
+/// natural to me, but seems like it falls apart after a little bit.
+///
+/// More likely though, we should call this detail a part of the Rust memory model and let that deal
+/// with the precise definition of this part of a place. If we feel strongly, I don't think we *have
+/// to* though. MIR places are more flexible than Rust places, and we might be able to make a
+/// decision on the flexible parts without semi-stabilizing the source language. (end NC)
+///
+/// Computing a place may be UB - this is certainly the case with dereferencing, which requires
+/// sufficient provenance, but it may additionally be the case for some of the other field
+/// projections.
+///
+/// It is undecided when this UB kicks in. As best I can tell that is the question being discussed
+/// in [UCG#319]. Summarizing from that thread, I believe the options are:
+///
+/// [UCG#319]: https://github.com/rust-lang/unsafe-code-guidelines/issues/319
+///
+/// 1. Each intermediate place must have provenance for the whole part of memory it refers to. This
+///    is the status quo.
+/// 2. Only for intermediate places where the last projection was *not* a deref. This corresponds to
+///    "Check inbounds on place projection".
+/// 3. Only on place to value conversions, assignments, and referencing operations. This corresponds
+///    to "remove the restrictions from `*` entirely."
+/// 4. On each intermediate place if the place is used for a place to value conversion as part of
+///    an assignment or it is used for a referencing operation. For a raw pointer
+///    computation, never. This corresponds to "magic?".
+///
+/// Hopefully I am not misrepresenting anyone's opinions - please let me know if I am. Currently,
+/// Rust chooses option 1. This is checked by Miri and taken advantage of by codegen (via `gep
+/// inbounds`). That is possibly subject to change.
 #[derive(Copy, Clone, PartialEq, Eq, Hash, TyEncodable, HashStable)]
 pub struct Place<'tcx> {
     pub local: Local,
@@ -2155,24 +2245,42 @@ pub struct SourceScopeLocalData {
 ///////////////////////////////////////////////////////////////////////////
 // Operands
 
-/// These are values that can appear inside an rvalue. They are intentionally
-/// limited to prevent rvalues from being nested in one another.
+/// An operand in MIR represents a "value" in Rust, the definition of which is undecided and part of
+/// the memory model. One proposal for a definition of values can be found [on UCG][value-def].
+///
+/// [value-def]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/value-domain.md
+///
+/// The most common way to create values is via a place to value conversion. A place to value
+/// conversion is an operation which reads the memory of the place and converts it to a value. This
+/// is a fundamentally *typed* operation. Different types will do different things. These are some
+/// possible examples of what Rust may - but will not necessarily - decide to do on place to value
+/// conversions:
+///
+/// 1. Types with validity constraints cause UB if the validity constraint is not met.
+/// 2. References/pointers may have their provenance change or cause other provenance related
+///    side-effects.
+///
+/// A place to value conversion on a place that has its variant index set is not well-formed.
+/// However, note that this rule only applies to places appearing in MIR bodies. Many functions,
+/// such as [`Place::ty`], still accept such a place. If you write a function for which it might be
+/// ambiguous whether such a thing is accepted, make sure to document your choice clearly.
 #[derive(Clone, PartialEq, TyEncodable, TyDecodable, Hash, HashStable)]
 pub enum Operand<'tcx> {
-    /// Copy: The value must be available for use afterwards.
-    ///
-    /// This implies that the type of the place must be `Copy`; this is true
-    /// by construction during build, but also checked by the MIR type checker.
+    /// Creates a value by performing a place to value conversion at the given place. The type of
+    /// the place must be `Copy`.
     Copy(Place<'tcx>),
 
-    /// Move: The value (including old borrows of it) will not be used again.
+    /// Creates a value by performing a place to value conversion for the place, just like the
+    /// `Copy` operand.
     ///
-    /// Safe for values of all types (modulo future developments towards `?Move`).
-    /// Correct usage patterns are enforced by the borrow checker for safe code.
-    /// `Copy` may be converted to `Move` to enable "last-use" optimizations.
+    /// This *may* additionally overwrite the place with `uninit` bytes, depending on how we decide
+    /// in [UCG#188]. You should not emit MIR that may attempt a subsequent second place to value
+    /// conversion on this place without first re-initializing it.
+    ///
+    /// [UCG#188]: https://github.com/rust-lang/unsafe-code-guidelines/issues/188
     Move(Place<'tcx>),
 
-    /// Synthesizes a constant value.
+    /// Constants are already semantically values, and remain unchanged.
     Constant(Box<Constant<'tcx>>),
 }
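
Review note: the rules in the new `Place` docs for computing a place's type and variant index from its parent place can be sketched as a small toy model. This is only an illustration under stated assumptions, not the compiler's implementation: `Ty`, `PlaceTy`, `Proj`, `project`, and `place_ty` are all hypothetical names, the model ignores memory, provenance, and pointer metadata, and it assumes `Downcast` is only applied to ADTs, which the docs do not state.

```rust
// Toy model only: every type and function here is hypothetical, not the rustc_middle API.
// It tracks the type and variant index of a place and ignores memory and provenance.

#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Usize,
    Array(Box<Ty>, usize), // [T; N]
    Slice(Box<Ty>),        // [T]
    Ptr(Box<Ty>),          // stands in for references, raw pointers, and Box
    Tuple(Vec<Ty>),
    Adt(Vec<Vec<Ty>>),     // one list of field types per variant
}

#[derive(Clone, Debug, PartialEq)]
struct PlaceTy {
    ty: Ty,
    variant: Option<usize>,
}

#[allow(dead_code)]
enum Proj {
    Downcast(usize),
    Field(usize),
    ConstantIndex,
    Subslice,
    Index(Ty), // carries the type of the index local, which must be usize
    Deref,
}

fn plain(ty: Ty) -> Result<PlaceTy, String> {
    Ok(PlaceTy { ty, variant: None })
}

fn field(fields: &[Ty], idx: usize) -> Result<PlaceTy, String> {
    match fields.get(idx) {
        Some(ty) => plain(ty.clone()),
        None => Err(format!("no field {}", idx)),
    }
}

/// Computes the `PlaceTy` of a place from the `PlaceTy` of its parent place.
fn project(parent: &PlaceTy, proj: &Proj) -> Result<PlaceTy, String> {
    match (proj, &parent.ty) {
        // Assumption of this sketch: `Downcast` is only applied to ADTs.
        (Proj::Downcast(v), Ty::Adt(variants)) => {
            if parent.variant.is_some() {
                return Err("variant index already set".to_string());
            }
            if *v >= variants.len() {
                return Err(format!("no variant {}", v));
            }
            Ok(PlaceTy { ty: parent.ty.clone(), variant: Some(*v) })
        }
        (Proj::Field(f), Ty::Tuple(fields)) => field(fields, *f),
        (Proj::Field(f), Ty::Adt(variants)) => {
            let v = match (parent.variant, variants.len()) {
                (Some(v), _) => v,
                (None, 1) => 0, // the single possible variant is inferred
                (None, _) => return Err("variant index must be set".to_string()),
            };
            match variants.get(v) {
                Some(fields) => field(fields, *f),
                None => Err(format!("no variant {}", v)),
            }
        }
        (Proj::ConstantIndex, Ty::Array(t, _) | Ty::Slice(t)) => plain((**t).clone()),
        (Proj::Index(idx), Ty::Array(t, _) | Ty::Slice(t)) => {
            if *idx != Ty::Usize {
                return Err("index local must have type usize".to_string());
            }
            plain((**t).clone())
        }
        // Following the doc comment, a subslice always has type `[T]` in this model.
        (Proj::Subslice, Ty::Array(t, _) | Ty::Slice(t)) => plain(Ty::Slice(t.clone())),
        (Proj::Deref, Ty::Ptr(t)) => plain((**t).clone()),
        _ => Err("projection is not legal on this parent type".to_string()),
    }
}

/// Starts from the local's type and applies each projection in order.
fn place_ty(local_ty: Ty, projection: &[Proj]) -> Result<PlaceTy, String> {
    let mut cur = PlaceTy { ty: local_ty, variant: None };
    for p in projection {
        cur = project(&cur, p)?;
    }
    Ok(cur)
}

fn main() {
    // A two-variant ADT whose second variant holds a pointer to `[usize; 4]`.
    let arr = Ty::Array(Box::new(Ty::Usize), 4);
    let adt = Ty::Adt(vec![vec![], vec![Ty::Ptr(Box::new(arr))]]);
    let path = [Proj::Downcast(1), Proj::Field(0), Proj::Deref, Proj::ConstantIndex];
    println!("{:?}", place_ty(adt.clone(), &path));
    // A `Field` projection on a multi-variant ADT without a prior `Downcast` is rejected.
    println!("{:?}", place_ty(adt, &[Proj::Field(0)]));
    // Tuples need no variant index.
    println!("{:?}", place_ty(Ty::Tuple(vec![Ty::Usize]), &[Proj::Field(0)]));
}
```

Returning `Result` here is a choice of the sketch: the docs describe ill-formed MIR rather than a recoverable error, so a real checker would more likely report such places through MIR validation.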