bors ccd238309f Auto merge of #67020 - pnkfelix:issue-59535-accumulate-past-lto-imports, r=mw
save LTO import info and check it when trying to reuse build products

Fix #59535

Previous runs of LTO optimization on the previous incremental build can import larger portions of the dependence graph into a codegen unit than the current compilation run is choosing to import. We need to take that into account when we choose to reuse PostLTO-optimization object files from previous compiler invocations.

This PR accomplishes that by serializing the LTO import information on each incremental build. We load up the previous LTO import data as well as the current LTO import data. Then as we decide whether to reuse previous PostLTO objects or redo LTO optimization, we check whether the LTO import data matches. After we finish with this decision process for every object, we write the LTO import data back to disk.

----

What is the scenario where comparing against past LTO import information is necessary?

I've tried to capture it in the comments in the regression test, but here's yet another attempt from me to summarize the situation:

 1. Consider a call-graph like `[A] -> [B -> D] <- [C]` (where the letters are functions and the modules are enclosed in `[]`)
 2. In our specific instance, the earlier compilations were inlining the call to`B` into `A`; thus `A` ended up with a external reference to the symbol `D` in its object code, to be resolved at subsequent link time. The LTO import information provided by LLVM for those runs reflected that information: it explicitly says during those runs, `B` definition and `D` declaration were imported into `[A]`.
 3. The change between incremental builds was that the call `D <- C` was removed.
 4. That change, coupled with other decisions within `rustc`, made the compiler decide to make `D` an internal symbol (since it was no longer accessed from other codegen units, this makes sense locally). And then the definition of `D` was inlined into `B` and `D` itself was eliminated entirely.
  5. The current LTO import information reported that `B` alone is imported into `[A]` for the *current compilation*. So when the Rust compiler surveyed the dependence graph, it determined that nothing `[A]` imports changed since the last build (and `[A]` itself has not changed either), so it chooses to reuse the object code generated during the previous compilation.
  6. But that previous object code has an unresolved reference to `D`, and that causes a link time failure!

----

The interesting thing is that its quite hard to actually observe the above scenario arising, which is probably why no one has noticed this bug in the year or so since incremental LTO support landed (PR #53673).

I've literally spent days trying to observe the bug on my local machine, but haven't managed to find the magic combination of factors to get LLVM and `rustc` to do just the right set of the inlining and `internal`-reclassification choices that cause this particular problem to arise.

----

Also, I have tried to be careful about injecting new bugs with this PR. Specifically, I was/am worried that we could get into a scenario where overwriting the current LTO import data with past LTO import data would cause us to "forget" a current import. ~~To guard against this, the PR as currently written always asserts, at overwrite time, that the past LTO import-set is a *superset* of the current LTO import-set. This way, the overwriting process should always be safe to run.~~
 * The previous note was written based on the first version of this PR. It has since been revised to use a simpler strategy, where we never attempt to merge the past LTO import information into the current one. We just *compare* them, and act accordingly.
 * Also, as you can see from the comments on the PR itself, I was quite right to be worried about forgetting past imports; that scenario was observable via a trivial transformation of the regression test I had devised.
2019-12-20 20:56:09 +00:00

The Rust Programming Language

This is the main source code repository for Rust. It contains the compiler, standard library, and documentation.

Quick Start

Read "Installation" from The Book.

Installing from Source

Note: If you wish to contribute to the compiler, you should read this chapter of the rustc-guide instead of this section.

The Rust build system has a Python script called x.py to bootstrap building the compiler. More information about it may be found by running ./x.py --help or reading the rustc guide.

Building on *nix

  1. Make sure you have installed the dependencies:

    • g++ 5.1 or later or clang++ 3.5 or later
    • python 2.7 (but not 3.x)
    • GNU make 3.81 or later
    • cmake 3.4.3 or later
    • curl
    • git
    • ssl which comes in libssl-dev or openssl-devel
    • pkg-config if you are compiling on Linux and targeting Linux
  2. Clone the source with git:

    $ git clone https://github.com/rust-lang/rust.git
    $ cd rust
    
  1. Configure the build settings:

    The Rust build system uses a file named config.toml in the root of the source tree to determine various configuration settings for the build. Copy the default config.toml.example to config.toml to get started.

    $ cp config.toml.example config.toml
    

    It is recommended that if you plan to use the Rust build system to create an installation (using ./x.py install) that you set the prefix value in the [install] section to a directory that you have write permissions.

    Create install directory if you are not installing in default directory

  2. Build and install:

    $ ./x.py build && ./x.py install
    

    When complete, ./x.py install will place several programs into $PREFIX/bin: rustc, the Rust compiler, and rustdoc, the API-documentation tool. This install does not include Cargo, Rust's package manager. To build and install Cargo, you may run ./x.py install cargo or set the build.extended key in config.toml to true to build and install all tools.

Building on Windows

There are two prominent ABIs in use on Windows: the native (MSVC) ABI used by Visual Studio, and the GNU ABI used by the GCC toolchain. Which version of Rust you need depends largely on what C/C++ libraries you want to interoperate with: for interop with software produced by Visual Studio use the MSVC build of Rust; for interop with GNU software built using the MinGW/MSYS2 toolchain use the GNU build.

MinGW

MSYS2 can be used to easily build Rust on Windows:

  1. Grab the latest MSYS2 installer and go through the installer.

  2. Run mingw32_shell.bat or mingw64_shell.bat from wherever you installed MSYS2 (i.e. C:\msys64), depending on whether you want 32-bit or 64-bit Rust. (As of the latest version of MSYS2 you have to run msys2_shell.cmd -mingw32 or msys2_shell.cmd -mingw64 from the command line instead)

  3. From this terminal, install the required tools:

    # Update package mirrors (may be needed if you have a fresh install of MSYS2)
    $ pacman -Sy pacman-mirrors
    
    # Install build tools needed for Rust. If you're building a 32-bit compiler,
    # then replace "x86_64" below with "i686". If you've already got git, python,
    # or CMake installed and in PATH you can remove them from this list. Note
    # that it is important that you do **not** use the 'python2' and 'cmake'
    # packages from the 'msys2' subsystem. The build has historically been known
    # to fail with these packages.
    $ pacman -S git \
                make \
                diffutils \
                tar \
                mingw-w64-x86_64-python2 \
                mingw-w64-x86_64-cmake \
                mingw-w64-x86_64-gcc
    
  4. Navigate to Rust's source code (or clone it), then build it:

    $ ./x.py build && ./x.py install
    

MSVC

MSVC builds of Rust additionally require an installation of Visual Studio 2017 (or later) so rustc can use its linker. The simplest way is to get the Visual Studio, check the “C++ build tools” and “Windows 10 SDK” workload.

(If you're installing cmake yourself, be careful that “C++ CMake tools for Windows” doesn't get included under “Individual components”.)

With these dependencies installed, you can build the compiler in a cmd.exe shell with:

> python x.py build

Currently, building Rust only works with some known versions of Visual Studio. If you have a more recent version installed the build system doesn't understand then you may need to force rustbuild to use an older version. This can be done by manually calling the appropriate vcvars file before running the bootstrap.

> CALL "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat"
> python x.py build

Building rustc with older host toolchains

It is still possible to build Rust with the older toolchain versions listed below, but only if the LLVM_TEMPORARILY_ALLOW_OLD_TOOLCHAIN option is set to true in the config.toml file.

  • Clang 3.1
  • Apple Clang 3.1
  • GCC 4.8
  • Visual Studio 2015 (Update 3)

Toolchain versions older than what is listed above cannot be used to build rustc.

Specifying an ABI

Each specific ABI can also be used from either environment (for example, using the GNU ABI in PowerShell) by using an explicit build triple. The available Windows build triples are:

  • GNU ABI (using GCC)
    • i686-pc-windows-gnu
    • x86_64-pc-windows-gnu
  • The MSVC ABI
    • i686-pc-windows-msvc
    • x86_64-pc-windows-msvc

The build triple can be specified by either specifying --build=<triple> when invoking x.py commands, or by copying the config.toml file (as described in Installing From Source), and modifying the build option under the [build] section.

Configure and Make

While it's not the recommended build system, this project also provides a configure script and makefile (the latter of which just invokes x.py).

$ ./configure
$ make && sudo make install

When using the configure script, the generated config.mk file may override the config.toml file. To go back to the config.toml file, delete the generated config.mk file.

Building Documentation

If youd like to build the documentation, its almost the same:

$ ./x.py doc

The generated documentation will appear under doc in the build directory for the ABI used. I.e., if the ABI was x86_64-pc-windows-msvc, the directory will be build\x86_64-pc-windows-msvc\doc.

Notes

Since the Rust compiler is written in Rust, it must be built by a precompiled "snapshot" version of itself (made in an earlier stage of development). As such, source builds require a connection to the Internet, to fetch snapshots, and an OS that can execute the available snapshot binaries.

Snapshot binaries are currently built and tested on several platforms:

Platform / Architecture x86 x86_64
Windows (7, 8, 10, ...)
Linux (2.6.18 or later)
macOS (10.7 Lion or later)

You may find that other platforms work, but these are our officially supported build environments that are most likely to work.

There is more advice about hacking on Rust in CONTRIBUTING.md.

Getting Help

The Rust community congregates in a few places:

Contributing

To contribute to Rust, please see CONTRIBUTING.

Most real-time collaboration happens in a variety of channels on the Rust Discord server, with channels dedicated for getting help, community, documentation, and all major contribution areas in the Rust ecosystem. A good place to ask for help would be the #help channel.

The rustc guide might be a good place to start if you want to find out how various parts of the compiler work.

Also, you may find the rustdocs for the compiler itself useful.

License

Rust is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.

See LICENSE-APACHE, LICENSE-MIT, and COPYRIGHT for details.

Trademark

The Rust programming language is an open source, community project governed by a core team. It is also sponsored by the Mozilla Foundation (“Mozilla”), which owns and protects the Rust and Cargo trademarks and logos (the “Rust Trademarks”).

If you want to use these names or brands, please read the media guide.

Third-party logos may be subject to third-party copyrights and trademarks. See Licenses for details.

Description
No description provided
Readme 1.4 GiB
Languages
Rust 96.2%
RenderScript 0.7%
JavaScript 0.6%
Shell 0.6%
Fluent 0.4%
Other 1.3%