Previously the docs suggested that '❤️' doesn't fit in a char because
it's 6 bytes. But that's misleading. 'a̚' also doesn't fit in a char,
even though it's only 3 bytes. The important thing is the number of code
points, not the number of bytes. Clarify the primitive char docs around
this.
The first time I read the docs for `insert()`, I thought it was saying it didn't update existing *values*, and I was confused. Reword the docs to make it clear that `insert()` does update values.
Because we no longer use `GetFileAttributesExW` FileAttr is never created directly from `WIN32_FILE_ATTRIBUTE_DATA` anymore. So we should no longer store FileAttr's attributes in that c struct.
r? @alexcrichton
Is this what you had in mind?
This follows the pattern already used for stat functions from #31551. Now
`ftruncate`, `lseek`, and `readdir_r` use their explicit 64-bit variants for
LFS support, using wider `off_t` and `dirent` types. This also updates to
`open64`, which uses no different types but implies the `O_LARGEFILE` flag.
Non-Linux platforms just map their normal functions to the 64-bit names.
r? @alexcrichton
The first time I read the docs for `insert()`, I thought it was saying
it didn't update existing *values*, and I was confused. Reword the docs
to make it clear that `insert()` does update values.
Previously the docs suggested that '❤️' doesn't fit in a char because
it's 6 bytes. But that's misleading. 'a̚' also doesn't fit in a char,
even though it's only 3 bytes. The important thing is the number of code
points, not the number of bytes. Clarify the primitive char docs around
this.
Because we no longer use `GetFileAttributesExW` FileAttr is never created
directly from `WIN32_FILE_ATTRIBUTE_DATA` anymore.
So we should no longer store FileAttr's attributes in that c struct.
This commit is an implementation of [RFC 1415][rfc] which deprecates all types
in the `std::os::*::raw` modules.
[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/1415-trim-std-os.md
Many of the types in these modules don't actually have a canonical platform
representation, for example the definition of `stat` on 32-bit Linux will change
depending on whether C code is compiled with LFS support or not. Unfortunately
the current types in `std::os::*::raw` are billed as "compatible with C", which
in light of this means it isn't really possible.
To make matters worse, platforms like Android sometimes define these types as
*smaller* than the way they're actually represented in the `stat` structure
itself. This means that when methods like `DirEntry::ino` are called on Android
the result may be truncated as we're tied to returning a `ino_t` type, not the
underlying type.
The commit here incorporates two backwards-compatible components:
* Deprecate all `raw` types that aren't in `std::os::raw`
* Expand the `std::os::*::fs::MetadataExt` trait on all platforms for method
accessors of all fields. The fields now returned widened types which are the
same across platforms (consistency across platforms is not required, however,
it's just convenient).
and two also backwards-incompatible components:
* Change the definition of all `std::os::*::raw` type aliases to
correspond to the newly widened types that are being returned on each
platform.
* Change the definition of `std::os::*::raw::stat` on Linux to match the LFS
definitions rather than the standard ones.
The breaking changes here will specifically break code that assumes that `libc`
and `std` agree on the definition of `std::os::*::raw` types, or that the `std`
types are faithful representations of the types in C. An [audit] has been
performed of crates.io to determine the fallout which was determined two be
minimal, with the two found cases of breakage having been fixed now.
[audit]: https://github.com/rust-lang/rfcs/pull/1415#issuecomment-180645582
---
Ok, so after all that, we're finally able to support LFS on Linux! This commit
then simultaneously starts using `stat64` and friends on Linux to ensure that we
can open >4GB files on 32-bit Linux. Yay!
Closes#28978Closes#30050Closes#31549
This commit is an implementation of [RFC 1415][rfc] which deprecates all types
in the `std::os::*::raw` modules.
[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/1415-trim-std-os.md
Many of the types in these modules don't actually have a canonical platform
representation, for example the definition of `stat` on 32-bit Linux will change
depending on whether C code is compiled with LFS support or not. Unfortunately
the current types in `std::os::*::raw` are billed as "compatible with C", which
in light of this means it isn't really possible.
To make matters worse, platforms like Android sometimes define these types as
*smaller* than the way they're actually represented in the `stat` structure
itself. This means that when methods like `DirEntry::ino` are called on Android
the result may be truncated as we're tied to returning a `ino_t` type, not the
underlying type.
The commit here incorporates two backwards-compatible components:
* Deprecate all `raw` types that aren't in `std::os::raw`
* Expand the `std::os::*::fs::MetadataExt` trait on all platforms for method
accessors of all fields. The fields now returned widened types which are the
same across platforms (consistency across platforms is not required, however,
it's just convenient).
and two also backwards-incompatible components:
* Change the definition of all `std::os::*::raw` type aliases to
correspond to the newly widened types that are being returned on each
platform.
* Change the definition of `std::os::*::raw::stat` on Linux to match the LFS
definitions rather than the standard ones.
The breaking changes here will specifically break code that assumes that `libc`
and `std` agree on the definition of `std::os::*::raw` types, or that the `std`
types are faithful representations of the types in C. An [audit] has been
performed of crates.io to determine the fallout which was determined two be
minimal, with the two found cases of breakage having been fixed now.
[audit]: https://github.com/rust-lang/rfcs/pull/1415#issuecomment-180645582
---
Ok, so after all that, we're finally able to support LFS on Linux! This commit
then simultaneously starts using `stat64` and friends on Linux to ensure that we
can open >4GB files on 32-bit Linux. Yay!
Closes#28978Closes#30050Closes#31549
Fix `read_link` to also be able to read the target of junctions on Windows.
Also the path returned should not include a NT namespace, and there were
some problems with permissions.
This series of commits adds the initial implementation of a new build system for
the compiler and standard library based on Cargo. The high-level architecture
now looks like:
1. The `./configure` script is run with `--enable-rustbuild` and other standard
configuration options.
2. A `Makefile` is generate which proxies commands to the new build system.
3. The new build system has a Python script entry point which manages
downloading both a Rust and Cargo nightly. This initial script also manages
building the build system itself (which is written in Rust).
4. The build system, written in rust and called `bootstrap`, architects how to
call `cargo` and manages building all native libraries and such.
One might reasonably ask "why rewrite the build system?", which is a good
question! The Rust project has used Makefiles for as long as I can remember at
least, and while ugly and difficult to use are undeniably robust as they contain
years worth of tweaking and tuning for working on as many platforms in as many
situation as possible. The rationale behind this PR, however is:
* The makefiles are impenetrable to all but a few people on this
planet. This means that contributions to the build system are almost
nonexistent, and furthermore if a build system change is needed it's
incredibly difficult to figure out how to do so. This hindrance prevents us
from doing some "perhaps fancier" things we may wish to do in make.
* Our build system, while portable, is unfortunately not infinitely portable
everywhere. For example the recently-introduced MSVC target is quite unlikely
to have `make` installed by default (e.g. it requires building inside of an
MSYS2 shell currently). Conversely, the portability of make comes at a cost of
crazy and weird hacks to work around all sorts of versions of software
everywhere, especially when it comes to the configure script and makefiles.
By rewriting this logic in one of the most robust platforms there is, Rust,
we get to assuage all of these worries for free!
* There's a standard tool to build Rust crates, Cargo, but the standard library
and compiler don't use it. This means that they cannot benefit easily from the
crates.io ecosystem, nor can the ecosystem benefit from a standard way to
build this repository itself. Moving to Cargo should help assuage both of
these needs. This has the added benefit of making the compiler more
approachable for newbies as working on the compiler will just happen to be
working on a large Cargo project, all the same standard tools and tricks will
apply.
* There's a huge amount of portability information in the main distribution, for
example around cross compiling, compiling on new OSes, etc. Pushing this logic
into standard crates (like `gcc`) enables the community to immediately benefit
from new build logic.
Despite these benefits, it's going to be a long road to actually replace our
current build system. This PR is just the beginning and doesn't implement the
full suite of functionality as the current one, but there are many more to
follow! The current implementation strategy hopes to look like:
1. Land a second build system in-tree that can be itereated on an and
contributed to. This will not be used just yet in terms of gating new commits
to the repo.
2. Over time, bring the second build system to feature parity with the old build
system, start setting up CI for both build systems.
3. At some point in the future, switch the default to the new build system, but
keep the old one around.
4. At some further point in the future, delete the entire old build system.
---
Alright, so with all that out of the way, here's some more info on this PR
itself. The inital build system here is contained in the `src/bootstrap`
directory and just adds the necessary minimum bits to bootstrap the compiler
itself. There is currently no support for building documentation, running tests,
or installing, but the implemented support is:
* Compiling LLVM with `cmake` instead of `./configure` + `make`. The LLVM
project is removing their autotools build system, so we'd have to make this
transition eventually anyway.
* Compiling compiler-rt with `cmake` as well (for the same rationale as above).
* Adding `Cargo.toml` to map out the dependency graph to all crates, and also
adding `build.rs` files where appropriate. For example `alloc_jemalloc` has a
script to build jemalloc, `flate` has a script to build `miniz.c`, `std` will
build `libbacktrace`, etc.
* Orchestrating all the calls to `cargo` to build the standard distribution,
following the normal bootstrapping process. This also tracks dependencies
between steps to ensure cross-compilation targets happen as well.
* Configuration is intended to eventually be done through a `config.toml` file,
so support is implemented for this. The most likely vector of configuration
for now, however, is likely through `config.mk` (what `./configure` emits), so
the build system currently parses this information.
There's still quite a few steps left to do, and I'll open up some follow-up
issues (as well as a tracking issue) for this migration, but hopefully this is a
great start to get going! This PR is currently tested on all the
Windows/Linux/OSX triples for x86\_64 and x86, but more portability is always
welcome!
---
Future functionality left to implement
* [ ] Re-verify that multi-host builds work
* [ ] Verify android build works
* [ ] Verify iOS build work (mostly compiler-rt)
* [ ] Verify sha256 and ideally gpg of downloaded nightly compiler and nightly rustc
* [ ] Implement testing -- this is a huge bullet point with lots of sub-bullets
* [ ] Build and generate documentation (plus the various tools we have in-tree)
* [ ] Move various src/etc scripts into Rust -- not sure how this interacts with `make` build system
* [ ] Implement `make install` - like testing this is also quite massive
* [x] Deduplicate version information with makefiles
Have all Cargo-built crates pass `--cfg cargobuild` and then add appropriate
`#[cfg]` definitions to all crates to avoid linking anything if this is passed.
This should help allow libstd to compile with both the makefiles and with Cargo.
This commits adds build scripts to the necessary Rust crates for all the native
dependencies. This is currently a duplication of the support found in mk/rt.mk
and is my best effort at representing the logic twice, but there may be some
unfortunate-and-inevitable divergence.
As a summary:
* alloc_jemalloc - build script to compile jemallocal
* flate - build script to compile miniz.c
* rustc_llvm - build script to run llvm-config and learn about how to link it.
Note that this crucially (and will not ever) compile LLVM as that would take
far too long.
* rustdoc - build script to compile hoedown
* std - script to determine lots of libraries/linkages as well as compile
libbacktrace
As demonstrated in the `resolve_socket_addr` change, this is less awkward than re-creating a new address from the other parts.
If this is to be accepted, pleas open a tracking issue (I can’t set the appropriate tags) and I’ll update the PR with the tracking issue number.
These commits are an implementation of https://github.com/rust-lang/rfcs/pull/1359 which is tracked via https://github.com/rust-lang/rust/issues/31398. The `before_exec` implementation fit easily with the current process spawning framework we have, but unfortunately the `exec` implementation required a bit of a larger refactoring. The stdio handles were all largely managed as implementation details of `std::process` and the `exec` function lived in `std::sys`, so the two didn't have access to one another.
I took this as a sign that a deeper refactoring was necessary, and I personally feel that the end result is cleaner for both Windows and Unix. The commits should be separated nicely for reviewing (or all at once if you're feeling ambitious), but the changes made here were:
* The process spawning on Unix was refactored in to a pre-exec and post-exec function. The post-exec function isn't allowed to do any allocations of any form, and management of transmitting errors back to the parent is managed by the pre-exec function (as it's the one that actually forks).
* Some management of the exit status was pushed into platform-specific modules. On Unix we must cache the return value of `wait` as the pid is consumed after we wait on it, but on Windows we can just keep querying the system because the handle stays valid.
* The `Stdio::None` variant was renamed to `Stdio::Null` to better reflect what it's doing.
* The global lock on `CreateProcess` is now correctly positioned to avoid unintended inheritance of pipe handles that other threads are sending to their child processes. After a more careful reading of the article referenced the race is not in `CreateProcess` itself, but rather the property that handles are unintentionally shared.
* All stdio management now happens in platform-specific modules. This provides a cleaner implementation/interpretation for `FromFraw{Fd,Handle}` for each platform as well as a cleaner transition from a configuration to what-to-do once we actually need to do the spawn.
With these refactorings in place, implementing `before_exec` and `exec` ended up both being pretty trivial! (each in their own commit)
This commit implements the `exec` function proposed in [RFC 1359][rfc] which is
a function on the `CommandExt` trait to execute all parts of a `Command::spawn`
without the `fork` on Unix. More details on the function itself can be found in
the comments in the commit.
[rfc]: https://github.com/rust-lang/rfcs/pull/1359
cc #31398
Most of this is platform-specific anyway, and we generally have to jump through
fewer hoops to do the equivalent operation on Windows. One benefit for Windows
today is that this new structure avoids an extra `DuplicateHandle` when creating
pipes. For Unix, however, the behavior should be the same.
Note that this is just a pure refactoring, no functionality was added or
removed.
The function `CreateProcess` is not itself unsafe to call from many threads, the
article in question is pointing out that handles can be inherited by unintended
child processes. This is basically the same race as the standard Unix
open-then-set-cloexec race.
Since the intention of the lock is to protect children from inheriting
unintended handles, the lock is now lifted out to before the creation of the
child I/O handles (which will all be inheritable). This will ensure that we only
have one process in Rust at least creating inheritable handles at a time,
preventing unintended inheritance to children.
On Unix we have to be careful to not call `waitpid` twice, but we don't have to
be careful on Windows due to the way process handles work there. As a result the
cached `Option<ExitStatus>` is only necessary on Unix, and it's also just an
implementation detail of the Unix module.
At the same time. also update some code in `kill` on Unix to avoid a wonky
waitpid with WNOHANG. This was added in 0e190b9a to solve #13124, but the
`signal(0)` method is not supported any more so there's no need to for this
workaround. I believe that this is no longer necessary as it's not really doing
anything.
This is a Unix-specific function which adds the ability to register a closure to
run pre-exec to configure the child process as required (note that these
closures are run post-fork).
cc #31398
* Build up the argp/envp pointers while the `Command` is being constructed
rather than only when `spawn` is called. This will allow better sharing of
code between fork/exec paths.
* Rename `child_after_fork` to `exec` and have it only perform the exec half of
the spawning. This also means the return type has changed to `io::Error`
rather than `!` to represent errors that happen.
After [considerable pushback](https://github.com/rust-lang/rfcs/issues/1451), it's clear that there is a community consensus around providing `IpAddr` in the standard library, together with other APIs using it.
This commit reverts from deprecated status directly to stable. The deprecation landed in 1.6, which has already been released, so the stabilization is marked for 1.7 (currently in beta; will require a backport).
r? @alexcrichton
This commit does two things:
* Re-works the module-level documentation.
* Cleaning up wording and adding links to where error types are used.
Part of #29364
This commit does two things:
* Re-works the module-level documentation.
* Cleaning up wording and adding links to where error types are used.
Part of #29364
After [considerable
pushback](https://github.com/rust-lang/rfcs/issues/1451), it's clear
that there is a community consensus around providing `IpAddr` in the
standard library, together with other APIs using it.
This commit reverts from deprecated status directly to stable. The
deprecation landed in 1.6, which has already been released, so the
stabilization is marked for 1.7 (currently in beta; will require a backport).
The comment in the next line was already talking about `_guard`, and the
scope guard a couple lines further down is also called `guard`, so I
assume that was just a typo.
Also update the instability reason to include a note about a possible
bad interaction with condition variables on systems that allow
waiting on a RwLock guard.
Here's another go at adding emscripten support. This needs to wait again on new [libc definitions](https://github.com/rust-lang-nursery/libc/pull/122) landing. To get the libc definitions right I had to add support for i686-unknown-linux-musl, which are very similar to emscripten's, which are derived from arm/musl.
This branch additionally removes the makefile dependency on the `EMSCRIPTEN` environment variable by not building the unused compiler-rt.
Again, this is not sufficient for actually compiling to asmjs since it needs additional LLVM patches.
r? @alexcrichton
Backtraces, and the compilation of libbacktrace for asmjs, are disabled.
This port doesn't use jemalloc so, like pnacl, it disables jemalloc *for all targets*
in the configure file.
It disables stack protection.
It could return in the future if it returned a different guard type, which
could not be used with Condvar, otherwise it is unsafe as another thread
can invalidate an "inner" reference during a Condvar::wait.
cc #27746
These commits finish up closing out https://github.com/rust-lang/rust/issues/24237 by filling out all locations we create new file descriptors with variants that atomically create the file descriptor and set CLOEXEC where possible. Previous support for doing this in `File::open` was added in #27971 and support for `try_clone` was added in #27980. This commit fills out:
* `Socket::new` now passes `SOCK_CLOEXEC`
* `Socket::accept` now uses `accept4`
* `pipe2` is used instead of `pipe`
Unfortunately most of this support is Linux-specific, and most of it is post-2.6.18 (our oldest supported version), so all of the detection here is done dynamically. It looks like OSX does not have equivalent variants for these functions, so there's nothing more we can do there. Support for BSDs can be added over time if they also have these functions.
Closes#24237
Abort on stack overflow instead of re-raising SIGSEGV
We use guard pages that cause the process to abort to protect against
undefined behavior in the event of stack overflow. We have a handler
that catches segfaults, prints out an error message if the segfault was
due to a stack overflow, then unregisters itself and returns to allow
the signal to be re-raised and kill the process.
This caused some confusion, as it was unexpected that safe code would be
able to cause a segfault, while it's easy to overflow the stack in safe
code. To avoid this confusion, when we detect a segfault in the guard
page, abort instead of the previous behavior of re-raising SIGSEGV.
To test this, we need to adapt the tests for segfault to actually check
the exit status. Doing so revealed that the existing test for segfault
behavior was actually invalid; LLVM optimizes the explicit null pointer
reference down to an illegal instruction, so the program aborts with
SIGILL instead of SIGSEGV and the test didn't actually trigger the
signal handler at all. Use a C helper function to get a null pointer
that LLVM can't optimize away, so we get our segfault instead.
This is a [breaking-change] if anyone is relying on the exact signal
raised to kill a process on stack overflow.
Closes#31273
Also update the instability reason to include a note about a possible
bad interaction with condition variables on systems that allow
waiting on a RwLock guard.
We use guard pages that cause the process to abort to protect against
undefined behavior in the event of stack overflow. We have a handler
that catches segfaults, prints out an error message if the segfault was
due to a stack overflow, then unregisters itself and returns to allow
the signal to be re-raised and kill the process.
This caused some confusion, as it was unexpected that safe code would be
able to cause a segfault, while it's easy to overflow the stack in safe
code. To avoid this confusion, when we detect a segfault in the guard
page, abort instead of the previous behavior of re-raising the SIGSEGV.
To test this, we need to adapt the tests for segfault to actually check
the exit status. Doing so revealed that the existing test for segfault
behavior was actually invalid; LLVM optimizes the explicit null pointer
reference down to an illegal instruction, so the program aborts with
SIGILL instead of SIGSEGV and the test didn't actually trigger the
signal handler at all. Use a C helper function to get a null pointer
that LLVM can't optimize away, so we get our segfault instead.
This is a [breaking-change] if anyone is relying on the exact signal
raised to kill a process on stack overflow.
Closes#31273
This commit attempts to use the `pipe2` syscall on Linux to atomically set the
CLOEXEC flag for pipes created. Unfortunately this was added in 2.6.27 so we
have to dynamically determine whether we can use it or not.
This commit also updates the `fds-are-cloexec.rs` test to test stdio handles for
spawned processes as well.
This is necessary to atomically accept a socket and set the CLOEXEC flag at the
same time. Support only appeared in Linux 2.6.28 so we have to dynamically
determine which syscall we're supposed to call in this case.
Right now we only attempt to call one symbol which my not exist everywhere,
__pthread_get_minstack, but this pattern will come up more often as we start to
bind newer functionality of systems like Linux.
Take a similar strategy as the Windows implementation where we use `dlopen` to
lookup whether a symbol exists or not.
This commit adds support for creating sockets with the `SOCK_CLOEXEC` flag.
Support for this flag was added in Linux 2.6.27, however, and support does not
exist on platforms other than Linux. For this reason we still have the same
fallback as before but just special case Linux if we can.
Similar to the previous commit, if `F_DUPFD_CLOEXEC` succeeds then there's no
need for us to then call `set_cloexec` on platforms other than Linux. The bug
mentioned of kernels not actually setting the `CLOEXEC` flag has only been
repored on Linux, not elsewhere.
On Linux we have to do this for binary compatibility with 2.6.18, but for other
OSes (e.g. OSX/BSDs/etc) they all support this flag so we don't need to pass it.
It could return in the future if it returned a different guard type, which
could not be used with Condvar, otherwise it is unsafe as another thread
can invalidate an "inner" reference during a Condvar::wait.
cc #27746
These accessors are used to get at the last modification, last access, and
creation time of the underlying file. Currently not all platforms provide the
creation time, so that currently returns `Option`.
These accessors are used to get at the last modification, last access, and
creation time of the underlying file. Currently not all platforms provide the
creation time, so that currently returns `Option`.