Updated the HashMap's documentation to include two references to
add_modify.
The first is when the `Entry` API is mentioned at the beginning. I was
hesitant to change the "attack" example (although I believe that it is
perfect example of where `add_modify` should be used) because both uses
work equally, but one is more idiomatic (`add_modify`).
The second is with the `entry` function that is used for the `Entry`
API. The code example was a perfect use for `add_modify`, which is why
it was changed to reflect that.
This stabilizes the `Path::try_exists()` method which returns
`Result<bool, io::Error>` instead of `bool` allowing handling of errors
unrelated to the file not existing. (e.g permission errors)
Along with the stabilization it also:
* Warns that the `exists()` method is error-prone and suggests to use
the newly stabilized one.
* Suggests it instead of `metadata()` to handle errors.
* Mentions TOCTOU bugs to avoid false assumption that `try_exists()` is
completely safe fixed version of `exists()`.
* Renames the feature of still-unstable `std::fs::try_exists()` to
`fs_try_exists` to avoid name conflict.
The tracking issue #83186 remains open to track `fs_try_exists`.
Integrate measureme's hardware performance counter support.
*Note: this is a companion to https://github.com/rust-lang/measureme/pull/143, and duplicates some information with it for convenience*
**(much later) EDIT**: take any numbers with a grain of salt, they may have changed since initial PR open.
## Credits
I'd like to start by thanking `@alyssais,` `@cuviper,` `@edef1c,` `@glandium,` `@jix,` `@Mark-Simulacrum,` `@m-ou-se,` `@mystor,` `@nagisa,` `@puckipedia,` and `@yorickvP,` for all of their help with testing, and valuable insight and suggestions.
Getting here wouldn't have been possible without you!
(If I've forgotten anyone please let me know, I'm going off memory here, plus some discussion logs)
## Summary
This PR adds support to `-Z self-profile` for counting hardware events such as "instructions retired" (as opposed to being limited to time measurements), using the `rdpmc` instruction on `x86_64` Linux.
While other OSes may eventually be supported, preliminary research suggests some kind of kernel extension/driver is required to enable this, whereas on Linux any user can profile (at least) their own threads.
Supporting Linux on architectures other than x86_64 should be much easier (provided the hardware supports such performance counters), and was mostly not done due to a lack of readily available test hardware.
That said, 32-bit `x86` (aka `i686`) would be almost trivial to add and test once we land the initial `x86_64` version (as all the CPU detection code can be reused).
A new flag `-Z self-profile-counter` was added, to control which of the named `measureme` counters is used, and which defaults to `wall-time`, in order to keep `-Z self-profile`'s current functionality unchanged (at least for now).
The named counters so far are:
* `wall-time`: the existing time measurement
* name chosen for consistency with `perf.rust-lang.org`
* continues to use `std::time::Instant` for a nanosecond-precision "monotonic clock"
* `instructions:u`: the hardware performance counter usually referred to as "Instructions retired"
* here "retired" (roughly) means "fully executed"
* the `:u` suffix is from the Linux `perf` tool and indicates the counter only runs while userspace code is executing, and therefore counts no kernel instructions
* *see [Caveats/Subtracting IRQs](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Subtracting-IRQs) for why this isn't entirely true and why `instructions-minus-irqs:u` should be preferred instead*
* `instructions-minus-irqs:u`: same as `instructions:u`, except the count of hardware interrupts ("IRQs" here for brevity) is subtracted
* *see [Caveats/Subtracting IRQs](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Subtracting-IRQs) for why this should be preferred over `instructions:u`*
* `instructions-minus-r0420:u`: experimental counter, same as `instructions-minus-irqs:u` but subtracting an undocumented counter (`r0420:u`) instead of IRQs
* the `rXXXX` notation is again from Linux `perf`, and indicates a "raw" counter, with a hex representation of the low-level counter configuration - this was picked because we still don't *really* know what it is
* this only exists for (future) testing and isn't included/used in any comparisons/data we've put together so far
* *see [Challenges/Zen's undocumented 420 counter](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Epilogue-Zen’s-undocumented-420-counter) for details on how this counter was found and what it does*
---
There are also some additional commits:
* ~~see [Challenges/Rebasing *shouldn't* affect the results, right?](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Rebasing-*shouldn’t*-affect-the-results,-right) for details on the changes to `rustc_parse` and `rustc_trait_section` (the latter far more dubious, and probably shouldn't be merged, or not as-is)~~
* **EDIT**: the effects of these are no long quantifiable, the PR includes reverts for them
* ~~see [Challenges/`jemalloc`: purging will commence in ten seconds](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#jemalloc-purging-will-commence-in-ten-seconds) for details on the `jemalloc` change~~
* this is also separately found in #77162, and we probably want to avoid doing it by default, ideally we'd use the runtime control API `jemalloc` offers (assuming that can stop the timer that's already running, which I'm not sure about)
* **EDIT**: until we can do this based on `-Z` flags, this commit has also been reverted
* the `proc_macro` change was to avoid randomized hashing and therefore ASLR-like effects
---
**(much later) EDIT**: take any numbers with a grain of salt, they may have changed since initial PR open.
#### Write-up / report
Because of how extensive the full report ended up being, I've kept most of it [on `hackmd.io`](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view), but for convenient access, here are all the sections (with individual links):
<sup>(someone suggested I'd make a backup, so [here it is on the wayback machine](http://web.archive.org/web/20201127164748/https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view) - I'll need to remember to update that if I have to edit the write-up)</sup>
* [**Motivation**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Motivation)
* [**Results**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Results)
* [**Overhead**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Overhead)
*Preview (see the report itself for more details):*
|Counter|Total<br>`instructions-minus-irqs:u`|Overhead from "Baseline"<br>(for all 1903881<br>counter reads)|Overhead from "Baseline"<br>(per each counter read)|
|-|-|-|-|
|Baseline|63637621286 ±6||
|`instructions:u`|63658815885 ±2| +21194599 ±8| +11|
|`instructions-minus-irqs:u`|63680307361 ±13| +42686075 ±19| +22|
|`wall-time`|63951958376 ±10275|+314337090 ±10281|+165|
* [**"Macro" noise (self time)**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#“Macro”-noise-(self-time))
*Preview (see the report itself for more details):*
|| `wall-time` (ns) | `instructions:u` | `instructions-minus-irqs:u`
-: | -: | -: | -:
`typeck` | 5478261360 ±283933373 (±~5.2%) | 17350144522 ±6392 (±~0.00004%) | 17351035832.5 ±4.5 (±~0.00000003%)
`expand_crate` | 2342096719 ±110465856 (±~4.7%) | 8263777916 ±2937 (±~0.00004%) | 8263708389 ±0 (±~0%)
`mir_borrowck` | 2216149671 ±119458444 (±~5.4%) | 8340920100 ±2794 (±~0.00003%) | 8341613983.5 ±2.5 (±~0.00000003%)
`mir_built` | 1269059734 ±91514604 (±~7.2%) | 4454959122 ±1618 (±~0.00004%) | 4455303811 ±1 (±~0.00000002%)
`resolve_crate` | 942154987.5 ±53068423.5 (±~5.6%) | 3951197709 ±39 (±~0.000001%) | 3951196865 ±0 (±~0%)
* [**"Micro" noise (individual sampling intervals)**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#“Micro”-noise-(individual-sampling-intervals))
* [**Caveats**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Caveats)
* [**Disabling ASLR**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Disabling-ASLR)
* [**Non-deterministic proc macros**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Non-deterministic-proc-macros)
* [**Subtracting IRQs**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Subtracting-IRQs)
* [**Lack of support for multiple threads**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Lack-of-support-for-multiple-threads)
* [**Challenges**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Challenges)
* [**How do we even read hardware performance counters?**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#How-do-we-even-read-hardware-performance-counters)
* [**ASLR: it's free entropy**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#ASLR-it’s-free-entropy)
* [**The serializing instruction**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#The-serializing-instruction)
* [**Getting constantly interrupted**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Getting-constantly-interrupted)
* [**AMD patented time-travel and dubbed it `SpecLockMap`<br><sup> or: "how we accidentally unlocked `rr` on AMD Zen"</sup>**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#AMD-patented-time-travel-and-dubbed-it-SpecLockMapnbspnbspnbspnbspnbspnbspnbspnbspor-“how-we-accidentally-unlocked-rr-on-AMD-Zen”)
* [**`jemalloc`: purging will commence in ten seconds**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#jemalloc-purging-will-commence-in-ten-seconds)
* [**Rebasing *shouldn't* affect the results, right?**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Rebasing-*shouldn’t*-affect-the-results,-right)
* [**Epilogue: Zen's undocumented 420 counter**](https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view#Epilogue-Zen’s-undocumented-420-counter)
Our condvar doesn't support setting attributes, like
pthread_condattr_setclock, which the current wait_timeout expects to
have configured.
Switch to a different implementation, following espidf.
Use `fcntl(fd, F_GETFD)` to detect if standard streams are open
In the previous implementation, if the standard streams were open,
but the RLIMIT_NOFILE value was below three, the poll would fail
with EINVAL:
> ERRORS: EINVAL The nfds value exceeds the RLIMIT_NOFILE value.
Switch to the existing fcntl based implementation to avoid the issue.
Fixes#96621.
std::io: Modify some ReadBuf method signatures to return `&mut Self`
This allows using `ReadBuf` in a builder-like style and to setup a `ReadBuf` and
pass it to `read_buf` in a single expression, e.g.,
```
// With this PR:
reader.read_buf(ReadBuf::uninit(buf).assume_init(init_len))?;
// Previously:
let mut buf = ReadBuf::uninit(buf);
buf.assume_init(init_len);
reader.read_buf(&mut buf)?;
```
r? `@sfackler`
cc https://github.com/rust-lang/rust/issues/78485, https://github.com/rust-lang/rust/issues/94741
`Mutex::lock()` and `RwLock::write()` are poison-guarded against panics,
in that they set the poison flag if a panic occurs while they're locked.
But if we're already in a panic (`thread::panicking()`), they leave the
poison flag alone.
That check is a bit of a waste for methods that never set the poison
flag though, namely `get_mut()`, `into_inner()`, and `RwLock::read()`.
These use-cases are now split to avoid that unnecessary call.
This allows to format into an `OsString` without unnecessary
allocations. E.g.
```
let mut temp_filename = path.into_os_string();
write!(&mut temp_filename, ".tmp.{}", process::id());
```
impl Read and Write for VecDeque<u8>
Implementing `Read` and `Write` for `VecDeque<u8>` fills in the VecDeque api surface where `Vec<u8>` and `Cursor<Vec<u8>>` already impl Read and Write. Not only for completeness, but VecDeque in particular is a very handy mock interface for a TCP echo service, if only it supported Read/Write.
Since this PR is just an impl trait, I don't think there is a way to limit it behind a feature flag, so it's "insta-stable". Please correct me if I'm wrong here, not trying to rush stability.
This commit adds a new unstable attribute, `#[doc(tuple_varadic)]`, that
shows a 1-tuple as `(T, ...)` instead of just `(T,)`, and links to a section
in the tuple primitive docs that talks about these.