Disallow octal format in Ipv4 string
In its original specification, leading zero in Ipv4 string is interpreted
as octal literals. So a IP address 0127.0.0.1 actually means 87.0.0.1.
This confusion can lead to many security vulnerabilities. Therefore, in
[IETF RFC 6943], it suggests to disallow octal/hexadecimal format in Ipv4
string all together.
Existing implementation already disallows hexadecimal numbers. This commit
makes Parser reject octal numbers.
Fixes#83648.
[IETF RFC 6943]: https://tools.ietf.org/html/rfc6943#section-3.1.1
Improve pointer arithmetic docs
* Add slightly more detailed definition of "allocated object" to the module docs, and link it from everywhere.
* Clarify the "remains attached" wording a bit (at least I hope this is clearer).
* Remove the sentence about using integer arithmetic; this seems to confuse people even if it is technically correct.
As usual, the edit needs to be done in a dozen places to remain consistent, I hope I got them all.
Clean up Vec's benchmarks
The Vec benchmarks need a lot of love. I sort of noticed this in https://github.com/rust-lang/rust/pull/83357 but the overall situation is much less awesome than I thought at the time. The first commit just removes a lot of asserts and does a touch of other cleanup.
A number of these benchmarks are poorly-named. For example, `bench_map_fast` is not in fact fast, `bench_rev_1` and `bench_rev_2` are vague, `bench_in_place_zip_iter_mut` doesn't call `zip`, `bench_in_place*` don't do anything in-place... Should I fix these, or is there tooling that depend on the names not changing?
I've also noticed that `bench_rev_1` and `bench_rev_2` are remarkably fragile. It looks like poking other code in `Vec` can cause the codegen of this benchmark to switch to a version that has almost exactly half its current throughput and I have absolutely no idea why.
Here's the fast version:
```asm
0.69 │110: movdqu -0x20(%rbx,%rdx,4),%xmm0
1.76 │ movdqu -0x10(%rbx,%rdx,4),%xmm1
0.71 │ pshufd $0x1b,%xmm1,%xmm1
0.60 │ pshufd $0x1b,%xmm0,%xmm0
3.68 │ movdqu %xmm1,-0x30(%rcx)
14.36 │ movdqu %xmm0,-0x20(%rcx)
13.88 │ movdqu -0x40(%rbx,%rdx,4),%xmm0
6.64 │ movdqu -0x30(%rbx,%rdx,4),%xmm1
0.76 │ pshufd $0x1b,%xmm1,%xmm1
0.77 │ pshufd $0x1b,%xmm0,%xmm0
1.87 │ movdqu %xmm1,-0x10(%rcx)
13.01 │ movdqu %xmm0,(%rcx)
38.81 │ add $0x40,%rcx
0.92 │ add $0xfffffffffffffff0,%rdx
1.22 │ ↑ jne 110
```
And the slow one:
```asm
0.42 │9a880: movdqa %xmm2,%xmm1
4.03 │9a884: movq -0x8(%rbx,%rsi,4),%xmm4
8.49 │9a88a: pshufd $0xe1,%xmm4,%xmm4
2.58 │9a88f: movq -0x10(%rbx,%rsi,4),%xmm5
7.02 │9a895: pshufd $0xe1,%xmm5,%xmm5
4.79 │9a89a: punpcklqdq %xmm5,%xmm4
5.77 │9a89e: movdqu %xmm4,-0x18(%rdx)
15.74 │9a8a3: movq -0x18(%rbx,%rsi,4),%xmm4
3.91 │9a8a9: pshufd $0xe1,%xmm4,%xmm4
5.04 │9a8ae: movq -0x20(%rbx,%rsi,4),%xmm5
5.29 │9a8b4: pshufd $0xe1,%xmm5,%xmm5
4.60 │9a8b9: punpcklqdq %xmm5,%xmm4
9.81 │9a8bd: movdqu %xmm4,-0x8(%rdx)
11.05 │9a8c2: paddq %xmm3,%xmm0
0.86 │9a8c6: paddq %xmm3,%xmm2
5.89 │9a8ca: add $0x20,%rdx
0.12 │9a8ce: add $0xfffffffffffffff8,%rsi
1.16 │9a8d2: add $0x2,%rdi
2.96 │9a8d6: → jne 9a880 <<alloc::vec::Vec<T,A> as core::iter::traits::collect::Extend<&T>>::extend+0xd0>
```
In its original specification, leading zero in Ipv4 string is interpreted
as octal literals. So a IP address 0127.0.0.1 actually means 87.0.0.1.
This confusion can lead to many security vulnerabilities. Therefore, in
[IETF RFC 6943], it suggests to disallow octal/hexadecimal format in Ipv4
string all together.
Existing implementation already disallows hexadecimal numbers. This commit
makes Parser reject octal numbers.
Fixes#83648.
[IETF RFC 6943]: https://tools.ietf.org/html/rfc6943#section-3.1.1
unix: Fix feature(unix_socket_ancillary_data) on macos and other BSDs
This adds support for CMSG handling on macOS and fixes it on OpenBSD and possibly other BSDs.
When traversing the CMSG list, the previous code had an exception for Android where the next element after the last pointer could point to the first pointer instead of NULL. This is actually not specific to Android: the `libc::CMSG_NXTHDR` implementation for Linux and emscripten have a special case to return NULL when the length of the previous element is zero; most other implementations simply return the previous element plus a zero offset in this case.
This MR makes the check non-optional which fixes CMSG handling and a possible endless loop on such systems; tested with file descriptor passing on OpenBSD, Linux, and macOS.
This MR additionally adds `SocketAncillary::is_empty` because clippy is right that it should be added.
This belongs to the `feature(unix_socket_ancillary_data)` tracking issue: https://github.com/rust-lang/rust/issues/76915
r? `@joshtriplett`
escape_ascii take 2
The previous PR, #73111 was closed for inactivity; since I've had trouble in the past reopening closed PRs, I'm just making a new one.
I'm still running the tests locally but figured I'd open the PR in the meantime. Will fix whatever errors show up so we don't have to wait again for this.
r? ``@m-ou-se``
alloc: Added `as_slice` method to `BinaryHeap` collection
I initially asked about whether it is useful addition on https://internals.rust-lang.org/t/should-i-add-as-slice-method-to-binaryheap/13816, and it seems there were no objections, so went ahead with this PR.
> There is [`BinaryHeap::into_vec`](https://doc.rust-lang.org/std/collections/struct.BinaryHeap.html#method.into_vec), but it consumes the value. I wonder if there is API design limitation that should be taken into account. Implementation-wise, the inner buffer is just a Vec, so it is trivial to expose as_slice from it.
Please, guide me through if I need to add tests or something else.
UPD: Tracking issue #83659
may not -> might not
may not -> might not
"may not" has two possible meanings:
1. A command: "You may not stay up past your bedtime."
2. A fact that's only sometimes true: "Some cities may not have bike lanes."
In some cases, the meaning is ambiguous: "Some cars may not have snow
tires." (do the cars *happen* to not have snow tires, or is it
physically impossible for them to have snow tires?)
This changes places where the standard library uses the "description of
fact" meaning to say "might not" instead.
This is just `std::vec` for now - if you think this is a good idea I can
convert the rest of the standard library.
Adjust documentation links for slice::make_ascii_*case
The documentation for the functions `slice::to_ascii_lowercase` and `slice::to_ascii_uppercase` contain the suggestion
> To lowercase the value in-place, use `make_ascii_lowercase`
however the link to the suggested method takes you to the page for `u8`, rather than the method of that name on the same page.
Instruct LLVM that binary_search returns a valid index
This allows removing bound checks when the return value of `binary_search` is used to index into the slice it was call on. I also added a codegen test for this, not sure if it's the right thing to do (I didn't find anything on the dev guide), but it felt so.
"may not" has two possible meanings:
1. A command: "You may not stay up past your bedtime."
2. A fact that's only sometimes true: "Some cities may not have bike lanes."
In some cases, the meaning is ambiguous: "Some cars may not have snow
tires." (do the cars *happen* to not have snow tires, or is it
physically impossible for them to have snow tires?)
This changes places where the standard library uses the "description of
fact" meaning to say "might not" instead.
This is just `std::vec` for now - if you think this is a good idea I can
convert the rest of the standard library.
Improve fs error open_from unix
Consistency for #79399
Suggested by JohnTitor
r? `@JohnTitor`
Not user if the error is too long now, do we handle long errors well?
Add function core::iter::zip
This makes it a little easier to `zip` iterators:
```rust
for (x, y) in zip(xs, ys) {}
// vs.
for (x, y) in xs.into_iter().zip(ys) {}
```
You can `zip(&mut xs, &ys)` for the conventional `iter_mut()` and
`iter()`, respectively. This can also support arbitrary nesting, where
it's easier to see the item layout than with arbitrary `zip` chains:
```rust
for ((x, y), z) in zip(zip(xs, ys), zs) {}
for (x, (y, z)) in zip(xs, zip(ys, zs)) {}
// vs.
for ((x, y), z) in xs.into_iter().zip(ys).zip(xz) {}
for (x, (y, z)) in xs.into_iter().zip((ys.into_iter().zip(xz)) {}
```
It may also format more nicely, especially when the first iterator is a
longer chain of methods -- for example:
```rust
iter::zip(
trait_ref.substs.types().skip(1),
impl_trait_ref.substs.types().skip(1),
)
// vs.
trait_ref
.substs
.types()
.skip(1)
.zip(impl_trait_ref.substs.types().skip(1))
```
This replaces the tuple-pair `IntoIterator` in #78204.
There is prior art for the utility of this in [`itertools::zip`].
[`itertools::zip`]: https://docs.rs/itertools/0.10.0/itertools/fn.zip.html
update array missing `IntoIterator` msg
fixes#82602
r? ```@estebank``` do you know whether we can use the expr span in `rustc_on_unimplemented`? The label isn't too great rn
make unaligned_references future-incompat lint warn-by-default
and also remove the safe_packed_borrows lint that it replaces.
`std::ptr::addr_of!` has hit beta now and will hit stable in a month, so I propose we start fixing https://github.com/rust-lang/rust/issues/27060 for real: creating a reference to a field of a packed struct needs to eventually become a hard error; this PR makes it a warn-by-default future-incompat lint. (The lint already existed, this just raises its default level.) At the same time I removed the corresponding code from unsafety checking; really there's no reason an `unsafe` block should make any difference here.
For references to packed fields outside `unsafe` blocks, this means `unaligned_refereces` replaces the previous `safe_packed_borrows` warning with a link to https://github.com/rust-lang/rust/issues/82523 (and no more talk about unsafe blocks making any difference). So behavior barely changes, the warning is just worded differently. For references to packed fields inside `unsafe` blocks, this PR shows a new future-incompat warning.
Closes https://github.com/rust-lang/rust/issues/46043 because that lint no longer exists.
Improve Debug implementations of Mutex and RwLock.
This improves the Debug implementations of Mutex and RwLock.
They now show the poison flag and use debug_non_exhaustive. (See #67364.)
Derive Debug for io::Chain instead of manually implementing it.
This derives Debug for io::Chain instead of manually implementing it.
The manual implementation has the same bounds, so I don't think there's any reason for a manual implementation. The names used in the derive implementation are even nicer (`first`/`second`) than the manual implementation (`t`/`u`), and include the `done_first` field too.
Fix Debug implementation for RwLock{Read,Write}Guard.
This would attempt to print the Debug representation of the lock that the guard has locked, which will try to lock again, fail, and just print `"<locked>"` unhelpfully.
After this change, this just prints the contents of the mutex, like the other smart pointers (and MutexGuard) do.
MutexGuard had this problem too: https://github.com/rust-lang/rust/issues/57702
ExitStatus: print "exit status: {}" rather than "exit code: {}" on unix
Proper Unix terminology is "exit status" (vs "wait status"). "exit
code" is imprecise on Unix and therefore unclear. (As far as I can
tell, "exit code" is correct terminology on Windows.)
This new wording is unfortunately inconsistent with the identifier
names in the Rust stdlib.
It is the identifier names that are wrong, as discussed at length in eg
https://doc.rust-lang.org/nightly/std/process/struct.ExitStatus.htmlhttps://doc.rust-lang.org/nightly/std/os/unix/process/trait.ExitStatusExt.html
Unfortunately for API stability reasons it would be a lot of work, and
a lot of disruption, to change the names in the stdlib (eg to rename
`std::process::ExitStatus` to `std::process::ChildStatus` or
something), but we should fix the message output. Many (probably
most) readers of these messages about exit statuses will be users and
system administrators, not programmers, who won't even know that Rust
has this wrong terminology.
So I think the right thing is to fix the documentation (as I have
already done) and, now, the terminology in the implementation.
This is a user-visible change to the behaviour of all Rust programs
which run Unix subprocesses. Hopefully no-one is matching against the
exit status string, except perhaps in tests.