mikros/rust - rust - Gitea.pterpstra.com

Author	SHA1	Message	Date
Yuki Okushi	2c3a8cf0a4	Rollup merge of #97611 - azdavis:master, r=Dylan-DPC Tweak insert docs For `{Hash, BTree}Map::insert`, I always have to take a few extra seconds to think about the slight weirdness about the fact that if we "did not" insert (which "sounds" false), we return true, and if we "did" insert, (which "sounds" true), we return false. This tweaks the doc comments for the `insert` methods of those types (as well as what looks like a rustc internal data structure that I found just by searching the codebase for "If the set did") to first use the "Returns whether _something_" pattern used in e.g. `remove`, where we say that `remove` "returns whether the value was present".	2022-06-01 23:36:52 +09:00
Ariel Davis	b02146a370	Tweak insert docs	2022-05-31 22:08:14 -07:00
bors	395a09c3da	Auto merge of #97553 - nbdd0121:lib, r=Mark-Simulacrum Add `#[inline]` to `Vec`'s `Deref/DerefMut` This should help #97552 (although I haven't verified).	2022-06-01 04:52:11 +00:00
Matthias Krüger	0d1e5465f3	Rollup merge of #97578 - ojeda:checkpatch, r=JohnTitor alloc: remove repeated word in comment Linux's `checkpatch.pl` reports: ```txt #42544: FILE: rust/alloc/vec/mod.rs:2692: WARNING: Possible repeated word: 'to' + // - Elements are :Copy so it's OK to to copy them, without doing ``` Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2022-05-31 23:11:35 +02:00
Matthias Krüger	4f4a819fa9	Rollup merge of #97316 - CAD97:bound-misbehavior, r=dtolnay Put a bound on collection misbehavior As currently written, when a logic error occurs in a collection's trait parameters, this allows completely arbitrary misbehavior, so long as it does not cause undefined behavior in std. However, because the extent of misbehavior is not specified, it is allowed for any code in std to start misbehaving in arbitrary ways which are not formally UB; consider the theoretical example of a global which gets set on an observed logic error. Because the misbehavior is only bound by not resulting in UB from safe APIs and the crate-level encapsulation boundary of all of std, this makes writing user unsafe code that utilizes std theoretically impossible, as it now relies on undocumented QOI (quality of implementation) that unrelated parts of std cannot be caused to misbehave by a misuse of std::collections APIs. In practice, this is a nonconcern, because std has reasonable QOI and an implementation that takes advantage of this freedom is essentially a malicious implementation and only compliant by the most language-lawyer reading of the documentation. To close this hole, we just add a small clause to the existing logic error paragraph that ensures that any misbehavior is limited to the collection which observed the logic error, making it more plausible to prove the soundness of user unsafe code. This is not meant to be formal; a formal refinement would likely need to mention that values derived from the collection can also misbehave after a logic error is observed, as well as define what it means to "observe" a logic error in the first place. This fix errs on the side of informality in order to close the hole without complicating a normal reading which can assume a reasonable nonmalicious QOI. See also [discussion on IRLO][1]. [1]: https://internals.rust-lang.org/t/using-std-collections-and-unsafe-anything-can-happen/16640 r? rust-lang/libs-api ```@rustbot``` label +T-libs-api -T-libs This technically adds a new guarantee to the documentation, though I argue as written it's one already implicitly provided.	2022-05-31 23:11:34 +02:00
bors	0a43923a86	Auto merge of #97419 - WaffleLapkin:const_from_ptr_range, r=oli-obk Make `from{,_mut}_ptr_range` const This PR makes the following APIs `const`: ```rust // core::slice pub const unsafe fn from_ptr_range<'a, T>(range: Range<*const T>) -> &'a [T]; pub const unsafe fn from_mut_ptr_range<'a, T>(range: Range<*mut T>) -> &'a mut [T]; ``` Tracking issue: #89792. Feature for `from_ptr_range` as a `const fn`: `slice_from_ptr_range_const`. Feature for `from_mut_ptr_range` as a `const fn`: `slice_from_mut_ptr_range_const`. r? `@oli-obk`	2022-05-31 14:55:33 +00:00
bors	16a0d03698	Auto merge of #97521 - SkiFire13:clarify-vec-as-ptr, r=Dylan-DPC Clarify the guarantees of Vec::as_ptr and Vec::as_mut_ptr when there's no allocation Currently the documentation says they return a pointer to the vector's buffer, which has the implied precondition that the vector allocated some memory. However `Vec`'s documentation also specifies that it won't always allocate, so it's unclear whether the pointer returned is valid in that case. Of course you won't be able to read/write actual bytes to/from it since the capacity is 0, but there's an exception: zero sized read/writes. They are still valid as long as the pointer is not null and the memory it points to wasn't deallocated, but `Vec::as_ptr` and `Vec::as_mut_ptr` don't specify that's not the case. This PR thus specifies they are actually valid for zero sized reads since `Vec` is implemented to hold a dangling pointer in those cases, which is neither null nor was deallocated.	2022-05-31 12:14:51 +00:00
Miguel Ojeda	5dae6c1b96	alloc: remove repeated word in comment Linux's `checkpatch.pl` reports: ```txt #42544: FILE: rust/alloc/vec/mod.rs:2692: WARNING: Possible repeated word: 'to' + // - Elements are :Copy so it's OK to to copy them, without doing ``` Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2022-05-31 12:33:31 +02:00
Dylan DPC	bf248c82e8	Rollup merge of #97455 - JohnTitor:stabilize-toowned-clone-into, r=dtolnay Stabilize `toowned_clone_into` Closes #41263 FCP has been done: https://github.com/rust-lang/rust/issues/41263#issuecomment-1100760750	2022-05-31 07:57:35 +02:00
Dylan DPC	9c72f16b9f	Rollup merge of #97229 - Nilstrieb:doc-box-noalias, r=dtolnay Document the current aliasing rules for `Box<T>`. Currently, `Box<T>` gets `noalias`, meaning it has the same rules as `&mut T`. This is sparsely documented, even though it can have quite a big impact on unsafe code using box. Therefore, these rules are documented here, with a big warning that they are not normative and subject to change, since we have not yet committed to an aliasing model and the state of `Box<T>` is especially uncertain. If you have any suggestions and improvements, make sure to leave them here. This is mostly intended to inform people about what is currently going on (to prevent misunderstandings such as [Jon Gjengset's Box aliasing](https://www.youtube.com/watch?v=EY7Wi9fV5bk)). This is supposed to _only document current UB_ and not add any new guarantees or rules.	2022-05-31 07:57:33 +02:00
David Tolnay	e6b1003c95	BTreeSet->BTreeMap (fix copy/paste mistake in documentation) Co-authored-by: lcnr <rust@lcnr.de>	2022-05-30 17:56:35 -07:00
David Tolnay	ffd7f5873e	Fix typo uniqeness -> uniqueness	2022-05-30 16:49:28 -07:00
Michael Goulet	3c0b9d50ae	Rollup merge of #89685 - DeveloperC286:iter_fields_to_private, r=oli-obk refactor: VecDeques Iter fields to private Made the fields of VecDeque's Iter private by creating a Iter::new(...) function to create a new instance of Iter and migrating usage to use Iter::new(...).	2022-05-30 15:57:27 -07:00
bors	4a8d2e3856	Auto merge of #97480 - conradludgate:faster-format-literals, r=joshtriplett improve format impl for literals The basic idea of this change can be seen here https://godbolt.org/z/MT37cWoe1. Updates the format impl to have a fast path for string literals and the default path for regular format args. This change will allow `format!("string literal")` to be used interchangeably with `"string literal".to_owned()`. This would be relevant in the case of `f!"string literal"` being legal (https://github.com/rust-lang/rfcs/pull/3267) in which case it would be the easiest way to create owned strings from literals, while also being just as efficient as any other impl	2022-05-30 17:39:58 +00:00
Gary Guo	0a7a0ff4d9	Add `#[inline]` to `Vec`'s `Deref/DerefMut`	2022-05-30 15:11:53 +01:00
Dylan DPC	0ed320bdb9	Rollup merge of #97494 - est31:remove_box_alloc_tests, r=Dylan-DPC Use Box::new() instead of box syntax in library tests The tests inside `library/*` have no reason to use `box` syntax as they have 0 performance relevance. Therefore, we can safely remove them (instead of having to use alternatives like the one in #97293).	2022-05-30 14:33:48 +02:00
Maybe Waffle	ff9efd8a55	Add reexport of slice::from{,_mut}_ptr_range to alloc & std At first I was confused why `std::slice::from_ptr_range` didn't work :D	2022-05-30 15:44:56 +04:00
Conrad Ludgate	5dd0fe301a	remove useless cold	2022-05-29 20:40:56 +01:00
Conrad Ludgate	3f404bfa86	improve format impl for literals	2022-05-29 20:40:56 +01:00
bors	bef2b7cd1c	Auto merge of #97214 - Mark-Simulacrum:stage0-bump, r=pietroalbini Finish bumping stage0 It looks like the last time had left some remaining cfg's -- which made me think that the stage0 bump was actually successful. This brings us to a released 1.62 beta though. This now brings us to cfg-clean, with the exception of check-cfg-features in bootstrap; I'd prefer to leave that for a separate PR at this time since it's likely to be more tricky. cc https://github.com/rust-lang/rust/pull/97147#issuecomment-1132845061 r? `@pietroalbini`	2022-05-29 16:28:21 +00:00
Giacomo Stevanato	8ef2dd70e6	Clarify the guarantees of Vec::as_ptr and Vec::as_mut_ptr when there's no allocation	2022-05-29 17:43:35 +02:00
est31	7230a15c32	Use Box::new() instead of box syntax in alloc tests	2022-05-29 00:41:14 +02:00
Matthias Krüger	4254f922db	Rollup merge of #95214 - tbu-:pr_vec_append_doc, r=Mark-Simulacrum Remove impossible panic note from `Vec::append` Neither the number of elements in a vector can overflow a `usize`, nor can the amount of elements in two vectors.	2022-05-28 01:11:46 +02:00
Yuki Okushi	846f134cd3	Stabilize `toowned_clone_into`	2022-05-28 01:07:45 +09:00
Mark Rousskov	b454991ac4	Finish bumping stage0 It looks like the last time had left some remaining cfg's -- which made me think that the stage0 bump was actually successful. This brings us to a released 1.62 beta though.	2022-05-27 07:36:17 -04:00
Nilstrieb	e7c468dc59	Document the current aliasing rules for `Box<T>`. Currently, `Box<T>` gets `noalias`, meaning it has the same rules as `&mut T`. This is sparsely documented, even though it can have quite a big impact on unsafe code using box. Therefore, these rules are documented here, with a big warning that they are not normative and subject to change, since we have not yet committed to an aliasing model and the state of `Box<T>` is especially uncertain.	2022-05-26 21:28:07 +02:00
bors	1851f0802e	Auto merge of #97046 - conradludgate:faster-ascii-case-conv-path, r=thomcc improve case conversion happy path Someone shared the source code for [Go's string case conversion](`19156a5474/src/strings/strings.go (L558-L616)`). It features a hot path for ascii-only strings (although I assume for reasons specific to go, they've opted for a read safe hot loop). I've borrowed these ideas and also kept our existing code to provide a fast path + seamless utf-8 correct path fallback. (Naive) Benchmarks can be found here https://github.com/conradludgate/case-conv For the cases where non-ascii is found near the start, the performance of this algorithm does fall back to original speeds and has not had any measurable speed loss	2022-05-26 15:29:01 +00:00
Conrad Ludgate	d0f9930709	improve case conversion happy path	2022-05-26 13:18:57 +01:00
Christopher Durham	67aca498c6	Put a bound on collection misbehavior As currently written, when a logic error occurs in a collection's trait parameters, this allows completely arbitrary misbehavior, so long as it does not cause undefined behavior in std. However, because the extent of misbehavior is not specified, it is allowed for any code in std to start misbehaving in arbitrary ways which are not formally UB; consider the theoretical example of a global which gets set on an observed logic error. Because the misbehavior is only bound by not resulting in UB from safe APIs and the crate-level encapsulation boundary of all of std, this makes writing user unsafe code that utilizes std theoretically impossible, as it now relies on undocumented QOI that unrelated parts of std cannot be caused to misbehave by a misuse of std::collections APIs. In practice, this is a nonconcern, because std has reasonable QOI and an implementation that takes advantage of this freedom is essentially a malicious implementation and only compliant by the most language-lawyer reading of the documentation. To close this hole, we just add a small clause to the existing logic error paragraph that ensures that any misbehavior is limited to the collection which observed the logic error, making it more plausible to prove the soundness of user unsafe code. This is not meant to be formal; a formal refinement would likely need to mention that values derived from the collection can also misbehave after a logic error is observed, as well as define what it means to "observe" a logic error in the first place. This fix errs on the side of informality in order to close the hole without complicating a normal reading which can assume a reasonable nonmalicious QOI. See also [discussion on IRLO][1]. [1]: https://internals.rust-lang.org/t/using-std-collections-and-unsafe-anything-can-happen/16640	2022-05-23 09:20:57 -05:00
Dylan DPC	e5cf3cb97d	Rollup merge of #97087 - Nilstrieb:clarify-slice-iteration-order, r=dtolnay Clarify slice and Vec iteration order While already being inferable from the doc examples, it wasn't fully specified. This is the only logical way to do a slice iterator, so I think this should be uncontroversial. It also improves the `Vec::into_iter` example to better show the order and that the iterator returns owned values.	2022-05-23 07:43:49 +02:00
bors	4a86c7907b	Auto merge of #96605 - Urgau:string-retain-codegen, r=thomcc Improve codegen of String::retain method This pull-request improve the codegen of the `String::retain` method. Using `unwrap_unchecked` helps the optimizer to not generate a panicking path that will never be taken for valid UTF-8 like string. Using `encode_utf8` saves us from an expensive call to `memcpy`, as the optimizer is unable to realize that `ch_len <= 4` and so can generate much better assembly code. https://rust.godbolt.org/z/z73ohenfc	2022-05-21 01:56:51 +00:00
ajtribick	1a41a665cf	Reverse condition in Vec::retain_mut doctest	2022-05-19 20:54:16 +02:00
bors	50872bdb99	Auto merge of #97033 - nbdd0121:unwind3, r=Amanieu Remove libstd's calls to `C-unwind` foreign functions Remove all libstd and its dependencies' usage of `extern "C-unwind"`. This is a prerequisite of a WIP PR which will forbid libraries calling `extern "C-unwind"` functions to be compiled in `-Cpanic=unwind` and linked against `panic_abort` (this restriction is necessary to address soundness bug #96926). Cargo will ensure all crates are compiled with the same `-Cpanic` but the std is only compiled `-Cpanic=unwind` but needs the ability to be linked into `-Cpanic=abort`. Currently there are two places where `C-unwind` is used in libstd: * `__rust_start_panic` is used for interfacing to the panic runtime. This could be `extern "Rust"` * `_{rdl,rg}_oom`: a shim `__rust_alloc_error_handler` will be generated by codegen to call into one of these; they can also be `extern "Rust"` (in fact, the generated shim is used as `extern "Rust"`, so I am not even sure why these are not, probably because they used to be `extern "C"` and were changed to `extern "C-unwind"` when we allow alloc error hooks to unwind, but they really should just be using Rust ABI). For dependencies, there is only one `extern "C-unwind"` function call, in `unwind` crate. This can be expressed as a re-export. More discussions can be seen in the Zulip thread: https://rust-lang.zulipchat.com/#narrow/stream/210922-project-ffi-unwind/topic/soundness.20in.20mixed.20panic.20mode `@rustbot` label: T-libs F-c_unwind	2022-05-19 04:04:40 +00:00
Nilstrieb	4a2214885d	Clarify slice and Vec iteration order While already being inferable from the doc examples, it wasn't fully specified. This is the only logical way to do a slice iterator.	2022-05-16 19:29:45 +02:00
Yuki Okushi	6c6958b531	Rollup merge of #95365 - mkroening:hermit-alloc-error-handler, r=joshtriplett Use default alloc_error_handler for hermit Hermit now properly separates kernel from userspace. Applications for hermit can now use Rust's default `alloc_error_handler` instead of calling the kernel's `__rg_oom`. CC: ``@stlankes``	2022-05-14 13:42:49 +09:00
Gary Guo	68f063bf3f	Use Rust ABI for `__rust_start_panic` and `_{rdl,rg}_oom`	2022-05-14 02:53:59 +01:00
Matthias Krüger	a56211a44e	Rollup merge of #97003 - nnethercote:rm-const_fn-attrs, r=fee1-dead Remove some unnecessary `rustc_allow_const_fn_unstable` attributes. r? `@fee1-dead`	2022-05-13 16:03:25 +02:00
Nicholas Nethercote	fd01fbc058	Remove some unnecessary `rustc_allow_const_fn_unstable` attributes.	2022-05-13 16:01:18 +10:00
bors	1d2ea98cff	Auto merge of #95837 - scottmcm:ptr-offset-from-unsigned, r=oli-obk Add `sub_ptr` on pointers (the `usize` version of `offset_from`) We have `add`/`sub` which are the `usize` versions of `offset`, this adds the `usize` equivalent of `offset_from`. Like how `.add(d)` replaced a whole bunch of `.offset(d as isize)`, you can see from the changes here that it's fairly common that code actually knows the order between the pointers and wants a `usize`, not an `isize`. As a bonus, this can do `sub nuw`+`udiv exact`, rather than `sub`+`sdiv exact`, which can be optimized slightly better because it doesn't have to worry about negatives. That's why the slice iterators weren't using `offset_from`, though I haven't updated that code in this PR because slices are so perf-critical that I'll do it as its own change. This is an intrinsic, like `offset_from`, so that it can eventually be allowed in CTFE. It also allows checking the extra safety condition -- see the test confirming that CTFE catches it if you pass the pointers in the wrong order.	2022-05-12 02:49:00 +00:00
Scott McMurray	003b954a43	Apply CR suggestions; add real tracking issue	2022-05-11 17:16:25 -07:00
Scott McMurray	e76b3f3b5b	Rename `unsigned_offset_from` to `sub_ptr`	2022-05-11 17:16:25 -07:00
Scott McMurray	89a18cb600	Add `unsigned_offset_from` on pointers Like we have `add`/`sub` which are the `usize` version of `offset`, this adds the `usize` equivalent of `offset_from`. Like how `.add(d)` replaced a whole bunch of `.offset(d as isize)`, you can see from the changes here that it's fairly common that code actually knows the order between the pointers and wants a `usize`, not an `isize`. As a bonus, this can do `sub nuw`+`udiv exact`, rather than `sub`+`sdiv exact`, which can be optimized slightly better because it doesn't have to worry about negatives. That's why the slice iterators weren't using `offset_from`, though I haven't updated that code in this PR because slices are so perf-critical that I'll do it as its own change. This is an intrinsic, like `offset_from`, so that it can eventually be allowed in CTFE. It also allows checking the extra safety condition -- see the test confirming that CTFE catches it if you pass the pointers in the wrong order.	2022-05-11 17:16:25 -07:00
bors	0cd939e36c	Auto merge of #96150 - est31:unused_macro_rules, r=petrochenkov Implement a lint to warn about unused macro rules This implements a new lint to warn about unused macro rules (arms/matchers), similar to the `unused_macros` lint added by #41907 that warns about entire macros. ```rust macro_rules! unused_empty { (hello) => { println!("Hello, world!") }; () => { println!("empty") }; //~ ERROR: 1st rule of macro `unused_empty` is never used } fn main() { unused_empty!(hello); } ``` Builds upon #96149 and #96156. Fixes #73576	2022-05-12 00:08:08 +00:00
Matthias Krüger	6c8001b85c	Rollup merge of #96008 - fmease:warn-on-useless-doc-hidden-on-assoc-impl-items, r=lcnr Warn on unused `#[doc(hidden)]` attributes on trait impl items [Zulip conversation](https://rust-lang.zulipchat.com/#narrow/stream/266220-rustdoc/topic/.E2.9C.94.20Validy.20checks.20for.20.60.23.5Bdoc.28hidden.29.5D.60). Whether an associated item in a trait impl is shown or hidden in the documentation entirely depends on the corresponding item in the trait declaration. Rustdoc completely ignores `#[doc(hidden)]` attributes on impl items. No error or warning is emitted: ```rust pub trait Tr { fn f(); } pub struct Ty; impl Tr for Ty { #[doc(hidden)] fn f() {} } // ^^^^^^^^^^^^^^ ignored by rustdoc and currently // no error or warning issued ``` This may lead users to the wrong belief that the attribute has an effect. In fact, several such cases are found in the standard library (I've removed all of them in this PR). There does not seem to exist any incentive to allow this in the future either: Impl'ing a trait for a type means the type fully conforms to its API. Users can add `#[doc(hidden)]` to the whole impl if they want to hide the implementation or add the attribute to the corresponding associated item in the trait declaration to hide the specific item. Hiding an implementation of an associated item does not make much sense: The associated item can still be found on the trait page. This PR emits the warn-by-default lint `unused_attribute` for this case with a future-incompat warning. `@rustbot` label T-compiler T-rustdoc A-lint	2022-05-09 18:45:36 +02:00
bors	8a2fe75d0e	Auto merge of #95960 - jhpratt:remove-rustc_deprecated, r=compiler-errors Remove `#[rustc_deprecated]` This removes `#[rustc_deprecated]` and introduces diagnostics to help users to the right direction (that being `#[deprecated]`). All uses of `#[rustc_deprecated]` have been converted. CI is expected to fail initially; this requires #95958, which includes converting `stdarch`. I plan on following up in a short while (maybe a bootstrap cycle?) removing the diagnostics, as they're only intended to be short-term.	2022-05-09 04:47:30 +00:00
León Orell Valerian Liehr	9d157ada35	Warn on unused doc(hidden) on trait impl items	2022-05-08 22:53:14 +02:00
bors	e209e85e39	Auto merge of #95183 - ibraheemdev:arc-count-acquire, r=Amanieu Weaken needlessly restrictive orderings on `Arc::*_count` There is no apparent reason for these to be `SeqCst`. For reference, [the Boost C++ implementation relies on acquire semantics](`f2cc84a23c/include/boost/smart_ptr/detail/sp_counted_base_std_atomic.hpp (L137-L140)`).	2022-05-06 14:53:24 +00:00
bors	8c4fc9d9a4	Auto merge of #94598 - scottmcm:prefix-free-hasher-methods, r=Amanieu Add a dedicated length-prefixing method to `Hasher` This accomplishes two main goals: - Make it clear who is responsible for prefix-freedom, including how they should do it - Make it feasible for a `Hasher` that doesn't care about Hash-DoS resistance to get better performance by not hashing lengths This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future. Fixes #94026 r? rust-lang/libs --- The core of this change is the following two new methods on `Hasher`: ```rust pub trait Hasher { /// Writes a length prefix into this hasher, as part of being prefix-free. /// /// If you're implementing [`Hash`] for a custom collection, call this before /// writing its contents to this `Hasher`. That way /// `(collection![1, 2, 3], collection![4, 5])` and /// `(collection![1, 2], collection![3, 4, 5])` will provide different /// sequences of values to the `Hasher` /// /// The `impl<T> Hash for [T]` includes a call to this method, so if you're /// hashing a slice (or array or vector) via its `Hash::hash` method, /// you should not call this yourself. /// /// This method is only for providing domain separation. If you want to /// hash a `usize` that represents part of the data, then it's important /// that you pass it to [`Hasher::write_usize`] instead of to this method. 
/// /// # Examples /// /// ``` /// #![feature(hasher_prefixfree_extras)] /// # // Stubs to make the `impl` below pass the compiler /// # struct MyCollection<T>(Option<T>); /// # impl<T> MyCollection<T> { /// # fn len(&self) -> usize { todo!() } /// # } /// # impl<'a, T> IntoIterator for &'a MyCollection<T> { /// # type Item = T; /// # type IntoIter = std::iter::Empty<T>; /// # fn into_iter(self) -> Self::IntoIter { todo!() } /// # } /// /// use std::hash::{Hash, Hasher}; /// impl<T: Hash> Hash for MyCollection<T> { /// fn hash<H: Hasher>(&self, state: &mut H) { /// state.write_length_prefix(self.len()); /// for elt in self { /// elt.hash(state); /// } /// } /// } /// ``` /// /// # Note to Implementers /// /// If you've decided that your `Hasher` is willing to be susceptible to /// Hash-DoS attacks, then you might consider skipping hashing some or all /// of the `len` provided in the name of increased performance. #[inline] #[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")] fn write_length_prefix(&mut self, len: usize) { self.write_usize(len); } /// Writes a single `str` into this hasher. /// /// If you're implementing [`Hash`], you generally do not need to call this, /// as the `impl Hash for str` does, so you can just use that. /// /// This includes the domain separator for prefix-freedom, so you should /// not call `Self::write_length_prefix` before calling this. /// /// # Note to Implementers /// /// The default implementation of this method includes a call to /// [`Self::write_length_prefix`], so if your implementation of `Hasher` /// doesn't care about prefix-freedom and you've thus overridden /// that method to do nothing, there's no need to override this one. /// /// This method is available to be overridden separately from the others /// as `str` being UTF-8 means that it never contains `0xFF` bytes, which /// can be used to provide prefix-freedom cheaper than hashing a length. 
/// /// For example, if your `Hasher` works byte-by-byte (perhaps by accumulating /// them into a buffer), then you can hash the bytes of the `str` followed /// by a single `0xFF` byte. /// /// If your `Hasher` works in chunks, you can also do this by being careful /// about how you pad partial chunks. If the chunks are padded with `0x00` /// bytes then just hashing an extra `0xFF` byte doesn't necessarily /// provide prefix-freedom, as `"ab"` and `"ab\u{0}"` would likely hash /// the same sequence of chunks. But if you pad with `0xFF` bytes instead, /// ensuring at least one padding byte, then it can often provide /// prefix-freedom cheaper than hashing the length would. #[inline] #[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")] fn write_str(&mut self, s: &str) { self.write_length_prefix(s.len()); self.write(s.as_bytes()); } } ``` With updates to the `Hash` implementations for slices and containers to call `write_length_prefix` instead of `write_usize`. `write_str` defaults to using `write_length_prefix` since, as was pointed out in the issue, the `write_u8(0xFF)` approach is insufficient for hashers that work in chunks, as those would hash `"a\u{0}"` and `"a"` to the same thing. But since `SipHash` works byte-wise (there's an internal buffer to accumulate bytes until a full chunk is available) it overrides `write_str` to continue to use the add-non-UTF-8-byte approach. --- Compatibility: Because the default implementation of `write_length_prefix` calls `write_usize`, the changed hash implementation for slices will do the same thing the old one did on existing `Hasher`s.	2022-05-06 09:43:57 +00:00
Scott McMurray	98054377ee	Add a dedicated length-prefixing method to `Hasher` This accomplishes two main goals: - Make it clear who is responsible for prefix-freedom, including how they should do it - Make it feasible for a `Hasher` that doesn't care about Hash-DoS resistance to get better performance by not hashing lengths This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.	2022-05-06 00:03:38 -07:00
est31	5646e9a172	Allow unused rules in some places in the compiler, library and tools	2022-05-05 19:13:00 +02:00
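
The "Tweak insert docs" commits above concern the boolean that `insert` returns. As a minimal sketch of that behavior on stable Rust (the helper function names below are hypothetical, chosen only for illustration): on the set types, `insert` returns whether the value was newly inserted, while the map types return the previous value as an `Option` instead of a `bool`.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Hypothetical helper: inserts the same value twice into a `HashSet`
/// and returns both results in order.
fn insert_twice_hash(value: i32) -> (bool, bool) {
    let mut set = HashSet::new();
    // Tuple fields are evaluated left to right, so the first call sees an empty set.
    (set.insert(value), set.insert(value))
}

/// Hypothetical helper: the same experiment on a `BTreeSet`.
fn insert_twice_btree(value: i32) -> (bool, bool) {
    let mut set = BTreeSet::new();
    (set.insert(value), set.insert(value))
}

/// Hypothetical helper: `HashMap::insert` reports the previous value
/// for the key rather than a `bool`.
fn map_insert_twice() -> (Option<i32>, Option<i32>) {
    let mut map = HashMap::new();
    (map.insert("key", 1), map.insert("key", 2))
}

fn main() {
    // The first insert adds the value, the second finds it already present.
    println!("{:?}", insert_twice_hash(7));
    println!("{:?}", insert_twice_btree(7));
    println!("{:?}", map_insert_twice());
}
```

Read as "returns whether the value was newly inserted", the `true` on the first call and `false` on the second line up with the wording the commit message asks for.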


1670 Commits
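
The `Vec::as_ptr` clarification listed above (#97521) says that a vector which has not allocated still hands out a pointer that is non-null and valid for zero-sized reads. A minimal sketch of what that permits, assuming a non-zero-sized element type such as `u32` (the function name is hypothetical):

```rust
/// Hypothetical helper: checks that an unallocated `Vec` still yields a
/// pointer usable for a zero-length slice.
fn empty_vec_ptr_is_usable() -> bool {
    // `Vec::new()` does not allocate, so the capacity is 0 for `u32`.
    let v: Vec<u32> = Vec::new();
    let p = v.as_ptr();

    // SAFETY: a zero-length slice only needs a non-null, aligned pointer,
    // which is the property the clarified documentation describes.
    let empty: &[u32] = unsafe { std::slice::from_raw_parts(p, 0) };

    v.capacity() == 0 && !p.is_null() && empty.is_empty()
}

fn main() {
    println!("usable: {}", empty_vec_ptr_is_usable());
}
```

Reading or writing any actual element through this pointer would still be invalid, since the capacity is 0.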