Rollup merge of #89114 - dequbed:c-char, r=yaahc
Fixes a technicality regarding the size of C's `char` type Specifically, ISO/IEC 9899:2018 — better known as "C18" — (and at least C11, C99 and C89) do not specify the size of `byte` in bits. Section 3.6 defines "byte" as "addressable unit of data storage" while section 6.2.5 ("Types") only defines "char" as "large enough to store any member of the basic execution set" giving it a lower bound of 7 bit (since there are 96 characters in the basic execution set). With section 6.5.3.4 paragraph 4 "When sizeof is applied to an operant that has type char […] the result is 1" you could read this as the size of `char` in bits being defined as exactly the same as the number of bits in a byte but it's also valid to read that as an exception. In general implementations take `char` as the smallest unit of addressable memory, which for modern byte-addressed architectures is overwhelmingly 8 bits to the point of this convention being completely cemented into just about all of our software. So is any of this actually relevant at all? I hope not. I sincerely hope that this never, ever comes up. But if for some reason a poor rustacean is having to interface with C code running on a Cray X1 that in 2003 is still doing word-addressed memory with 64-bit chars and they trust the docs here blindly it will blow up in her face. And I'll be truly sorry for her to have to deal with … all of that.
This commit is contained in:
commit
8a6e9cf074
@ -1,6 +1,6 @@
|
||||
Equivalent to C's `char` type.
|
||||
|
||||
[C's `char` type] is completely unlike [Rust's `char` type]; while Rust's type represents a unicode scalar value, C's `char` type is just an ordinary integer. This type will always be either [`i8`] or [`u8`], as the type is defined as being one byte long.
|
||||
[C's `char` type] is completely unlike [Rust's `char` type]; while Rust's type represents a unicode scalar value, C's `char` type is just an ordinary integer. On modern architectures this type will always be either [`i8`] or [`u8`], as they use byte-addresses memory with 8-bit bytes.
|
||||
|
||||
C chars are most commonly used to make C strings. Unlike Rust, where the length of a string is included alongside the string, C strings mark the end of a string with the character `'\0'`. See [`CStr`] for more information.
|
||||
|
||||
|
Loading…
Reference in New Issue
Block a user