22 Commits

Author SHA1 Message Date
Mark Rousskov
f1bcb0f3af Delete Decoder::read_unit 2022-02-22 18:14:51 -05:00
Mark Rousskov
2098ea6eba Provide copy-free access to raw Decoder bytes 2022-02-22 18:11:59 -05:00
Mark Rousskov
da3b2ca956 Provide raw &str access from Decoder 2022-02-22 18:05:51 -05:00
bjorn3
0b8f3729fb Remove two unnecessary transmutes from opaque Encoder and Decoder 2022-01-31 18:25:05 +01:00
Nicholas Nethercote
37fbd91eb5 Address review comments. 2022-01-22 10:38:34 +11:00
Nicholas Nethercote
416399dc10 Make Decodable and Decoder infallible.
`Decoder` has two impls:
- opaque: this impl is already partly infallible, i.e. in some places it
  currently panics on failure (e.g. if the input is too short, or on a
  bad `Result` discriminant), and in some places it returns an error
  (e.g. on a bad `Option` discriminant). The number of places where
  either happens is surprisingly small, just because the binary
  representation has very little redundancy and a lot of input reading
  can occur even on malformed data.
- json: this impl is fully fallible, but it's only used (a) for the
  `.rlink` file production, and there's a `FIXME` comment suggesting it
  should change to a binary format, and (b) in a few tests in
  non-fundamental ways. Indeed #85993 is open to remove it entirely.

And the top-level places in the compiler that call into decoding just
abort on error anyway. So the fallibility is providing little value, and
getting rid of it leads to some non-trivial performance improvements.

Much of this commit is pretty boring and mechanical. Some notes about
a few interesting parts:
- The commit removes `Decoder::{Error,error}`.
- `InternIteratorElement::intern_with`: the impl for `T` now has the same
  optimization for small counts that the impl for `Result<T, E>` has,
  because it's now much hotter.
- Decodable impls for SmallVec, LinkedList, VecDeque now all use
  `collect`, which is nice; the one for `Vec` uses unsafe code, because
  that gave better perf on some benchmarks.
2022-01-22 10:38:31 +11:00
Nicholas Nethercote
88600a6d7f Rename Decoder::read_nil and read_unit.
Because `()` is called "unit" and it makes it match
`Encoder::emit_unit`.
2022-01-22 10:22:24 +11:00
bors
38c22af015 Auto merge of #92604 - nnethercote:optimize-impl_read_unsigned_leb128, r=michaelwoerister
Optimize `impl_read_unsigned_leb128`

I see instruction count improvements of up to 3.5% locally with these changes, mostly on the smaller benchmarks.

r? `@michaelwoerister`
2022-01-15 07:27:30 +00:00
Nicholas Nethercote
5f549d9b49 Modify the buffer position directly when reading leb128 values.
It's a small but clear performance win.
2022-01-07 10:40:51 +11:00
Jakub Beránek
6c9ffe4aec
Do not use LEB128 for encoding u16 and i16 2021-12-28 09:29:08 +01:00
The 8472
c640f31c9f avoid string validation in rustc_serialize, check a marker byte instead
since the serialization format isn't self-describing we need a way to detect
when encoder and decoder don't match up. but that doesn't have to
be utf8 validation for strings, which does cost a few % of performance.
Instead we can use a marker byte at the end to be reasonably
sure that we're dealing with a string and it wasn't overwritten in some
way.
2021-12-06 18:43:01 +01:00
Michael Woerister
517d5ac230 Allow for reading raw bytes from rustc_serialize::Decoder without unsafe code. 2021-03-25 14:05:00 +01:00
Camille GILLOT
09a638820e Move raw bytes handling to Encoder/Decoder. 2021-03-19 19:35:22 +01:00
Camille GILLOT
e5d09fbbe9 Simplify IntEncodedWithFixedSize. 2021-03-18 20:09:00 +01:00
Camille GILLOT
5003b3dc31 Move IntEncodedWithFixedSize to rustc_serialize. 2021-03-18 20:09:00 +01:00
Tyson Nottingham
f15fae822e rustc_serialize: fix incorrect signed LEB128 decoding
The signed LEB128 decoding function used a hardcoded constant of 64
instead of the number of bits in the type of integer being decoded,
which resulted in incorrect results for some inputs. Fix this, make the
decoding more consistent with the unsigned version, and increase the
LEB128 encoding and decoding test coverage.
2021-01-11 12:13:26 -08:00
Tyson Nottingham
52f21791fb Serialize incr comp structures to file via fixed-size buffer
Reduce a large memory spike that happens during serialization by writing
the incr comp structures to file by way of a fixed-size buffer, rather
than an unbounded vector.

Effort was made to keep the instruction count close to that of the
previous implementation. However, buffered writing to a file inherently
has more overhead than writing to a vector, because each write may
result in a handleable error. To reduce this overhead, arrangements are
made so that each LEB128-encoded integer can be written to the buffer
with only one capacity and error check. Higher-level optimizations in
which entire composite structures can be written with one capacity and
error check are possible, but would require much more work.

The performance is mostly on par with the previous implementation, with
small to moderate instruction count regressions. The memory reduction is
significant, however, so it seems like a worth-while trade-off.
2021-01-11 12:13:22 -08:00
Tyson Nottingham
be79f493fb rustc_serialize: specialize opaque decoding of some u8 sequences 2021-01-01 22:49:16 -08:00
Tyson Nottingham
7c6274d464 rustc_serialize: have read_raw_bytes take MaybeUninit<u8> slice 2021-01-01 22:49:16 -08:00
Tyson Nottingham
a4daa63a90 rustc_serialize: specialize opaque encoding of some u8 sequences 2021-01-01 22:49:14 -08:00
est31
a0fc455d30 Replace absolute paths with relative ones
Modern compilers allow reaching external crates
like std or core via relative paths in modules
outside of lib.rs and main.rs.
2020-10-13 14:16:45 +02:00
mark
9e5f7d5631 mv compiler to compiler/ 2020-08-30 18:45:07 +03:00