66e1f11ef4
llvm is currently not able to conver `Vec::extend` into a memcpy for `Copy` types, which results in methods like `Vec::push_all` to run twice as slow as it should be running. This patch takes the unsafe `Vec::clone` optimization to speed up all the operations that are cloning a slice into a `Vec`. before: ``` test vec::tests::bench_clone_from_0000_0000 ... bench: 12 ns/iter (+/- 2) test vec::tests::bench_clone_from_0000_0010 ... bench: 125 ns/iter (+/- 4) = 80 MB/s test vec::tests::bench_clone_from_0000_0100 ... bench: 360 ns/iter (+/- 33) = 277 MB/s test vec::tests::bench_clone_from_0000_1000 ... bench: 2601 ns/iter (+/- 175) = 384 MB/s test vec::tests::bench_clone_from_0010_0000 ... bench: 12 ns/iter (+/- 2) test vec::tests::bench_clone_from_0010_0010 ... bench: 125 ns/iter (+/- 10) = 80 MB/s test vec::tests::bench_clone_from_0010_0100 ... bench: 361 ns/iter (+/- 28) = 277 MB/s test vec::tests::bench_clone_from_0100_0010 ... bench: 131 ns/iter (+/- 13) = 76 MB/s test vec::tests::bench_clone_from_0100_0100 ... bench: 360 ns/iter (+/- 9) = 277 MB/s test vec::tests::bench_clone_from_0100_1000 ... bench: 2575 ns/iter (+/- 168) = 388 MB/s test vec::tests::bench_clone_from_1000_0100 ... bench: 356 ns/iter (+/- 20) = 280 MB/s test vec::tests::bench_clone_from_1000_1000 ... bench: 2605 ns/iter (+/- 167) = 383 MB/s test vec::tests::bench_from_slice_0000 ... bench: 11 ns/iter (+/- 0) test vec::tests::bench_from_slice_0010 ... bench: 115 ns/iter (+/- 5) = 86 MB/s test vec::tests::bench_from_slice_0100 ... bench: 309 ns/iter (+/- 170) = 323 MB/s test vec::tests::bench_from_slice_1000 ... bench: 2065 ns/iter (+/- 198) = 484 MB/s test vec::tests::bench_push_all_0000_0000 ... bench: 7 ns/iter (+/- 0) test vec::tests::bench_push_all_0000_0010 ... bench: 79 ns/iter (+/- 7) = 126 MB/s test vec::tests::bench_push_all_0000_0100 ... bench: 342 ns/iter (+/- 18) = 292 MB/s test vec::tests::bench_push_all_0000_1000 ... bench: 2873 ns/iter (+/- 75) = 348 MB/s test vec::tests::bench_push_all_0010_0010 ... bench: 154 ns/iter (+/- 8) = 64 MB/s test vec::tests::bench_push_all_0100_0100 ... bench: 518 ns/iter (+/- 18) = 193 MB/s test vec::tests::bench_push_all_1000_1000 ... bench: 4490 ns/iter (+/- 223) = 222 MB/s ``` after: ``` test vec::tests::bench_clone_from_0000_0000 ... bench: 12 ns/iter (+/- 1) test vec::tests::bench_clone_from_0000_0010 ... bench: 123 ns/iter (+/- 5) = 81 MB/s test vec::tests::bench_clone_from_0000_0100 ... bench: 367 ns/iter (+/- 23) = 272 MB/s test vec::tests::bench_clone_from_0000_1000 ... bench: 2618 ns/iter (+/- 252) = 381 MB/s test vec::tests::bench_clone_from_0010_0000 ... bench: 12 ns/iter (+/- 1) test vec::tests::bench_clone_from_0010_0010 ... bench: 124 ns/iter (+/- 7) = 80 MB/s test vec::tests::bench_clone_from_0010_0100 ... bench: 369 ns/iter (+/- 34) = 271 MB/s test vec::tests::bench_clone_from_0100_0010 ... bench: 123 ns/iter (+/- 6) = 81 MB/s test vec::tests::bench_clone_from_0100_0100 ... bench: 371 ns/iter (+/- 25) = 269 MB/s test vec::tests::bench_clone_from_0100_1000 ... bench: 2713 ns/iter (+/- 532) = 368 MB/s test vec::tests::bench_clone_from_1000_0100 ... bench: 369 ns/iter (+/- 14) = 271 MB/s test vec::tests::bench_clone_from_1000_1000 ... bench: 2611 ns/iter (+/- 194) = 382 MB/s test vec::tests::bench_from_slice_0000 ... bench: 7 ns/iter (+/- 0) test vec::tests::bench_from_slice_0010 ... bench: 108 ns/iter (+/- 4) = 92 MB/s test vec::tests::bench_from_slice_0100 ... bench: 235 ns/iter (+/- 24) = 425 MB/s test vec::tests::bench_from_slice_1000 ... bench: 1318 ns/iter (+/- 96) = 758 MB/s test vec::tests::bench_push_all_0000_0000 ... bench: 7 ns/iter (+/- 0) test vec::tests::bench_push_all_0000_0010 ... bench: 70 ns/iter (+/- 4) = 142 MB/s test vec::tests::bench_push_all_0000_0100 ... bench: 176 ns/iter (+/- 16) = 568 MB/s test vec::tests::bench_push_all_0000_1000 ... bench: 1125 ns/iter (+/- 94) = 888 MB/s test vec::tests::bench_push_all_0010_0010 ... bench: 159 ns/iter (+/- 15) = 62 MB/s test vec::tests::bench_push_all_0100_0100 ... bench: 363 ns/iter (+/- 12) = 275 MB/s test vec::tests::bench_push_all_1000_1000 ... bench: 2860 ns/iter (+/- 415) = 349 MB/s ``` This also includes extra benchmarks for `Vec` and `MemWriter`.