with_end_to_cap is enormously expensive now that it's initializing
memory since it involves 64k allocation + memset on every call. This is
most noticable when calling read_to_end on very small readers, where the
new version if **4 orders of magnitude** faster.
BufReader also depended on with_end_to_cap so I've rewritten it in its
original form.
As a bonus, converted the buffered IO struct Debug impls to use the
debug builders.
Fixes#23815