Avoid many `cmt` allocations.
`cmt` is a ref-counted wrapper around `cmt_` The use of refcounting
keeps `cmt` handling simple, but a lot of `cmt` instances are very
short-lived, and heap-allocating the short-lived ones takes up time.
This patch changes things in the following ways.
- Most of the functions that produced `cmt` instances now produce `cmt_`
instances. The `Rc::new` calls that occurred within those functions
now occur at their call sites (but only when necessary, which isn't
that often).
- Many of the functions that took `cmt` arguments now take `&cmt_`
arguments. This includes all the methods in the `Delegate` trait.
As a result, the vast majority of the heap allocations are avoided. In
an extreme case, the number of calls to malloc in tuple-stress drops
from 9.9M to 7.9M, a drop of 20%. And the compile times for many runs of
coercions, deep-vector, and tuple-stress drop by 1--2%.