This adds a very simple LRU-like cache which stores the locations of
often-used tags. While the implementation is very simple, the cache hit
rate is incredible at ~99.9% on most programs, and often the element at
position 0 in the cache has a hit rate of 90%. So the sub-optimality of
this cache basicaly vanishes into the noise in a profile.
Additionally, we keep a range which denotes where there might be an item
granting Unique permission in the stack, so that when we invalidate
Uniques we do not need to scan much of the stack, and often scan nothing
at all.
This PR uses the `measureme` crate to profile the call stack of the
program being interpreted by Miri. This is accomplished by starting a
measureme 'event' when we enter a function call, and ending the event
when we exit the call. The `measureme` tooling can be used to produce a
call stack from the generated profile data.
Limitations:
* We currently record every single entry/exit. This might generate very
large profile outputs for programs with a large number of function
calls. In follow-up work, we might want to explore sampling (e.g. only
recording every N function calls).
* This does not integrate very well with Miri's concurrency support.
Each event we record starts when we push a frame, and ends when we pop
a frame. As a result, switching between virtual threads will cause
events from different threads to be interleaved. Additionally, the
recorded for a particular frame will include all of the work Miri does
before that frame completes, including executing another thread.
The `measureme` integration is off by default, and must be enabled via
`-Zmiri-measureme=<output_name>`
Fixes#1057
Since we are no longer using "cargo rustc", we now use the rustc
arguments passed by Cargo to determine whether we are building a
build dependency, normal dependency, or "target" (final binary or test)
crate.