In local benchmarks, this results in 0.4% fewer cycles in a critical sequential section when compiling libcore.
sha1
sha2
md5