Björn Steinbrink 36dccec2f3 Currently, LLVM lowers a cttz8 on x86_64 to these instructions:
```asm
    movzbl      %dil, %eax
    bsfl        %eax, %eax
    movl        $32, %ecx
    cmovnel     %eax, %ecx
    cmpl        $32, %ecx
    movl        $8, %eax
    cmovnel     %ecx, %eax
```

which has some unnecessary overhead, having two conditional moves.

To improve the codegen, we can zero extend the 8 bit integer, then set
bit 8 and perform a cttz operation on the extended value. That way
there's no conditional operation involved at all.
2015-04-29 14:45:23 +02:00
..