Add `f16` and `f128` inline ASM support for `x86` and `x86-64`
This PR adds `f16` and `f128` input and output support to inline ASM on `x86` and `x86-64`. The `f16` vector sizes are taken from the [Intel Intrinsics Guide](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html).
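For illustration only, here is a minimal sketch of the operand form involved, written with `f32` so that it builds on stable. The helper name `copy_via_xmm` is hypothetical and not part of this PR. The assumption is that, on a nightly toolchain with the `f16` and `f128` features enabled, the same `xmm_reg` operands accept the new types.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Hypothetical helper: copies a float through an SSE register with inline ASM.
/// `f32` is used here because it works on stable; the assumption is that on
/// nightly, with `#![feature(f16, f128)]`, `f16` or `f128` could be used for
/// `x` and `y` in the same way.
#[cfg(target_arch = "x86_64")]
fn copy_via_xmm(x: f32) -> f32 {
    let y: f32;
    // SAFETY: a register-to-register move touches no memory and has no side effects.
    unsafe {
        asm!(
            "movaps {out}, {inp}",
            out = lateout(xmm_reg) y,
            inp = in(xmm_reg) x,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    y
}

/// Fallback so the sketch still compiles on targets other than `x86-64`.
#[cfg(not(target_arch = "x86_64"))]
fn copy_via_xmm(x: f32) -> f32 {
    x
}
```

Calling `copy_via_xmm(1.5)` returns the value it was given unchanged, since the ASM only moves it between registers.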
Relevant issue: #125398
Tracking issue: #116909
``@rustbot`` label +F-f16_and_f128