Lynne 35080149ef
x86/tx_float: mark AVX2 functions as AVXSLOW
Makes Bulldozer prefer the AVX functions over the AVX2 ones,
which are 64% slower there:

AVX:  117653 decicycles in av_tx (fft), 1048535 runs,     41 skips
AVX2: 193385 decicycles in av_tx (fft), 1048561 runs,     15 skips

The only difference between the two is that vgatherdpd is used in
the AVX2 version. We don't want to mark them with the new SLOW_GATHER
flag, however, since gathers are still faster than plain loads on
Haswell/Zen 2/3.
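
For illustration only, a minimal standalone sketch of how a "slow" marker
on a function can steer selection. This is not the actual lavu/tx code:
the flag values, the Codelet struct, the priorities, the penalty and
pick_codelet() are all hypothetical names and numbers chosen for the
example.

```c
#include <stddef.h>

/* Hypothetical flag bits for illustration; not FFmpeg's actual values. */
enum {
    CPU_AVX     = 1 << 0,
    CPU_AVX2    = 1 << 1,
    CPU_AVXSLOW = 1 << 2, /* CPU reports that wide AVX ops are slow */
};

typedef struct Codelet {
    const char *name;
    int flags; /* required ISA bits, plus any "slow" markers */
    int prio;  /* base priority, higher is preferred */
} Codelet;

/* Hypothetical table: the AVX2 version carries the AVXSLOW marker. */
static const Codelet codelets[] = {
    { "fft_avx",  CPU_AVX,                100 },
    { "fft_avx2", CPU_AVX2 | CPU_AVXSLOW, 200 },
};

/* Pick the highest-priority codelet that the CPU can run. A codelet
 * whose "slow" marker matches a "slow" flag on the CPU is penalized,
 * so a less specialized version can win on that CPU. */
static const char *pick_codelet(int cpu_flags)
{
    const int isa_mask = CPU_AVX | CPU_AVX2;
    const Codelet *best = NULL;
    int best_prio = 0;

    for (size_t i = 0; i < sizeof(codelets) / sizeof(codelets[0]); i++) {
        const Codelet *c = &codelets[i];
        int prio = c->prio;

        if (c->flags & isa_mask & ~cpu_flags)
            continue; /* a required instruction set is missing */
        if (c->flags & cpu_flags & CPU_AVXSLOW)
            prio -= 150; /* assumed penalty, large enough to reorder */
        if (!best || prio > best_prio) {
            best      = c;
            best_prio = prio;
        }
    }
    return best ? best->name : "fft_c"; /* plain C fallback */
}
```

In this sketch, a CPU reporting AVX, AVX2 and AVXSLOW ends up with the AVX
version, while a CPU reporting only AVX and AVX2 still gets the AVX2
version.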
2022-01-29 03:08:16 +01:00