FFmpeg/x86 at 82a68a8771ca39564f6a74e0f875d6852e7a0c2a

har0ke/FFmpeg

Lynne 82a68a8771

x86/tx_float: remove vgatherdpd usage

Compared with individual loads, vgatherdpd is at best just as fast (Skylake),
a few percent slower (Alderlake), 8% slower (Zen 3), or completely
disastrous (older/other CPUs).

Sadly, gathers never turned out to be fast on x86, even with the benefit of
time and implementation experience.

This also saves a register, as there's no need to fill out an additional
register mask.
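
To illustrate the trade-off described above, here is a minimal C model of what a 4-lane vgatherdpd does, next to the same work done with individual loads. This is a sketch under assumptions, not FFmpeg's assembly: the type `cplx_f32` and the functions `gather4_model` and `load4_individual` are hypothetical names chosen for illustration. The model reflects two documented properties of the AVX2 gather: each 64-bit lane (here one single-precision complex value) is loaded only if the sign bit of its mask lane is set, and the instruction clears the mask as it completes, so the mask register has to be refilled before every gather.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One single-precision complex sample: 64 bits, the element size that a
 * vgatherdpd lane moves. Hypothetical type for illustration only. */
typedef struct { float re, im; } cplx_f32;

/* Scalar model of a 4-lane gather with 32-bit indices and 64-bit elements.
 * A lane is loaded only when the sign bit of its mask entry is set, and
 * the mask is zeroed afterwards, as the real instruction does. */
static void gather4_model(cplx_f32 dst[4], const cplx_f32 *src,
                          const int32_t idx[4], uint64_t mask[4])
{
    assert(dst != NULL && src != NULL && idx != NULL && mask != NULL);
    for (int i = 0; i < 4; i++) {
        if (mask[i] >> 63)          /* lane enabled by the mask sign bit */
            dst[i] = src[idx[i]];
        mask[i] = 0;                /* mask is consumed by the gather */
    }
}

/* The replacement strategy: four explicit indexed loads. No mask is
 * involved, so no extra register has to be prepared. */
static void load4_individual(cplx_f32 dst[4], const cplx_f32 *src,
                             const int32_t idx[4])
{
    assert(dst != NULL && src != NULL && idx != NULL);
    dst[0] = src[idx[0]];
    dst[1] = src[idx[1]];
    dst[2] = src[idx[2]];
    dst[3] = src[idx[3]];
}
```

With an all-ones mask both functions produce the same four values; the difference is only in how the hardware executes them and in the extra mask register the gather needs.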

Zen 3 (16384-point transform):
Before: 1561050 decicycles in           av_tx (fft),  131072 runs,      0 skips
After:  1449621 decicycles in           av_tx (fft),  131072 runs,      0 skips
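
As a quick check of how these two timings relate to the "8% slower (Zen 3)" figure in the message, the following C sketch computes the relative difference both ways. The helper names `percent_slower` and `percent_saved` are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Percentage by which the `before` timing exceeds the `after` timing,
 * i.e. how much slower the old code was relative to the new code. */
static double percent_slower(double before, double after)
{
    assert(after > 0.0);
    return (before - after) / after * 100.0;
}

/* Percentage of the `before` timing that the change removed. */
static double percent_saved(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

Calling `percent_slower(1561050.0, 1449621.0)` gives about 7.7, which rounds to the 8% quoted for Zen 3, while `percent_saved` with the same arguments gives about 7.1.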

Alderlake:
2% slower on big transforms (65536), to 1% (131072), to a few percent for smaller
sizes.

2022-05-20 10:12:34 +02:00

asm.h

…

bswap.h

…

cpu.c

avutil/cpu: add AVX512 Icelake flag

2022-03-10 16:45:48 -03:00

cpu.h

avutil/cpu: add AVX512 Icelake flag

2022-03-10 16:45:48 -03:00

cpuid.asm

libavutil: include assembly with full path from source root

2022-02-08 10:42:26 +01:00

emms.asm

libavutil: include assembly with full path from source root

2022-02-08 10:42:26 +01:00

emms.h

avutil/x86/emms: Don't unnecessarily include lavu/cpu.h

2022-02-21 12:37:51 +01:00

fixed_dsp_init.c

…

fixed_dsp.asm

libavutil: include assembly with full path from source root

2022-02-08 10:42:26 +01:00

float_dsp_init.c

…

float_dsp.asm

libavutil: include assembly with full path from source root

2022-02-08 10:42:26 +01:00

imgutils_init.c

Remove unnecessary libavutil/(avutil|common|internal).h inclusions

2022-02-24 12:56:49 +01:00

imgutils.asm

…

intmath.h

x86/intmath: add VEX encoded versions of av_clipf() and av_clipd()

2021-11-19 11:21:03 -03:00

intreadwrite.h

…

lls_init.c

…

lls.asm

libavutil: include assembly with full path from source root

2022-02-08 10:42:26 +01:00

Makefile

x86/tx_float: do not build tx_float_init.c if x86 assembly is disabled

2022-01-27 02:17:46 +01:00

pixelutils_init.c

…

pixelutils.asm

libavutil: include assembly with full path from source root

2022-02-08 10:42:26 +01:00

pixelutils.h

…

timer.h

…

tx_float_init.c

x86/tx_float: remove vgatherdpd usage

2022-05-20 10:12:34 +02:00

tx_float.asm

x86/tx_float: remove vgatherdpd usage

2022-05-20 10:12:34 +02:00

w64xmmtest.h

…

x86inc.asm

avutil/cpu: add AVX512 Icelake flag

2022-03-10 16:45:48 -03:00

x86util.asm

…