James Almer 
							
						 
					 
					
						
						
						
						
							
						
						
							d5b3077ecf 
							
						 
					 
					
						
						
							
							x86/pixelutils: add missing preprocessor wrapper to the AVX2 functions  
						
						Should fix compilation with old yasm/nasm
Signed-off-by: James Almer <jamrial@gmail.com> 
						
						
					 
					
						2018-07-31 22:14:42 -03:00 
						 
				 
			
				
					
						
							
							
								Jun Zhao 
							
						 
					 
					
						
						
						
						
							
						
						
							d36b8394f4 
							
						 
					 
					
						
						
							
							avutil/pixelutils: sad_32x32 sse2/avx2 optimizations.  
						
						Add ff_pixelutils_sad_32x32_sse2, ff_pixelutils_sad_{a,u}_32x32_sse2,
ff_pixelutils_sad_32x32_avx2, and ff_pixelutils_sad_{a,u}_32x32_avx2.
Profiling with perf record/report (instructions:u) for the AVX2 sad_32x32:
  72.05%  pixelutils  pixelutils     [.] block_sad_32x32_c
  18.50%  pixelutils  pixelutils     [.] block_sad_16x16_c
   4.78%  pixelutils  pixelutils     [.] block_sad_8x8_c
   2.69%  pixelutils  pixelutils     [.] block_sad_4x4_c
   0.89%  pixelutils  pixelutils     [.] block_sad_2x2_c
   0.16%  pixelutils  pixelutils     [.] ff_pixelutils_sad_32x32_avx2
   0.16%  pixelutils  pixelutils     [.] ff_pixelutils_sad_u_32x32_avx2
   0.12%  pixelutils  pixelutils     [.] ff_pixelutils_sad_a_32x32_avx2
The same profile (instructions:u) for the SSE2 sad_32x32:
  71.86%  pixelutils  pixelutils     [.] block_sad_32x32_c
  18.42%  pixelutils  pixelutils     [.] block_sad_16x16_c
   4.81%  pixelutils  pixelutils     [.] block_sad_8x8_c
   2.68%  pixelutils  pixelutils     [.] block_sad_4x4_c
   0.88%  pixelutils  pixelutils     [.] block_sad_2x2_c
   0.29%  pixelutils  pixelutils     [.] ff_pixelutils_sad_32x32_sse2
   0.26%  pixelutils  pixelutils     [.] ff_pixelutils_sad_u_32x32_sse2
   0.23%  pixelutils  pixelutils     [.] ff_pixelutils_sad_a_32x32_sse2
Signed-off-by: Jun Zhao <mypopydev@gmail.com> 
						
						
					 
					
						2018-07-31 19:17:51 +08:00 
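
For readers unfamiliar with the operation being profiled above: a SAD (sum of absolute differences) over a 32x32 block adds up |src1 - src2| for each of the 1024 byte pairs. The following is a minimal scalar sketch of that computation, assuming 8-bit samples and per-buffer line strides. The name `sad_32x32_ref` and its signature are hypothetical and chosen for illustration; this is not the code from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference helper (not FFmpeg's implementation):
 * sum of absolute differences over a 32x32 block of 8-bit samples.
 * stride1/stride2 are the distances in bytes between consecutive rows. */
static int sad_32x32_ref(const uint8_t *src1, ptrdiff_t stride1,
                         const uint8_t *src2, ptrdiff_t stride2)
{
    int sum = 0;

    assert(src1 && src2);
    for (int y = 0; y < 32; y++) {
        for (int x = 0; x < 32; x++)
            sum += abs(src1[x] - src2[x]);  /* operands promote to int */
        src1 += stride1;                    /* advance to the next row */
        src2 += stride2;
    }
    return sum;                             /* at most 255 * 1024 */
}
```

A SIMD version performs the same reduction but handles 16 (SSE2) or 32 (AVX2) bytes per instruction, which is consistent with the small share the `ff_pixelutils_sad_*` symbols take in the profiles above.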
						 
				 
			
				
					
						
							
							
								Jun Zhao 
							
						 
					 
					
						
						
						
						
							
						
						
							09628cb1b4 
							
						 
					 
					
						
						
							
							avutil/pixelutils: correct the function name in comments  
						
						Signed-off-by: Jun Zhao <mypopydev@gmail.com> 
						
						
					 
					
						2018-07-11 20:12:33 +08:00 
						 
				 
			
				
					
						
							
							
								Henrik Gramner 
							
						 
					 
					
						
						
						
						
							
						
						
							f0b7882ceb 
							
						 
					 
					
						
						
							
							x86inc: Drop SECTION_TEXT macro  
						
						The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`. 
						
						
					 
					
						2015-08-04 20:13:09 +02:00 
						 
				 
			
				
					
						
							
							
								Clément Bœsch 
							
						 
					 
					
						
						
						
						
							
						
						
							554d819062 
							
						 
					 
					
						
						
							
							avutil/pixelutils: faster pixelutils_sad_16x16  
						
						501 to 439 decicycles.
See 45c7f3997ea11c3d1007b2126b1c0049a8c27105. 
						
						
					 
					
						2014-08-23 20:12:56 +02:00 
						 
				 
			
				
					
						
							
							
								Clément Bœsch 
							
						 
					 
					
						
						
						
						
							
						
						
							45c7f3997e 
							
						 
					 
					
						
						
							
							avutil/pixelutils: faster pixelutils_sad_[au]_16x16  
						
						~560 → ~500 decicycles
This follows Michael's comments in
https://ffmpeg.org/pipermail/ffmpeg-devel/2014-August/160599.html 
Using two accumulator registers didn't help. On the other hand,
reordering the movs and psadbw brought it from ~538 to ~500.
						
						
					 
					
						2014-08-23 10:18:53 +02:00 
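
To illustrate the single-accumulator `psadbw` pattern the message above refers to, here is a rough sketch using SSE2 intrinsics rather than the hand-written assembly in the repository. It assumes an x86 target with SSE2 and uses unaligned loads, so it corresponds to the "u" variant. The function name `sad_16x16_sse2_sketch` is hypothetical, and the instruction ordering the commit tuned is left to the compiler here.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch (not the repository's asm): 16x16 SAD with a
 * single accumulator register, one psadbw per row. */
static int sad_16x16_sse2_sketch(const uint8_t *src1, ptrdiff_t stride1,
                                 const uint8_t *src2, ptrdiff_t stride2)
{
    __m128i acc = _mm_setzero_si128();

    assert(src1 && src2);
    for (int y = 0; y < 16; y++) {
        /* Unaligned 16-byte loads of one row from each block. */
        __m128i a = _mm_loadu_si128((const __m128i *)src1);
        __m128i b = _mm_loadu_si128((const __m128i *)src2);
        /* psadbw: one partial sum per 64-bit half of the register. */
        acc = _mm_add_epi64(acc, _mm_sad_epu8(a, b));
        src1 += stride1;
        src2 += stride2;
    }
    /* Fold the high 64-bit partial sum into the low one. */
    acc = _mm_add_epi64(acc, _mm_srli_si128(acc, 8));
    return _mm_cvtsi128_si32(acc);
}
```

An aligned ("a") variant would differ only in using `_mm_load_si128`, which requires both row pointers to be 16-byte aligned.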
						 
				 
			
				
					
						
							
							
								Clément Bœsch 
							
						 
					 
					
						
						
						
						
							
						
						
							28a2107a8d 
							
						 
					 
					
						
						
							
							avutil: add pixelutils API  
						
						
						
						
					 
					
						2014-08-05 21:05:52 +02:00