Commits · 2938860b3f292e600cfb7404faa78660ca86516c · submodule / opencv

26 Dec, 2017 1 commit

Provide a few AVX512 optimized functions for the DNN module · 2938860b

Arjan van de Ven authored Dec 25, 2017

This patch adds AVX512 optimized fastConv as well as the hookups
needed to get these called in the convolution_layer.

AVX512 fastConv is code-identical on a C level to the AVX2 one,
but is measurably faster due to AVX512 having more registers available
to cache results in.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>


25 Dec, 2017 1 commit

Add basic plumbing for AVX512 support · fc8e848a

Arjan van de Ven authored Dec 25, 2017

The OpenCV infrastructure mostly has the basics for supporting AVX512 math functions,
but they weren't hooked up (likely due to a lack of users).

To compile the DNN functions for AVX512, a few things need to be hooked up,
and this patch does that.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>


22 Dec, 2017 29 commits

- Merge tag '3.4.0' · 047764f4
  Alexander Alekhin authored Dec 22, 2017
  
- OpenCV version++ · 6d4f6647
  Alexander Alekhin authored Dec 22, 2017
  ```
  3.4.0
  ```
- Merge pull request #10398 from alalek:ml_simplify_simulated_annealing · eba176c2
  Alexander Alekhin authored Dec 22, 2017
  
- ml(ANN_MLP): ensure that train() call is always successful · 00e43a90
  Alexander Alekhin authored Dec 22, 2017
  
- Merge pull request #10401 from terfendail:resize_linear_revert · 9148a376
  Alexander Alekhin authored Dec 22, 2017
  
- Merge pull request #10402 from dkurt:dnn_tf_quantized · 019b7c5a
  Alexander Alekhin authored Dec 22, 2017
  
- Merge pull request #10385 from pengli:dnn · 59e825ee
  Alexander Alekhin authored Dec 22, 2017
  
- TensorFlow weights dequantization · bcc669f3
  Dmitry Kurtaev authored Dec 21, 2017
  
- Reverted calls to linear resize back to generic version for floating point matrices · e5313246
  Vitaly Tuzov authored Dec 22, 2017
  
- Merge pull request #10397 from mshabunin:fix-incorrect-assert · 97af6080
  Alexander Alekhin authored Dec 22, 2017
  
- add one more convolution kernel tuning candidate · 181b448c
  Li Peng authored Dec 22, 2017
  ```
  Signed-off-by: Li Peng <peng.li@intel.com>
  ```
- ml: simplify interfaces of SimulatedAnnealingSolver · 289a8da3
  Alexander Alekhin authored Dec 22, 2017
  
- Merge pull request #10396 from berak:fix_superres_sample · b85c7728
  Vadim Pisarevsky authored Dec 22, 2017
  
- Merge pull request #10265 from dkurt:nms_for_region_layer · 0742e12f
  Vadim Pisarevsky authored Dec 22, 2017
  
- Merge pull request #10394 from alalek:cmake_fix_pch_pic_pie · 9b659736
  Vadim Pisarevsky authored Dec 22, 2017
  
- Merge pull request #10387 from terfendail:resize23_perftest · 69a6765b
  Vadim Pisarevsky authored Dec 22, 2017
  
- Merge pull request #10392 from terfendail:bitexact_fallback · 3f68d6d8
  Vadim Pisarevsky authored Dec 22, 2017
  
- Merge pull request #10375 from tomoaki0705:buildWarningMSVC · 83b8cd01
  Alexander Alekhin authored Dec 22, 2017
  
- Merge pull request #10390 from alalek:ocl_option_buffer_rect · 4e542a65
  Alexander Alekhin authored Dec 22, 2017
  
- Replaced incorrect CV_Assert calls with CV_Error · aa46e31c
  Maksim Shabunin authored Dec 22, 2017
  
- samples: check for valid input in gpu/super_resolution.cpp · ddbd0746
  berak authored Dec 22, 2017
  
- Added fallback to generic linear resize in case bit-exact resize of provided matrix isn't supported · 5fdb42a7
  Vitaly Tuzov authored Dec 22, 2017
  
- cmake: fix -fPIC/-fPIE handling in precompiled headers (PCH) · 6c252d8c
  Alexander Alekhin authored Dec 22, 2017
  
- Merge pull request #10386 from terfendail:resizeexact_c3 · 636b7ec0
  Vadim Pisarevsky authored Dec 22, 2017
  
- ocl: workaround option to disable usage of buffer "Rect" operations · 534645a1
  Alexander Alekhin authored Dec 22, 2017
  
- Merge pull request #10389 from wxzs5:yangli · 09c84a01
  Alexander Alekhin authored Dec 22, 2017
  
- Merge pull request #10364 from dkurt:dnn_smooth_tf_data_layout · 325cbd7c
  Vadim Pisarevsky authored Dec 22, 2017
  
- Remove redundant return variable · b19cd937
  wxzs5 authored Dec 22, 2017
  
- Disabled universal intrinsic based implementation for bit-exact resize of 3-channel images · 01916248
  Vitaly Tuzov authored Dec 22, 2017
  

21 Dec, 2017 9 commits

- clean up the code · fe7b3f12
  Tomoaki Teshima authored Dec 21, 2017
  ```
  * disable the warning in CMake, not in the code using pragma
  ```
- Merge pull request #10374 from tomoaki0705:removeGstreamerTest · 1bc1f3d3
  Alexander Alekhin authored Dec 21, 2017
  
- Merge pull request #10316 from terfendail:bitexact_c234 · a8a51db4
  Vadim Pisarevsky authored Dec 21, 2017
  
- Merge pull request #10369 from alalek:issue_10351 · 70d49446
  Vadim Pisarevsky authored Dec 21, 2017
  
- Merge pull request #10370 from pengli:dnn · a2620f72
  Alexander Alekhin authored Dec 21, 2017
  
- avoid the test which is too strict · 50d44e06
  Tomoaki Teshima authored Dec 21, 2017
  ```
  * confirmed test failure on Jetson TX1 and TX2
  * show the performance but not bit exact result
  ```
- cleanup unnecessary macros in convolution ocl kernel · c5fc8e03
  Li Peng authored Dec 21, 2017
  ```
  Signed-off-by: Li Peng <peng.li@intel.com>
  ```
- refactor candidate generation of convolution auto-tuning · 0aa5e43a
  Li Peng authored Dec 19, 2017
  ```
  Signed-off-by: Li Peng <peng.li@intel.com>
  ```
- Refactor NMS procedure at RegionLayer · c67e75b6
  Dmitry Kurtaev authored Dec 08, 2017
  