-
Sayed Adel authored
- improve cpu dispatching calls to allow more SIMD extentions (SSE4.1, AVX2, VSX) - wide universal intrinsics - replace dummy v_expand with v_expand_low - replace v_expand + v_mul_wrap with v_mul_expand for product accumulate operations - use FMA for accumulate operations - add mask and more types to accumulate's performance tests
8965f3ae
Name |
Last commit
|
Last update |
---|---|---|
.github | ||
3rdparty | ||
apps | ||
cmake | ||
data | ||
doc | ||
include | ||
modules | ||
platforms | ||
samples | ||
.gitattributes | ||
.gitignore | ||
CMakeLists.txt | ||
CONTRIBUTING.md | ||
LICENSE | ||
README.md |