Commit f8941a12 authored by Louis Feng, committed by Scott Cyphers

More efficient sum for some cases (#1251)

* hacking to support dot of 3 by 2 inputs with gemm_batch.

* clean up.

* testing inplace reshape.

* fixed a compile error.

* added comments on todo.

* check for output.

* check for annotation.

* more optimizations WIP.

* sum simd.

* moved parallel for

* testing sum vectorization.

* fixed merge errors.

* sum wip.

* more logic.

* sum refactor and clean up.

* clean up.

* removed unrelated changes.

* removed related changes from merge.

* fixed clang compile errors.
parent 92adea38
@@ -181,8 +181,9 @@ void codegen::CompilerCore::initialize()
     args.push_back("-inline-threshold=1000000");
     if (m_enable_pass_report)
     {
-        args.push_back("-Rpass-analysis=loop-vectorize");
-        args.push_back("-Rpass=loop-vectorize");
+        args.push_back("-Rpass-analysis=.*");
+        args.push_back("-Rpass=.*");
+        args.push_back("-Rpass-missed=.*");
     }
     // Prevent Eigen from using any LGPL3 code
     args.push_back("-DEIGEN_MPL2_ONLY");
......