Commit b0e4d8cb authored by Chris Sullivan, committed by Robert Kimball

Turn off optimizations on emitted external function. (#1592)

Clang chooses to use a __vectorcall optimization in which
address pointers are vector-loaded in gpu::invoke_primitive.
This results in a segfault when the stack is not properly aligned.
Since the GPU transformer does not rely on the CPU for compute,
we disable optimizations on the emitted function.
parent 73bff556
@@ -497,7 +497,7 @@ void runtime::gpu::GPU_ExternalFunction::emit_functions()
         m_writer << "extern \"C\" void " << current_function->get_name();
         m_writer << "(void** inputs, void** outputs, "
-                 << "gpu::GPURuntimeContext* ctx)\n";
+                 << "gpu::GPURuntimeContext* ctx) __attribute__ ((optnone))\n";
         m_writer.block_begin();
         {
             m_writer << "m_runtime_context = ctx;\n";
...
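
To make the effect of the one-line change concrete, here is a minimal sketch of the signature text that the patched emitter would write for a single function. It is an illustration only: emit_signature is a hypothetical helper that is not part of the code base, std::ostringstream stands in for the real m_writer, and the name Function_0 is an assumed example of what get_name() might return.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not in the actual source): reproduces the two
// m_writer statements from the patched emit_functions() for one name.
std::string emit_signature(const std::string& function_name)
{
    std::ostringstream writer; // stand-in for the real m_writer
    writer << "extern \"C\" void " << function_name;
    // The trailing attribute is the only change made by this commit.
    // optnone is a Clang attribute that disables optimization of the
    // function it is applied to.
    writer << "(void** inputs, void** outputs, "
           << "gpu::GPURuntimeContext* ctx) __attribute__ ((optnone))\n";
    return writer.str();
}
```

Calling emit_signature("Function_0") yields the line
extern "C" void Function_0(void** inputs, void** outputs, gpu::GPURuntimeContext* ctx) __attribute__ ((optnone))
followed by a newline, after which the emitter opens the function body. Because the attribute is attached to the emitted entry point only, the rest of the generated code is unaffected.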