Turn off optimizations on emitted external function. (#1592)
Clang applies a __vectorcall optimization in which address pointers are vector-loaded in gpu::invoke_primitive. This causes a segfault when the stack is not properly aligned. Because the GPU transformer does not rely on the CPU for compute, we disable optimizations for the emitted function.
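
For illustration only, here is a minimal sketch of one way to switch off optimizations for a single function under Clang, using the `optnone` function attribute. The macro `EMITTED_NO_OPT` and the function `emitted_entry` are hypothetical names, not taken from this change, and the mechanism the change itself uses may differ.

```cpp
#include <cstddef>

// Hypothetical helper macro: expands to Clang's optnone attribute, which asks
// the compiler to skip optimization passes for the annotated function.
// Other compilers get an empty expansion so the sketch still builds.
#if defined(__clang__)
#define EMITTED_NO_OPT __attribute__((optnone))
#else
#define EMITTED_NO_OPT
#endif

// Hypothetical stand-in for an emitted external entry point that only
// forwards pointers and does no CPU-side compute.
extern "C" EMITTED_NO_OPT void emitted_entry(void** inputs, void** outputs, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        outputs[i] = inputs[i]; // plain scalar pointer copy
    }
}
```

Since such a function only passes pointers along, leaving it unoptimized should cost little on the CPU side.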