Chris Sullivan authored
Clang applies a __vectorcall optimization in which address pointers are vector-loaded in gpu::invoke_primitive. This causes a segfault when the stack is not properly aligned. Because the GPU transformer does not rely on the CPU for compute, we disable optimizations for the emitted function.
b0e4d8cb
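As a rough illustration of what disabling optimizations for a single emitted function can look like, here is a minimal sketch. It assumes Clang's per-function `optnone` attribute (with GCC's `optimize("O0")` as a fallback); the commit message does not say which mechanism the project actually uses. The macro `SKETCH_NO_OPT` and the function `pack_arguments` are hypothetical stand-ins, not names from this repository.

```cpp
#include <cstddef>

// Assumption: per-function attributes are used to switch optimization off.
// Clang spells this `optnone`; GCC has no `optnone`, so fall back to
// optimize("O0") there. Other compilers get no attribute in this sketch.
#if defined(__clang__)
#define SKETCH_NO_OPT __attribute__((optnone, noinline))
#elif defined(__GNUC__)
#define SKETCH_NO_OPT __attribute__((optimize("O0"), noinline))
#else
#define SKETCH_NO_OPT
#endif

// Hypothetical stand-in for an emitted function that gathers argument
// addresses into a table before handing them to a GPU primitive.
SKETCH_NO_OPT
std::size_t pack_arguments(void** table, void* const* inputs, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        // Plain pointer copies. With optimization off for this function,
        // the optimizer does not get the chance to combine them into
        // vector loads that would require an aligned stack.
        table[i] = inputs[i];
    }
    return count;
}
```

Because the heavy computation runs on the GPU, leaving a thin host-side wrapper like this unoptimized should cost little.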
Name | Last commit | Last update |
---|---|---|
.ci | ||
cmake | ||
contrib/docker | ||
doc | ||
licenses | ||
maint | ||
python | ||
src | ||
test | ||
.clang-format | ||
.gitignore | ||
.gitmodules | ||
.travis.yml | ||
CMakeLists.txt | ||
CONTRIB.md | ||
INSTALL.md | ||
LICENSE | ||
README.md | ||
VERSION.in | ||
changes.md |