1. 14 Mar, 2018 4 commits
  2. 13 Mar, 2018 13 commits
  3. 12 Mar, 2018 7 commits
  4. 11 Mar, 2018 6 commits
  5. 10 Mar, 2018 9 commits
  6. 09 Mar, 2018 1 commit
    • Adding support for GPU elementwise ops for arbitrarily many inputs (#618) · 89da71d3
      Chris Sullivan authored
      * Refactored unary elementwise ops into a single interface
      that can be adapted to elementwise ops with an arbitrary number of inputs.
      
      * Renamed EmitUnaryElementwise -> EmitElementwise.
      Implemented first binary elementwise op (Power).
      
      * Moved some of the boilerplate code for emitting CUDA kernels to NVRTC
      out of the emit functions and into the CudaFunctionPool static singleton.
      CodeWriter now saves CUDA kernels to ./gpu_codegen.
      
      * Added ops Divide, Subtract & Sign to the GPU transformer.
      Subtract and Sign both use custom device helper functions: the math
      kernel for each op is defined in gpu_cuda_kernel_ops.hpp, and the
      helper is built by a new get_device_helper function.
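
The idea in this commit message, one emitter that serves elementwise ops with any number of inputs plus optional device helpers for ops with no CUDA built-in, can be illustrated with a small source generator. This is only a sketch under stated assumptions: `make_device_helper` and `make_elementwise_kernel` are hypothetical names invented here, and the code is not the actual `EmitElementwise` or `get_device_helper` implementation from the repository. It only builds CUDA source text as strings and does not compile or launch anything.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not the repository's get_device_helper):
// wraps a math expression over x0, x1, ... in a __device__ function.
std::string make_device_helper(const std::string& name,
                               const std::string& type,
                               const std::string& body_expr,
                               std::size_t num_args)
{
    std::ostringstream out;
    out << "__device__ " << type << " " << name << "(";
    for (std::size_t i = 0; i < num_args; ++i)
    {
        out << (i ? ", " : "") << type << " x" << i;
    }
    out << ") { return " << body_expr << "; }\n";
    return out.str();
}

// Hypothetical generator (not the repository's EmitElementwise):
// builds the source of a kernel that applies `op` to one element
// from each of `num_inputs` input arrays and stores the result.
std::string make_elementwise_kernel(const std::string& kernel_name,
                                    const std::string& type,
                                    const std::string& op,
                                    std::size_t num_inputs)
{
    std::ostringstream out;
    out << "extern \"C\" __global__ void " << kernel_name << "(";
    for (std::size_t i = 0; i < num_inputs; ++i)
    {
        out << "const " << type << "* in" << i << ", ";
    }
    out << type << "* out, unsigned int n)\n{\n";
    out << "    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;\n";
    out << "    if (tid < n)\n    {\n";
    out << "        out[tid] = " << op << "(";
    for (std::size_t i = 0; i < num_inputs; ++i)
    {
        out << (i ? ", " : "") << "in" << i << "[tid]";
    }
    out << ");\n    }\n}\n";
    return out.str();
}
```

With this sketch, a binary op such as Power would pass a CUDA math function directly (`make_elementwise_kernel("ew_power", "float", "powf", 2)`), while Sign would first generate a helper (`make_device_helper("sign_f", "float", "(x0 > 0) - (x0 < 0)", 1)`) and then name that helper as the kernel's `op`.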