1. 07 Aug, 2018 3 commits
    • Auto. gen. kernel signatures and argument expansion (#1326) · 8476dea0
      Chris Sullivan authored
      * Add GPUKernelArgs for storing kernel arguments.
      
      * Formatting.
      
      * Resolve tensor addresses when extracting arg list via GPUKernelArgs.
      
      * Updated arg list resolution so that placeholder arguments can be added anywhere in the argument list.
      
      * Const ref. args and changed add_args to use add_arg. Also expanded type_names map.
      
      * GPUKernelArgs bug fix for return values.
      
      * add_placeholders expects pointers for later resolution
      
      * Formatting.
      
      * Add comments to GPUKernelArgs
      
      * Changed GPUKernelArgs interface to use a runtime variable number of arguments.
      
      * Removed/updated comment.
      
      * Address review comments: Remove combined address resolution and argument list retrieval. Remove unnecessary extra type entries in type_map.
      
      * Add space between pragma once and includes.
      
      * Broadcast optimization (#1322)
      
      * Implement GPUKernelArgs with op::Broadcast.
      
      * Removed excess type insertion in kernel signature for broadcast impl.
      
      * Support new auto kernel signature generation for op::Broadcast. Add boolean to helpers to determine if parameters are registers or arrays.
      
      * Removed commented code.
      
      * Update broadcast impl. for new GPUKernelArgs interface.
      
      * Updated based on interface change to GPUKernelArgs.
      
      * Formatting.
      
      * CUDNNHostParameters now implement GPUHostParameters. (#1324)
      
      * Formatting.
    • Switch to using more expressive layout descriptors instead of numeric layout names (#1278) · 69c51c27
      Jayaram Bobba authored
      * Switch to using mkldnn memory descriptors for layout
      
      * More changes for using mkldnn descriptor instead of format
      
      * Removed mkldnn format from cpu layout descriptor. TODO - shuffle folding
      
      * Rotate mkldnn layouts on transpose
      
      * Modifications to builder reshape to skip rotated layouts
      
      * More fixes to layouts and removes axis order from cpu layout descriptor
      
      * Code cleanup
      
      * Removed shuffle folding pass since the functionality is subsumed by the layout pass
      
      * Canonicalize a few more formats to keep MKLDNN happy.
      
      * Style fixes
      
      * Style fixes
      
      * Style fixes
      
      * Addressed PR feedback and added reshape passthrough for non-transpose cases
      
      * Adjust named formats for weights tensors to keep MKLDNN happy
      
      * Style fixes
      
      * resolved merge issues
    • Fix date in license header (#1342) · 5f77fe86
      Jaikrishnan Menon authored
2. 06 Aug, 2018 3 commits
3. 05 Aug, 2018 4 commits
4. 04 Aug, 2018 2 commits
5. 03 Aug, 2018 15 commits
6. 02 Aug, 2018 12 commits
7. 01 Aug, 2018 1 commit
    • More efficient sum for some cases (#1251) · f8941a12
      Louis Feng authored
      * hacking to support dot of 3 by 2 inputs with gemm_batch.
      
      * clean up.
      
      * testing inplace reshape.
      
      * fixed a compile error.
      
      * added comments on todo.
      
      * check for output.
      
      * check for annotation.
      
      * more optimizations WIP.
      
      * sum simd.
      
      * moved parallel for
      
      * testing sum vectorization.
      
      * fixed merge errors.
      
      * sum wip.
      
      * more logic.
      
      * sum refactor and clean up.
      
      * clean up.
      
      * removed unrelated changes.
      
      * removed related changes from merge.
      
      * fixed clang compile errors.