1. 08 Aug, 2018 12 commits
  2. 07 Aug, 2018 14 commits
    • Jaikrishnan Menon's avatar
    • Jaikrishnan Menon's avatar
    • Jaikrishnan Menon's avatar
      4efcb76e
    • DEX LRN (#1344) · c2e98505
      Nick Korovaiko authored
      * DEX LRN
      
      * merge after jbobba's changes
    • reduce fprop cache outputs (#1343) · efa2561e
      Matthew Brookhart authored
      * reduce fprop cache outputs
      
      * refactor traverse nodes
      
      * Slight refactor, add test, address PR comments
      
      * fix formatting
    • DEX: Softmax (#1341) · f1c29c9c
      Jaikrishnan Menon authored
      * Add helper macros to select from a partial set of ranks and element types
      
      * CPU Direct Execution: Implement Softmax
      
      * Add softmax builder to the build script
      
      * Update
    • Jaikrishnan Menon's avatar
    • dmyershov's avatar
      49d15902
    • IntelGPU backend: And, Or operations (#1337) · 91a3bf87
      Anna Alberska authored
      * IntelGPU backend: And, Or operations
      
      * Code format update: intelgpu_backend.cpp and intelgpu_op_custom_kernels.cpp
      
      * Update logical operations
    • cuda optimize softmax (#1310) · 154dc47a
      Fenglei authored
      * Updated softmax.
      
      * Formatting.
      
      * Updated convolution.
      
      * Use build_primitive overloading. Add helper to emit type_string given a node.
      
      * Formatting.
      
      * Update ConvolutionBackpropData.
      
      * convolution backprop & max pool memory primitive caching (#1303)
      
      * Updated ConvolutionBackpropFilters.
      * Update MaxPool.
      
      * Update Max and Min. (#1307)
      
      * softmax optimization
      
      * fix bug
      
      * fix bugs
      
      * clang format
      
      * remove comments
      
      * add softmax divide
      
      * fix bugs
      
      * fix bug
      
      * fix bug
      
      * clang format
      
      * remove unused header
      
      * register
      
      * using single parameters instead of array
      
      * using build_elementwise instead of build_elementwise_collective
      
      * remove workspace as csullivan suggested
    • IntelGPU backend: AvgPool operation (partially) (#1336) · 8db7b24b
      Anna Alberska authored
      * IntelGPU backend: AvgPool operation(partially)
      
      * Code format update intelgpu_backend.cpp
      
      * Delete code duplication in pooling ops intelgpu_backend.cpp
    • Auto. gen. kernel signatures and argument expansion (#1326) · 8476dea0
      Chris Sullivan authored
      * Add GPUKernelArgs for storing kernel arguments.
      
      * Formatting.
      
      * Resolve tensor addresses when extracting arg list via GPUKernelArgs.
      
      * Updated arg list resolution so that placeholder arguments can be added anywhere in the argument list.
      
      * const ref. args and changed add_args to use add_arg. also expanded type_names map.
      
      * GPUKernelArgs bug fix for return values.
      
      * add_placeholders expects pointers for later resolution
      
      * Formatting.
      
      * Add comments to GPUKernelArgs
      
      * Changed GPUKernelArgs interface to use a runtime variable number of arguments.
      
      * Removed/updated comment.
      
      * Address review comments: Remove combined address resolution and argument list retrieval. Remove unnecessary extra type entries in type_map.
      
      * Add space between pragma once and includes.
      
      * Broadcast optimization (#1322)
      
      * Implement GPUKernelArgs with op::Broadcast.
      
      * Removed excess type insertion in kernel signature for broadcast impl.
      
      * Support new auto kernel signature generation for op::Broadcast. Add boolean to helpers to determine if parameters are registers or arrays.
      
      * Removed commented code.
      
      * Update broadcast impl. for new GPUKernelArgs interface.
      
      * Updated based on interface change to GPUKernelArgs.
      
      * Formatting.
      
      * CUDNNHostParameters now implement GPUHostParameters. (#1324)
      
      * Formatting.
    • Switch to using more expressive layout descriptors instead of numeric layout names (#1278) · 69c51c27
      Jayaram Bobba authored
      * Switch to using mkldnn memory descriptors for layout
      
      * More changes for using mkldnn descriptor instead of format
      
      * Removed mkldnn format from cpu layout descriptor. TODO - shuffle folding
      
      * Rotate mkldnn layouts on transpose
      
      * Modifications to builder reshape to skip rotated layouts
      
      * More fixes to layouts and removes axis order from cpu layout descriptor
      
      * Code cleanup
      
      * Removed shuffle folding pass since the functionality is subsumed by the layout pass
      
      * Canonicalize a few more formats to keep MKLDNN happy.
      
      * Style fixes
      
      * Style fixes
      
      * Style fixes
      
      * Addressed PR feedback and added reshape passthrough for non-transpose cases
      
      * Adjust named formats for weights tensors to keep MKLDNN happy
      
      * Style fixes
      
      * resolved merge issues
    • Fix date in license header (#1342) · 5f77fe86
      Jaikrishnan Menon authored
  3. 06 Aug, 2018 3 commits
  4. 05 Aug, 2018 4 commits
  5. 04 Aug, 2018 2 commits
  6. 03 Aug, 2018 5 commits
    • nbench: add option to run all models in a directory (#1279) · 2b26df18
      Robert Kimball authored
      * add option to run all models in a directory
      
      * add print for exception from benchmark
    • Nick Korovaiko's avatar
      11b992a7
    • Preallocate intermediate buffers (#1231) · 0599a628
      Chris Sullivan authored
      * Utilize GPUMemoryManager/Allocator for preallocation of intermediate tensor buffer memory.
      
      * Formatting.
      
      * Merge with master required rework of memory due to CFE pass. Moved function memory pool allocation to pass as a result.
      
      * Formatting.
      
      * Added pass source files.
      
      * Updated tests to account for new assert check. All GPUAllocators should be deconstructed before allocation is made in GPUMemoryManager.
      
      * GPUAllocator::close() can be used to close the allocator prior to destruction
      
      * Removed open allocators. Replaced check with inspection of pass::MemoryManager node list.
      
      * Formatting.
      
      * Rename m_memory_buffers -> m_tensor_memory_buffers. Use full path to static alignment variable.
      
      * FunctionMemoryReservation -> TensorMemoryReservation. Only return true in pass if reservation is made (bug fix).
      
      * Moved static compilation mutex.
      
      * Update external function with new pass name.
      
      * GPU_ExternalFunction: Add s_memory_pool_alignment, remove optimize_and_assemble method.
    • shssf's avatar
    • fix travis build...I hope (#1317) · 39278e7d
      Robert Kimball authored