1. 07 Aug, 2018 2 commits
    • reduce fprop cache outputs (#1343) · efa2561e
      Matthew Brookhart authored
      * reduce fprop cache outputs
      
      * refactor traverse nodes
      
      * Slight refactor, add test, address PR comments
      
      * fix formatting
    • Switch to using more expressive layout descriptors instead of numeric layout names (#1278) · 69c51c27
      Jayaram Bobba authored
      * Switch to using mkldnn memory descriptors for layout
      
      * More changes for using mkldnn descriptor instead of format
      
      * Removed mkldnn format from cpu layout descriptor. TODO - shuffle folding
      
      * Rotate mkldnn layouts on transpose
      
      * Modifications to builder reshape to skip rotated layouts
      
      * More fixes to layouts and removes axis order from cpu layout descriptor
      
      * Code cleanup
      
      * Removed shuffle folding pass since the functionality is subsumed by the layout pass
      
      * Canonicalize a few more formats to keep MKLDNN happy.
      
      * Style fixes
      
      * Style fixes
      
      * Style fixes
      
      * Addressed PR feedback and added reshape passthrough for non-transpose cases
      
      * Adjust named formats for weights tensors to keep MKLDNN happy
      
      * Style fixes
      
      * resolved merge issues
  2. 03 Aug, 2018 2 commits
    • Nick Korovaiko
      11b992a7
    • Preallocate intermediate buffers (#1231) · 0599a628
      Chris Sullivan authored
      * Utilize GPUMemoryManager/Allocator for preallocation of intermediate tensor buffer memory.
      
      * Formatting.
      
      * Merge with master required rework of memory due to CFE pass. Moved function memory pool allocation to pass as a result.
      
      * Formatting.
      
      * Added pass source files.
      
      * Updated tests to account for new assert check. All GPUAllocators should be destructed before an allocation is made in GPUMemoryManager.
      
      * GPUAllocator::close() can be used to close the allocator prior to destruction
      
      * Removed open allocators. Replaced check with inspection of pass::MemoryManager node list.
      
      * Formatting.
      
      * Rename m_memory_buffers -> m_tensor_memory_buffers. Use full path to static alignment variable.
      
      * FunctionMemoryReservation -> TensorMemoryReservation. Only return true in pass if reservation is made (bug fix).
      
      * Moved static compilation mutex.
      
      * Update external function with new pass name.
      
      * GPU_ExternalFunction: Add s_memory_pool_alignment, remove optimize_and_assemble method.
  3. 02 Aug, 2018 3 commits
  4. 27 Jul, 2018 3 commits
    • is_contained (#1257) · 81c48453
      Nick Korovaiko authored
    • CSE constant (#1271) · 953c65f8
      Nick Korovaiko authored
    • Add some convenience macros/classes for error messages (#1258) · deacf29a
      Adam Procter authored
      * Testing out some ideas for better error messages on AvgPool
      
      * Add uncaught_exception() check to ConstructionAssertLogger dtor
      
      * More general assertion class, not homed inside Node
      
      * Minor formatting change
      
      * NODE_ASSERT for type prop failure
      
      * Produce lighter-weight DummyAssertionHandler when assertion succeeds
      
      * New ctor for AssertionHelper that takes a single location arg; more const&-ness for the constructors
      
      * Remove move constructor for AssertionHelper; fix broken test in assertion.cpp
      
      * Miscellaneous improvements
      
      * Templatized AssertionHelper so different exception classes can be used; implemented TYPE_CHECK_ASSERT around this
      * Changed from a "stack" of locations to a single location (the stack was too complicated)
      * Added "FAIL" classes/macros which do not take a condition
      
      * Rename a helper function
      
      * Cleanup, cruft removal
      
      * Add test to make sure the assert helper has the lifetime we expect
      
      * Missing includes
  5. 26 Jul, 2018 1 commit
  6. 18 Jul, 2018 3 commits
  7. 17 Jul, 2018 1 commit
    • Added more convolution variants to DEX (#1223) · 9bb0b653
      Jayaram Bobba authored
      * CPU Direct Execution: Implement ConvertLayout and refactor
      
      * CPU Direct Execution: Implement Convolution
      
      * 1) Adds computation reuse to direct execution
      2) Add avg_pool, broadcast and convolution_bias to direct execution
      3) Moved some computation reuse utility functions to graph_utils
      
      * Use lists instead of vectors to avoid reallocation overheads
      
      * - Added convolution variants to direct execution
      - Removed ConvolutionBiasRelu, use ConvolutionBias instead
      - Reduced code duplication by moving functionality to mkldnn_emitter
        from cpu_emitter
      
      * Style fix
      
      * Moved mkldnn build_convolution to a templated method
      
      * Style fix
      
      * refactored mkldnn conv bprop builders
      
      * Style fix
  8. 14 Jul, 2018 1 commit
  9. 13 Jul, 2018 1 commit
  10. 12 Jul, 2018 2 commits
  11. 11 Jul, 2018 1 commit
  12. 09 Jul, 2018 2 commits
  13. 07 Jul, 2018 1 commit
  14. 06 Jul, 2018 2 commits
  15. 03 Jul, 2018 2 commits
  16. 02 Jul, 2018 3 commits
    • move sigmoid to core fusion (#1132) · d05b5e39
      Sandeep authored
      * declare sigmoid for core fusion
      
      * add simple test for sigmoid
      
      * info fusion status
      
      * cp op as main op
      
      * builds as expected
      
      * move sigmoid fusion code
      
      * add reference kernel
      
      * sigmoid bprop reference kernel and clang-format
      
      * add delta to bprop
      
      * fprop called
      
      * compiles bprop
      
      * move tests
      
      * serializer support
      
      * address comments in code
      
      * add doc
      
      * naming similar to core ops
      
      * fix failing test
      
      * fix failing test
      
      * address clang issue
      
      * more changes
      
      * change test macro
    • MKLDNN BoundedRelu implementation for Relu6 (#1179) · eaa6091c
      Pruthvi authored
      * 1. Added MKLDNN BoundedRelu op support for Relu6
      2. CpuLayout && CPU assignment pass for BoundedRelu Op
      3. Unit test INTERPRETER vs. CPU for BoundedReluOp
      4. MKLDNN and default emitter code for BoundedReluOp
      
      * Removed Debug prints
      
      * 1. Added support for boundedrelu to work on any constant literal
      2. unit test case for rank2, rank3, rank4 for bounded relu without serialized graph
      
      * Removed is_six() method
    • Conv+bias shape check for better error detection (#1176) · e42e5815
      Louis Feng authored
      * Reshape bias to 1D for conv + bias bprop fusion
      
      * Reshape goe2 back to 2D before replacing
      
      * added shape checks to validate conv+bias op.
      
      * removed conv+bias backprop merge for separate PR review.
      
      * fixed conv_bias_bprop test.
      
      * minor changes to error messages.
  17. 30 Jun, 2018 2 commits
    • Pruthvi/fix rnn output (#1135) · c4c24cb0
      Pruthvi authored
      * - Fixed replace output for the multi layer recurrent cell state tensor output
      - Modified rnn add_output to consider direction and n_layer while calculating the output size for mkldnn dst_layer and dst_iter
      
      * fix unit test failure
    • LoopKernel Collector (#1128) · 784735d6
      Nick Korovaiko authored
      * collector
      
      * keeping track of inputs; simplifying a merging strategy; adding LKGraph
      
      * LoopKernel Collector
      
      * address feedback
      
      * address feedback 2
      
      * address feedback 3
  18. 28 Jun, 2018 2 commits
  19. 26 Jun, 2018 3 commits
    • remove unused file (#1159) · e4db82ec
      Robert Kimball authored
    • Convolution sum fusion (#1146) · 82ee0a77
      Jayaram Bobba authored
      * inplace compute
      
      * fix warnings
      
      * Initial support for convolution sum fusion
      
      * Added in-place support for conv sum fusion and test cases
      
      * reverting spurious changes
      
      * Bug fix to account for inplace input in conv sum fusion
      
      * fix compilation error
      
      * Addressed PR feedback
    • OS X support (#1098) · 5395a378
      Igor Kaplounenko authored
      * updated to work with llvm 8.1 that tensorflow is built with
      
      * sane extensions on the mac
      
      * not doing rpath on apple
      
      * apply style
  20. 25 Jun, 2018 2 commits
    • inplace compute (#1141) · 88aa9e9c
      Nick Korovaiko authored
      * inplace compute
      
      * fix warnings
      
      * address bob's feedback
      
      * bob's feedback 2
      
      * bob's feedback 3
      
      * address bob's feedback 4
    • Fix build for MacOS (#1112) · e2e814e3
      Robert Kimball authored
      * remove reference to ngraph core code from codegen. add stand-alone implementations of needed functions
      
      * fixed potential pointer leak
      
      * clean up file_util
      
      * more file util cleanup, removing unused functions
      
      * interpreter works on mac
      
      * CPU and INTERPRETER build and pass unit tests on macos
      
      * move get_directory to file_util
      
      * cleanup
  21. 22 Jun, 2018 1 commit