Commits · 5ece6de273add1fd902b7f944b83f6b4ce400af7 · submodule / ngraph

30 Jul, 2019 3 commits

ConstantFolding for Not (#3326) · 5ece6de2

Adam Procter authored Jul 30, 2019

* CF for Sign, and extend element type capabilities for unary arithop CF

* CF for Ceiling and Floor

* Update CPU CF builders

* Update CPU CF builders

* CF for Not

* Add tests for new CPU CF folders

* Add tests for recently added CPU CF functors

* Add tests for non-CPU ceiling/floor CF

* Unit tests

* Add test for CPU folder

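For readers unfamiliar with the term, constant folding (CF) of a unary op means evaluating the op at compile time when its input is a constant and replacing it with the precomputed result. The following is a minimal sketch of that idea only, not nGraph's actual implementation: `fold_unary`, `sign_of` and `not_of` are hypothetical names, and constants are modeled as plain `std::vector` buffers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: applies `op` to every element of a constant's data
// and returns the data for the constant that would replace the op.
template <typename T, typename Op>
std::vector<T> fold_unary(const std::vector<T>& input, Op op)
{
    std::vector<T> output(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        output[i] = op(input[i]);
    }
    assert(output.size() == input.size());
    return output;
}

// Sign kernel: yields -1, 0 or 1 in the element type T.
template <typename T>
T sign_of(T x)
{
    return static_cast<T>((T(0) < x) - (x < T(0)));
}

// Not kernel: assumes booleans are stored as one byte per element.
inline char not_of(char x)
{
    return x == 0 ? 1 : 0;
}
```

Under these assumptions, Ceiling and Floor reduce to calls such as `fold_unary(values, [](float x) { return std::ceil(x); })`, and the templated element type mirrors the "extend element type capabilities" item in the message above.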

ConstantFolding for Ceiling and Floor (#3320) · 09952c0b

Adam Procter authored Jul 30, 2019

* CF for Sign, and extend element type capabilities for unary arithop CF

* CF for Ceiling and Floor

* Update CPU CF builders

* Update CPU CF builders

* Add tests for new CPU CF folders

* Add tests for recently added CPU CF functors

* Add tests for non-CPU ceiling/floor CF

* Unit tests


Add tests for recently added CPU CF functors (#3328) · 831df41d
Adam Procter authored Jul 30, 2019
```
* Add tests for new CPU CF folders

* Add tests for recently added CPU CF functors
```

29 Jul, 2019 11 commits

ConstantFolding for Equal, Greater, GreaterEq, Less, LessEq, NotEqual (#3322) · bcaf32c4

Adam Procter authored Jul 29, 2019

* CF for And and Or

* CF support for comparison ops

* Fix predicate for binary elementwise; add unit tests for non-arithmetic binops

* Update CPU CF builders

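Comparison ops differ from arithmetic binary ops in that the folded result has a boolean element type regardless of the input type. A rough sketch of that shape follows, assuming one byte per boolean element; `fold_comparison` is a hypothetical name and this is not the code from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: folds an element-wise comparison of two constants.
// Inputs share element type T; the output is always boolean (stored as char).
template <typename T, typename Pred>
std::vector<char> fold_comparison(const std::vector<T>& lhs,
                                  const std::vector<T>& rhs,
                                  Pred pred)
{
    assert(lhs.size() == rhs.size()); // no broadcasting in this sketch
    std::vector<char> output(lhs.size());
    for (std::size_t i = 0; i < lhs.size(); ++i)
    {
        output[i] = pred(lhs[i], rhs[i]) ? 1 : 0;
    }
    return output;
}
```

With this shape, Less, GreaterEq and NotEqual map onto `std::less<T>`, `std::greater_equal<T>` and `std::not_equal_to<T>` from `<functional>`; the remaining comparisons follow the same pattern.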

Make validation a pass and add it after every pass by default (#3296) · c693cb7e

Robert Kimball authored Jul 29, 2019

* Make validation a pass and add it after every pass by default

* cleanup

* update per review comments

* Switch plaid to new API for disabling pass validation

* address review comment

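The behavior described in this commit, validation expressed as a pass and run after every other pass unless disabled, can be illustrated with a small sketch. All names here (`Graph`, `SketchPassManager`, `set_validate_after_each_pass`) are hypothetical stand-ins chosen for illustration and are not taken from the nGraph API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a graph; it only records what ran, in order.
struct Graph
{
    std::vector<std::string> log;
};

using Pass = std::function<void(Graph&)>;

class SketchPassManager
{
public:
    void register_pass(Pass pass)
    {
        assert(pass != nullptr);
        m_passes.push_back(std::move(pass));
    }

    // Hypothetical opt-out switch for backends that do not want validation.
    void set_validate_after_each_pass(bool enabled) { m_validate = enabled; }

    void run_passes(Graph& graph) const
    {
        for (const Pass& pass : m_passes)
        {
            pass(graph);
            if (m_validate)
            {
                m_validation(graph); // validation is itself just another pass
            }
        }
    }

private:
    std::vector<Pass> m_passes;
    Pass m_validation = [](Graph& g) { g.log.push_back("validate"); };
    bool m_validate = true; // on by default, as the commit title states
};
```

Treating validation as an ordinary pass keeps the manager loop uniform, and a single flag is enough for a caller to opt out.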

[MLIR] Enable affine dialect loop fusion (#3290) · aedd8c2e

Diego Caballero authored Jul 29, 2019

* [MLIR] Enable affine dialect loop fusion

Enable affine dialect loop fusion in nGraph pipeline. It also adds an
opt flag to enable/disable it when ngraph-opt is in place. Fusion seems
to work for simple cases. It wasn't able to fuse dot + add, though, at
least in my test case. One example that worked:

Input:
  %6 = alloc() : memref<2500x2500xf32>
  affine.for %i3 = 0 to 2500 {
    affine.for %i4 = 0 to 2500 {
      %7 = load %arg0[%i3, %i4] : memref<2500x2500xf32>
      %8 = load %0[%i3, %i4] : memref<2500x2500xf32>
      %9 = addf %8, %7 : f32
      store %9, %6[%i3, %i4] : memref<2500x2500xf32>
    }
  }
  %10 = alloc() : memref<2500x2500xf32>
  affine.for %i5 = 0 to 2500 {
    affine.for %i6 = 0 to 2500 {
      %11 = load %arg2[%i5, %i6] : memref<2500x2500xf32>
      %12 = load %0[%i5, %i6] : memref<2500x2500xf32>
      %13 = addf %12, %11 : f32
      store %13, %10[%i5, %i6] : memref<2500x2500xf32>
    }
  }
  %14 = alloc() : memref<2500x2500xf32>
  affine.for %i7 = 0 to 2500 {
    affine.for %i8 = 0 to 2500 {
      %15 = load %10[%i7, %i8] : memref<2500x2500xf32>
      %16 = load %6[%i7, %i8] : memref<2500x2500xf32>
      %17 = addf %16, %15 : f32
      store %17, %14[%i7, %i8] : memref<2500x2500xf32>
    }
  }

Output:
  %8 = alloc() : memref<2500x2500xf32>
  affine.for %i3 = 0 to 2500 {
    affine.for %i4 = 0 to 2500 {
      %9 = load %arg2[%i3, %i4] : memref<2500x2500xf32>
      %10 = load %2[%i3, %i4] : memref<2500x2500xf32>
      %11 = addf %10, %9 : f32
      %12 = affine.apply #map2(%i3, %i4, %i3, %i4)
      %13 = affine.apply #map3(%i3, %i4, %i3, %i4)
      store %11, %0[%12, %13] : memref<1x1xf32>
      %14 = load %arg0[%i3, %i4] : memref<2500x2500xf32>
      %15 = load %2[%i3, %i4] : memref<2500x2500xf32>
      %16 = addf %15, %14 : f32
      %17 = affine.apply #map2(%i3, %i4, %i3, %i4)
      %18 = affine.apply #map3(%i3, %i4, %i3, %i4)
      store %16, %1[%17, %18] : memref<1x1xf32>
      %19 = affine.apply #map2(%i3, %i4, %i3, %i4)
      %20 = affine.apply #map3(%i3, %i4, %i3, %i4)
      %21 = load %0[%19, %20] : memref<1x1xf32>
      %22 = affine.apply #map2(%i3, %i4, %i3, %i4)
      %23 = affine.apply #map3(%i3, %i4, %i3, %i4)
      %24 = load %1[%22, %23] : memref<1x1xf32>
      %25 = addf %24, %21 : f32
      store %25, %8[%i3, %i4] : memref<2500x2500xf32>
    }
  }

* Rename MLIR_LLVM_OPTIONS to NGRAPH_MLIR_OPTIONS

Something like this works now:
NGRAPH_MLIR_OPTIONS="--enable-affine-loop-fusion=false"

* Disable loop fusion by default and fix typo

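The Input/Output IR in this commit message can be read as the following C++ analogue: three element-wise add loops with two full-size temporaries become one loop whose temporaries shrink to a single element each (the `memref<1x1xf32>` buffers in the output). This is an illustrative sketch only; the function names are hypothetical, the 2500x2500 tensors are flattened to 1-D vectors, and it does not show how the MLIR fusion pass itself works.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Buffer = std::vector<float>;

// Unfused form: three loops and two full-size temporaries (like %6 and %10).
Buffer add_chain_unfused(const Buffer& a, const Buffer& b, const Buffer& x)
{
    assert(a.size() == x.size() && b.size() == x.size());
    const std::size_t n = x.size();
    Buffer t0(n), t1(n), out(n);
    for (std::size_t i = 0; i < n; ++i) t0[i] = x[i] + a[i];
    for (std::size_t i = 0; i < n; ++i) t1[i] = x[i] + b[i];
    for (std::size_t i = 0; i < n; ++i) out[i] = t0[i] + t1[i];
    return out;
}

// Fused form: one loop; each temporary is a single scalar per iteration.
Buffer add_chain_fused(const Buffer& a, const Buffer& b, const Buffer& x)
{
    assert(a.size() == x.size() && b.size() == x.size());
    const std::size_t n = x.size();
    Buffer out(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        const float t0 = x[i] + a[i];
        const float t1 = x[i] + b[i];
        out[i] = t0 + t1;
    }
    return out;
}
```

Both functions perform the same additions in the same order per element, so they produce the same result; the fused form makes one pass over the data and needs no intermediate buffers.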

Merge pull request #3325 from NervanaSystems/nmostafa/mergefix · 862aa5fe
Robert Kimball authored Jul 29, 2019
```
[MLIR] Fix bad merge on 2 MLIR changes
```
Fix bad merge on 2 MLIR changes · c4dfca3b
nmostafa authored Jul 29, 2019

Merge pull request #3298 from NervanaSystems/nmostafa/recompile · fc9a7dea
Robert Kimball authored Jul 29, 2019
```
[MLIR] Re-compile sub-graph once on first invocation
```
Merge branch 'master' into nmostafa/recompile · 956e8b3a
Robert Kimball authored Jul 29, 2019

Merge pull request #3324 from NervanaSystems/bob/unit-test · 5d3456e4
Robert Kimball authored Jul 29, 2019
```
Changes to allow NNP to pass tests
```
style · 9733630b
Robert Kimball authored Jul 29, 2019

fix error · 6a0945ee
Robert Kimball authored Jul 29, 2019

fix tolerance · 6292fa4d
Robert Kimball authored Jul 29, 2019


28 Jul, 2019 4 commits
- Merge branch 'master' into bob/unit-test · 17bcff88
  Scott Cyphers authored Jul 28, 2019
  
- ConstantFolding for Sign (#3319) · f4d44bbc
  Adam Procter authored Jul 28, 2019
```
* CF for Sign, and extend element type capabilities for unary arithop CF

* Update CPU CF builders
```
- Change test to not use the create_tensor call which takes a memory buffer · c5a7e690
  Robert Kimball authored Jul 28, 2019
  
- unit test to use all_close_f · 5273c0f4
  Robert Kimball authored Jul 28, 2019
  
27 Jul, 2019 3 commits
- ConstantFolding for Sum (#3318) · 098c9118
  Adam Procter authored Jul 27, 2019
```
* CF for Sum

* style
```
- ConstantFolding for Concat (#3317) · 4e0e0f56
  Adam Procter authored Jul 27, 2019
```
* CF for Concat

* Switch from Nodes to Inputs/Outputs
```
- Quantization conversion from nodes to outputs (#3316) · 34499001
  Scott Cyphers authored Jul 27, 2019
  
26 Jul, 2019 10 commits
- Reshape sinking: fix issue with handling rank changing reshape. (#3314) · 8eb63379
  Sang Ik Lee authored Jul 26, 2019
  
- Refactor optimize() back · a2b9c6b8
  nmostafa authored Jul 26, 2019
  
- Fixed double-buffering timing (#3309) · c04b5588
  gcwenger authored Jul 26, 2019
```
API is synchronous per thread and threads are coordinated so that
we know when we hit the last iteration everything is done.
Using join() to gate end of iterations was introducing too much
overhead to timing as verified via checking traces.
```
- [MLIR] Add missing visitor for Relu in compiler.cpp (#3308) · f54e9159
  Diego Caballero authored Jul 26, 2019
  
- Convert some more ops to use Output<Node> inputs (#3307) · 0ca40376
  Scott Cyphers authored Jul 26, 2019
```
* Convert some ops to use Output<Node> inputs

* Remove duplicate validation
```
- Some GetOutputElement changes to help with Output<Node> (#3306) · 0c523507
  Scott Cyphers authored Jul 26, 2019
```
* Some GetOutputElement changes to help with Output<Node>

* Review comments
```
- [MLIR] Deallocate all temp tensors before return (#3289) · 1d5d2024
  Nagy Mostafa authored Jul 26, 2019
```
* Deallocate all temp tensors before return

* style-apply
```
- style-apply · 6a7e1f24
  nmostafa authored Jul 26, 2019
  
- Review fixes · 75a5cb00
  nmostafa authored Jul 26, 2019
  
- remove unique_ptr from allocator static allocation (#3269) · c0edf94d
  Pruthvi authored Jul 26, 2019
```
* remove unique_ptr from allocator static allocation

* - for default allocator return address of the static object on stack instead of returning ptr to dynamically allocated object

* Revert "- for default allocator return address of the static object on stack instead of returning ptr to dynamically allocated object"

This reverts commit 126b269cb298e6eb8b99b1b9a90d5aa7cf8fb948.

* use raw pointer in set_host_memory_allocator

* fix build error
```
25 Jul, 2019 7 commits
- Finish moving unit tests into test/backend (#3293) · 818c0bcb
  Adam Procter authored Jul 25, 2019
```
* Finish moving things into test/backend

* Update CODEOWNERS for moved file

* Tweak GPU manifest for renamed test
```
- MLIR-disabled compilation fix · f4954d57
  nmostafa authored Jul 25, 2019
  
- [MLIR] Fix naming convention in MLIR files (#3292) · a095c587
  Diego Caballero authored Jul 25, 2019
```
* [MLIR] Fix naming convention in MLIR files

Add naming convention note per file to state which files should use
nGraph naming convention and which MLIR naming convention and align
naming convention in those files with such a note.

* Remove m-prefix
```
- Merge remote-tracking branch 'upstream/master' into nmostafa/recompile · 0354192c
  nmostafa authored Jul 25, 2019
  
- Merge pull request #3295 from NervanaSystems/tsocha/improve-cmake-grama · ddc261b9
  Robert Kimball authored Jul 25, 2019
```
[CMAKE] Change CODEGEN error on windows to be more grammatically correct
```
- Merge branch 'master' into tsocha/improve-cmake-grama · 6b34ae42
  Michał Karzyński authored Jul 25, 2019
  
- Fix version number parsing when latest branch is not vxx.yy.zz (#3300) · c1220108
  Adam Procter authored Jul 25, 2019
  
24 Jul, 2019 2 commits
- style-apply · b5ab8ca1
  nmostafa authored Jul 24, 2019
  
- Move CompiledKernel back to ngraph core. Add a CompiledKernel->MLIRCompiler map in RuntimeContext · c6e9747b
  nmostafa authored Jul 24, 2019
  