1. 02 Aug, 2017 1 commit
  2. 14 Jul, 2017 1 commit
    • optimize out scaleLayer & concatLayer whenever possible · 0488d9bd
      Vadim Pisarevsky authored
      fixed problem in concat layer by disabling memory re-use in layers with multiple inputs
      
      trying to fix the tests when Halide is used to run deep nets
      
      another attempt to fix Halide tests
      
      see if the Halide tests will pass with concat layer fusion turned off
      
      trying to fix failures in Halide tests; another try
      
      one more experiment to make halide_concat & halide_enet tests pass
      
      continue attempts to fix Halide tests
      
      moving on
      
      uncomment parallel concat layer
      
      seemingly fixed failures in Halide tests and re-enabled concat layer fusion; thanks to dkurt for the patch
  3. 13 Jul, 2017 1 commit
  4. 02 Jul, 2017 1 commit
  5. 30 Jun, 2017 1 commit
  6. 28 Jun, 2017 2 commits
    • dnn: added "hidden" experimental namespace · da096032
      Alexander Alekhin authored
      The main purpose of this namespace is to prevent the use of incompatible
      binaries, which would cause application crashes.
      
      This additional namespace does not affect the source-code API.
      This change makes it possible to maintain ABI checks (the namespace is easy to filter out).
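
The idea in this commit message can be illustrated with a small, self-contained sketch. It assumes a C++11 inline namespace; the message does not say which mechanism the commit actually uses, and the names `mylib`, `experimental_v1`, `Blob`, and `describe` are hypothetical. The versioned namespace becomes part of every mangled symbol name, so a binary built against a different version fails to link instead of crashing at run time, while callers keep writing the short, unversioned name.

```cpp
#include <string>

namespace mylib {

// Hypothetical versioned namespace. Because it is declared `inline`,
// its members are also visible directly as mylib::Blob, mylib::describe.
inline namespace experimental_v1 {

struct Blob
{
    int channels;
};

// The mangled name of this function contains "experimental_v1",
// so it cannot be confused with a describe() from another version.
inline std::string describe(const Blob& b)
{
    return "blob with " + std::to_string(b.channels) + " channels";
}

} // inline namespace experimental_v1

} // namespace mylib
```

User code calls `mylib::describe(blob)` exactly as it would without the extra namespace; only the exported symbol names change.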
    • another round of dnn optimization (#9011) · 8b3d6603
      Vadim Pisarevsky authored
      * another round of dnn optimization:
      * increased malloc alignment across OpenCV from 16 to 64 bytes to make it AVX2 and even AVX-512 friendly
      * improved SIMD optimization of pooling layer, optimized average pooling
      * cleaned up convolution layer implementation
      * made activation layer "attachable" to all other layers, including fully connected and addition layers.
      * fixed bug in the fusion algorithm: "LayerData::consumers" should not be cleared, because it describes the topology.
      * greatly optimized permutation layer, which improved SSD performance
      * parallelized element-wise binary/ternary/... ops (sum, prod, max)
      
      * also, added missing copyrights to many of the layer implementation files
      
      * temporarily disabled (again) the check for intermediate blobs consistency; fixed warnings from various builders
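
The move from 16-byte to 64-byte alignment mentioned in the list above can be sketched as follows. This is a minimal illustration of one common over-allocate-and-round-up technique, not the code from the commit; the names `kAlign`, `alignedMalloc`, and `alignedFree` are hypothetical. A 64-byte boundary matches the width of an AVX-512 register and is also a multiple of the 32 bytes an AVX2 register needs.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumed alignment for this sketch: 64 bytes.
constexpr std::size_t kAlign = 64;

// Returns a pointer aligned to kAlign bytes, or nullptr on failure.
void* alignedMalloc(std::size_t size)
{
    // Reserve room for the payload, one stored pointer, and worst-case padding.
    void* raw = std::malloc(size + sizeof(void*) + kAlign);
    if (!raw)
        return nullptr;

    // Skip past the slot for the stored pointer, then round up to kAlign.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned =
        (start + kAlign - 1) & ~static_cast<std::uintptr_t>(kAlign - 1);

    // Remember the pointer returned by malloc just before the aligned block.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Releases memory obtained from alignedMalloc; nullptr is ignored.
void alignedFree(void* ptr)
{
    if (ptr)
        std::free(reinterpret_cast<void**>(ptr)[-1]);
}
```

The cost of the larger alignment is up to `kAlign + sizeof(void*)` extra bytes per allocation, which is usually negligible for the large tensors a dnn module handles.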
  7. 27 Jun, 2017 1 commit
  8. 26 Jun, 2017 3 commits