    Add reduce sum to the GPU transformer (op::Sum) (#671) · bae77590
    Chris Sullivan authored
    * Current cudnn implementations use only
    a single dimension for the ngraph tensor data (width).
    In this case the tensor format should be set to
    
    CUDNN_TENSOR_NCHW
    
    so that adjacent memory accesses are coalesced (stride=1 for width).
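    As a rough illustration of the stride argument above, the sketch below computes row-major (NCHW-style) strides on the host for a flat tensor packed into the width dimension. It does not call cuDNN; `row_major_strides` and `flat_as_nchw` are hypothetical helper names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major strides in elements: the last dimension (W in NCHW) gets stride 1,
// so consecutive width indices map to adjacent memory locations.
std::vector<std::size_t> row_major_strides(const std::vector<std::size_t>& dims)
{
    assert(!dims.empty());
    std::vector<std::size_t> strides(dims.size(), 1);
    for (std::size_t i = dims.size() - 1; i > 0; --i)
    {
        strides[i - 1] = strides[i] * dims[i];
    }
    return strides;
}

// Hypothetical helper: describe a flat tensor of n elements as
// {N, C, H, W} = {1, 1, 1, n}, i.e. only the width dimension is used.
std::vector<std::size_t> flat_as_nchw(std::size_t n)
{
    return {1, 1, 1, n};
}
```

    With this layout the width stride is 1, which is the property the commit relies on for coalesced accesses.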
    
    * Added some kernel emitter helpers that are reused often.
    * Renamed EmitElementwise -> emit_elementwise to match emit<T>.
    * op::Sum now handles trivial case of dim(input_tensor) = dim(output_tensor)
      by performing a memcpy as no axes are reduced.
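    The trivial case in the last item might look roughly like the host-side sketch below. It assumes that reduced axes are dropped from the output shape, so equal rank means no axis is reduced. `sum_trivial_case` is a hypothetical name, and `std::memcpy` stands in for whatever device copy the GPU emitter actually generates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

using Shape = std::vector<std::size_t>;

// Returns true if the trivial path was taken (no axes reduced), in which
// case the output is a plain copy of the input. Returns false otherwise,
// leaving the output untouched so a real reduction can run instead.
bool sum_trivial_case(const float* in, float* out,
                      const Shape& in_shape, const Shape& out_shape)
{
    if (in_shape.size() != out_shape.size())
    {
        return false; // some axes are reduced; not the trivial case
    }
    // Assumption: with equal rank, the shapes themselves must match.
    assert(in_shape == out_shape);
    std::size_t count = 1;
    for (std::size_t d : in_shape)
    {
        count *= d;
    }
    std::memcpy(out, in, count * sizeof(float));
    return true;
}
```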
    
    * Added a general case for Nd descriptors, which is used when the tensor
      has more than 4 dimensions. Currently a naive reduce is performed;
      in the future, a coordinate transformation could be applied to
      improve the memory layout for the reduction.
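    To make the idea of a naive reduce over an arbitrary number of dimensions concrete, here is a minimal host-side sketch. It is not the kernel emitted by the GPU transformer; `naive_sum` is a hypothetical function that walks every input element, maps its coordinates to an output index by skipping the reduced axes, and accumulates.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Naive sum over the given axes of a row-major tensor of any rank.
// The output keeps the non-reduced axes in their original order.
std::vector<float> naive_sum(const std::vector<float>& input,
                             const std::vector<std::size_t>& shape,
                             const std::set<std::size_t>& axes)
{
    std::size_t total = 1;
    std::size_t out_size = 1;
    for (std::size_t a = 0; a < shape.size(); ++a)
    {
        total *= shape[a];
        if (axes.count(a) == 0)
        {
            out_size *= shape[a];
        }
    }
    assert(input.size() == total);

    std::vector<float> output(out_size, 0.0f);
    for (std::size_t idx = 0; idx < total; ++idx)
    {
        // Decompose idx into coordinates from the innermost axis outwards,
        // building the output index from the kept axes only.
        std::size_t rem = idx;
        std::size_t out_idx = 0;
        std::size_t out_stride = 1;
        for (std::size_t a = shape.size(); a-- > 0;)
        {
            std::size_t coord = rem % shape[a];
            rem /= shape[a];
            if (axes.count(a) == 0)
            {
                out_idx += coord * out_stride;
                out_stride *= shape[a];
            }
        }
        output[out_idx] += input[idx];
    }
    return output;
}
```

    Each input element is visited once with no regard for memory layout, which is why a coordinate transformation could help later.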
    
    * Switched to codegen::CodeWriter::block_begin/end.
      CodeWriter::block_begin/end does not appear to be used often by emitters (in the cpu and gpu transformers)
      because a block comment is often desired. To support this, I added prefix/suffix default parameters to CodeWriter::block_begin/end
      so that a comment can be attached to the block's braces.
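    The prefix/suffix idea can be sketched with a small stand-in writer. `MiniWriter` below is hypothetical and is not the actual codegen::CodeWriter; its signatures and output format are assumptions chosen only to show how defaulted prefix/suffix parameters let an emitter attach a comment to a block's braces.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for a code writer with indentation tracking.
class MiniWriter
{
public:
    // Emit one line at the current indentation level.
    void line(const std::string& text)
    {
        m_out << std::string(m_indent * 4, ' ') << text << "\n";
    }

    // Open a block; an optional prefix (e.g. a comment) follows the brace.
    void block_begin(const std::string& prefix = "")
    {
        line(prefix.empty() ? std::string("{") : "{ " + prefix);
        ++m_indent;
    }

    // Close a block; an optional suffix follows the closing brace.
    void block_end(const std::string& suffix = "")
    {
        assert(m_indent > 0);
        --m_indent;
        line(suffix.empty() ? std::string("}") : "} " + suffix);
    }

    std::string str() const { return m_out.str(); }

private:
    std::ostringstream m_out;
    std::size_t m_indent = 0;
};
```

    Because both parameters default to an empty string, existing call sites that pass no arguments keep producing bare braces.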
Name
files
models
ref_generators
util
CMakeLists.txt
autodiff.in.cpp
backend_debug_api.cpp
backend_performance.cpp
backend_test.in.cpp
build_graph.cpp
builder.cpp
builder_autobroadcast.cpp
builder_xla.cpp
codegen.cpp
convolution_test.in.cpp
copy.cpp
core_fusion.cpp
cpio.cpp
cpu_fusion.cpp
cudnn.cpp
distributed.cpp
eigen.cpp
element_type.cpp
file_util.cpp
graph_partition.cpp
includes.cpp
inliner.cpp
input_output_assign.cpp
main.cpp
mkldnn.cpp
ngraph.cpp
op.cpp
pass_liveness.cpp
pass_manager.cpp
pass_memory_layout.cpp
pattern.cpp
reshape_elimination.cpp
runtime_manager.cpp
serialize.cpp
shape.cpp
tensor.cpp
type_prop.cpp
update_reference.sh
util.cpp
uuid.cpp