    nvgpu cuda reduce with stable sum (#2076) · 606f3f93
    Fenglei authored
    * add some helper function
    
    * update with new helper function
    
    * update reduce to nd with new helper function
    
    * update float sum to stable sum
    
    * fix bug
    
    * update all reduce to stable sum for float
    
    * fix bug and pass the sum stable test
    
    * remove debug info
    
    * style
    
    * update with shape
    
    * fix bug
    
    * add host parameters to cuda_emitter
    
    * clang format
    
    * fix bugs
    
    * add element::type support
    
    * format
    
    * add a cached value with datatype name
    
    * add init_reduce_value
    
    * unroll loop
    
    * optimization
    
    * remove the need for init_value
    
    * add memset kernel
    
    * add memcpy
    
    * working version
    
    * remove debug info
    
    * add comments, clean up code.
    
    * change in_idx to input_idx
    
    * fix bug
    
    * change args name for memset in emitter
    
    * pass element::Type instead of string
    
    * op::reduce comes with an init value; add support
    
    * resolve codacy-bot comment
    
    * fix bug
    
    * resolve codacy-bot comment
    
    * remove unused comments, resolve comments
    
    * cuda reduce for max, min, mul, reduce op init value, format
    
    * use type::info
    
    * use type info for numeric_limits
    
    * remove code from gpu_host_parameters
    
    * header
    
    * remove outdated comments
    
    * add helper to check if stable sum is needed
    
    * add stable sum test for double
    
    * remove extra line
    
    * consolidate helper functions
    
    * no need list now.
    
    * remove extra ;
    
    * clang format
    
    * style
    
    * add skip test for cpu and intelGPU side
    
    * add line between groups of headers
    
    * add two simple stable sum test for float and double
    
    * skip test for intelGPU
Name
files
models
ref_generators
util
CMakeLists.txt
algebraic_simplification.cpp
all_close_f.cpp
assertion.cpp
autodiff.in.cpp
backend_api.cpp
backend_arg_reduce.in.cpp
backend_binary_elementwise.in.cpp
backend_broadcast.in.cpp
backend_comparison.in.cpp
backend_debug_api.cpp
backend_dot.in.cpp
backend_graph_comparison.in.cpp
backend_one_hot.in.cpp
backend_performance.cpp
backend_pool.in.cpp
backend_reduce.in.cpp
backend_reshape.in.cpp
backend_sum.in.cpp
backend_test.in.cpp
backend_topk.in.cpp
backend_unary_elementwise.in.cpp
build_graph.cpp
builder.cpp
builder_autobroadcast.cpp
builder_quantization.cpp
constant_folding.cpp
control_dependencies.cpp
convolution_test.in.cpp
coordinate.cpp
copy.cpp
core_fusion.cpp
cpio.cpp
cpu_debugger.cpp
cpu_fusion.cpp
cpu_reshape_sinking.cpp
cpu_test.cpp
cse.cpp
cudnn.cpp
distributed.in.cpp
element_type.cpp
file_util.cpp
gpu_fusion.cpp
gpu_test.cpp
graph_partition.cpp
halide.cpp
hybrid_backend.cpp
hybrid_utils.cpp
hybrid_utils.hpp
includes.cpp
inliner.cpp
input_output_assign.cpp
main.cpp
mkldnn.cpp
nop_elimination.cpp
onnx_import.cpp
onnxifi.cpp
onnxifi_span.cpp
op.cpp
partial_shape.cpp
pass_liveness.cpp
pass_manager.cpp
pass_memory_layout.cpp
pattern.cpp
reshape_elimination.cpp
serialize.cpp
shape.cpp
tensor.cpp
type_prop.cpp
update_reference.sh
util.cpp
uuid.cpp
zero_dim_tensor_elimination.cpp