    Auto. gen. kernel signatures and argument expansion (#1326) · 8476dea0
    Chris Sullivan authored
    * Add GPUKernelArgs for storing kernel arguments.
    
    * Formatting.
    
    * Resolve tensor addresses when extracting arg list via GPUKernelArgs.
    
    * Updated arg list resolution so that placeholder arguments can be added anywhere in the argument list.
    
    * Const ref. args; changed add_args to use add_arg. Also expanded type_names map.
    
    * GPUKernelArgs bug fix for return values.
    
    * add_placeholders expects pointers for later resolution.
    
    * Formatting.
    
    * Add comments to GPUKernelArgs
    
    * Changed GPUKernelArgs interface to use a runtime variable number of arguments.
    
    * Removed/updated comment.
    
    * Address review comments: Remove combined address resolution and argument list retrieval. Remove unnecessary extra type entries in type_map.
    
    * Add space between pragma once and includes.
    
    * Broadcast optimization (#1322)
    
    * Implement GPUKernelArgs with op::Broadcast.
    
    * Removed excess type insertion in kernel signature for broadcast impl.
    
    * Support new auto kernel signature generation for op::Broadcast. Add boolean to helpers to determine if parameters are registers or arrays.
    
    * Removed commented code.
    
    * Update broadcast impl. for new GPUKernelArgs interface.
    
    * Updated based on interface change to GPUKernelArgs.
    
    * Formatting.
    
    * CUDNNHostParameters now implement GPUHostParameters. (#1324)
    
    * Formatting.
gpu_cuda_kernel_builder.cpp 54.6 KB