    Cache and use fprop stats in cudnn batchnorm bprop (#1841) · fbc3a940
    Chris Sullivan authored
    * Temp bn update commit.
    
    * Add CUDNNBatchNorm, which adds two outputs to batchnorm: the batch mean and the batch inverse variance.
    The batch mean equals the output mean when the cumulative average factor is 1.0. Add a BatchNormCache pass that replaces every BatchNorm op feeding a BatchNormBackprop
    with CUDNNBatchNorm, which passes the saved batch statistics directly to the backprop step.
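
The relationship between the batch mean and the output mean can be illustrated with a minimal sketch. It assumes the usual exponential-average update, where the output (running) mean is blended from the previous running mean and the current batch mean. The function name `blend_running_mean` and its parameters are hypothetical and are not taken from this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the commit): blend per-channel batch means
// into running means, assuming the update
//   new_running = (1 - factor) * running + factor * batch
std::vector<float> blend_running_mean(const std::vector<float>& running_mean,
                                      const std::vector<float>& batch_mean,
                                      float factor)
{
    // One entry per channel in both inputs.
    assert(running_mean.size() == batch_mean.size());

    std::vector<float> out(running_mean.size());
    for (std::size_t c = 0; c < out.size(); ++c)
    {
        out[c] = (1.0f - factor) * running_mean[c] + factor * batch_mean[c];
    }
    return out;
}
```

With `factor` set to 1.0f the previous running mean drops out and the result is the batch mean itself, which is the case the commit message describes.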
    
    * Updated bn cache pass, removed extra tests, added test checking that provided stats are used in bprop instead of batch stats.
    This test was disabled for interpreter as the reference kernel needs to be updated to use provided statistics.
    
    * Formatting.
    
    * Update to new batch norm API.
    
    * CUDNNBatchNorm -> BatchNormTrainingWithStats
    
    * new line
    
    * Preprocess input variance into BN denominator for cudnn (#1885)
    
    * Add explicit cuda kernel to calculate what cuDNN describes as the inverse
    variance. In reality, the backward cudnn kernel for BN requires 1.0f / sqrt(variance + eps),
    which is the batchnorm denominator for each channel (a numerically stable inverse stddev).
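
As a rough host-side illustration of the preprocessing described above, the sketch below converts a per-channel variance into the batchnorm denominator 1.0f / sqrt(variance + eps). The commit does this in a CUDA kernel. The function name `bn_denominator` is hypothetical, and the sketch only shows the arithmetic, not the actual kernel or emitter code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the commit): turn per-channel variance into
// the batchnorm denominator, 1 / sqrt(variance + eps).
std::vector<float> bn_denominator(const std::vector<float>& variance, float eps)
{
    std::vector<float> inv_std(variance.size());
    for (std::size_t c = 0; c < variance.size(); ++c)
    {
        // eps keeps the argument of sqrt strictly positive for zero variance.
        assert(variance[c] + eps > 0.0f);
        inv_std[c] = 1.0f / std::sqrt(variance[c] + eps);
    }
    return inv_std;
}
```

For example, a channel with variance 3.0f and eps 1.0f yields 1 / sqrt(4), that is 0.5f.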
    
    This introduces op annotations for batch norm backprop and updates the cudnn_emitter to support the insertion of this cuda kernel when required.
    
    * Disable second test on INTERPRETER.
backend_test.in.cpp 243 KB