    Add CV_16UC1 support for cuda::CLAHE · fb8e652c
    Namgoo Lee authored
    Due to the size limit of shared memory, the histogram is built in
    global memory for the CV_16UC1 case.
    The amount of memory needed to build the histogram is:
        65536 bins * 4 bytes = 256 KB
    whereas the shared memory limit is typically 48 KB per thread block.
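
The size argument above can be checked at compile time. This is only a sketch: the helper names (histBins, histBytes, fitsInSharedMemory) and the constant kSharedMemLimit are hypothetical and not part of the patch, and the 48 KB figure is the typical limit quoted above, not a value queried from a device.

```cpp
#include <cstddef>
#include <cstdint>

// Number of histogram bins for a given pixel bit depth (hypothetical helper).
constexpr std::size_t histBins(unsigned bitDepth)
{
    return std::size_t{1} << bitDepth;
}

// Bytes needed for one histogram, assuming 4-byte (32-bit) counters per bin.
constexpr std::size_t histBytes(unsigned bitDepth)
{
    return histBins(bitDepth) * sizeof(std::uint32_t);
}

// Assumed typical shared memory limit per thread block (48 KB).
constexpr std::size_t kSharedMemLimit = 48 * 1024;

// True if a single histogram of this bit depth fits within the assumed limit.
constexpr bool fitsInSharedMemory(unsigned bitDepth)
{
    return histBytes(bitDepth) <= kSharedMemLimit;
}

static_assert(histBytes(8) == 1024, "8-bit histogram: 256 bins * 4 bytes = 1 KB");
static_assert(histBytes(16) == 256 * 1024, "16-bit histogram: 65536 bins * 4 bytes = 256 KB");
static_assert(fitsInSharedMemory(8), "CV_8UC1 histogram fits in shared memory");
static_assert(!fitsInSharedMemory(16), "CV_16UC1 histogram does not fit in shared memory");
```

The 8-bit histogram needs only 1 KB, which is why the CV_8UC1 path can stay in shared memory while the CV_16UC1 path cannot.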
    Added test cases for CV_16UC1 and various clip limits.
    Added perf tests for CV_16UC1 on both CPU and CUDA code.
    There was also a bug in the CV_8UC1 case when redistributing
    the "residual" clipped pixels. The added test case with a clip
    limit of 5.0 exposes this bug.
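
For readers unfamiliar with the term, the "residual" is what remains after the clipped pixel counts are divided evenly among the bins. The CPU-side sketch below shows one common way to clip a histogram and redistribute the excess, including that residual. It is an illustration under stated assumptions, not the CUDA kernel from this patch: the function name clipAndRedistribute is hypothetical, and the clip limit is taken as an already computed integer count per bin.

```cpp
#include <cstddef>
#include <vector>

// Clip every bin at clipLimit, then give the clipped counts back to the
// histogram so that the total number of pixels is preserved.
// Hypothetical helper for illustration only.
inline void clipAndRedistribute(std::vector<int>& hist, int clipLimit)
{
    const int bins = static_cast<int>(hist.size());
    if (bins == 0)
        return;

    // 1. Clip each bin and accumulate the excess.
    int clipped = 0;
    for (int& h : hist) {
        if (h > clipLimit) {
            clipped += h - clipLimit;
            h = clipLimit;
        }
    }

    // 2. Spread the excess evenly: every bin receives the same batch.
    const int batch = clipped / bins;
    int residual = clipped - batch * bins;  // always smaller than bins
    for (int& h : hist)
        h += batch;

    // 3. Hand out the residual one count at a time to evenly spaced bins.
    if (residual != 0) {
        int step = bins / residual;
        if (step < 1)
            step = 1;
        for (int i = 0; i < bins && residual > 0; i += step, --residual)
            ++hist[i];
    }
}
```

With a 4-bin histogram {10, 0, 0, 0} and a clip limit of 5, the excess is 5: each bin receives 1, and the residual of 1 goes to bin 0, giving {7, 1, 1, 1}. Step 3 is easy to get wrong, and a mistake there only shows up when the excess is not a multiple of the bin count, so a single clip limit may never exercise it.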
