Commit c5552788 authored by Chuanbo Weng

Use vload to read unaligned data instead of dereference operator.

According to the OpenCL 1.2 specification, section 6.1.5:
    For arguments to a __kernel function declared to be a pointer to a
    data type, the OpenCL compiler can assume that the pointee is always
    appropriately aligned as required by the data type. The behavior of
    an unaligned load or store is undefined, except for the
    vloadn, vload_halfn, vstoren, and vstore_halfn functions defined in
    section 6.12.7.

The original code read data of type T from an address that is not a
multiple of sizeof(T), so the result was incorrect. With this patch, the cases
./opencv_perf_imgproc
--gtest_filter=OCL_ImgSize_TmplSize_Method_MatType_MatchTemplate.MatchTemplate/*
work correctly with beignet 0.9.3.
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
parent c445ce61
......@@ -161,7 +161,7 @@ __kernel void matchTemplate_Naive_CCORR(__global const uchar * srcptr, int src_s
for (int j = 0; j < template_cols; ++j)
{
T temp = (T)(template[j]);
-T src = *(__global const T*)(srcptr + ind + j*(int)sizeof(T1));
+T src = vload4(0, (__global const T1*)(srcptr + ind + j*(int)sizeof(T1)));
sum = mad(convertToWT(src), convertToWT(temp), sum);
......
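
The alignment issue described in the commit message can be sketched with a host-side C analogy. This is not the OpenCV kernel: `float4_t`, `is_aligned`, and `load4_unaligned` are hypothetical names introduced here for illustration, and `memcpy` merely stands in for the role `vload4` plays in OpenCL C. Note that in OpenCL the address passed to `vloadn` must still be aligned to the scalar element type (section 6.12.7), whereas `memcpy` accepts any byte address.

```c
#include <stdint.h>
#include <string.h>

/* Four packed floats, standing in for OpenCL's float4 (hypothetical type). */
typedef struct { float s[4]; } float4_t;

/* Returns 1 if p is a multiple of `align` bytes, else 0. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}

/* Hypothetical stand-in for vload4: copies 16 bytes from any byte address.
   Unlike dereferencing a cast pointer, *(const float4_t *)p, memcpy places
   no alignment requirement on p. */
static float4_t load4_unaligned(const unsigned char *p)
{
    float4_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

In the patched kernel the byte offset advances in steps of sizeof(T1), so the address is generally aligned for T1 but not for the 4-element vector type, which is the case `is_aligned(p, 16)` would report as 0 in this sketch.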