Use aligned SSE register load intrinsic.
The code goes to the trouble of aligning the data pointer to a 16-byte boundary, but then uses the unaligned form of the load intrinsic, _mm_loadu_si128. Either the code shouldn't bother aligning the pointer at the start of the whitespace, or it should use the aligned form of the intrinsic, _mm_load_si128.
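For illustration, here is a minimal sketch of the second option. It is not the project's actual code: the function name skip_spaces_sse2 is hypothetical, only ' ' is treated as whitespace to keep it short, and the input is assumed to be NUL-terminated. A scalar prologue advances to a 16-byte boundary, after which every load can use _mm_load_si128.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns a pointer to the first byte in p that is
 * not ' '. Assumes p is NUL-terminated. The vector loop may read past the
 * terminator, but never past the end of the aligned 16-byte block that
 * contains it. */
static const char *skip_spaces_sse2(const char *p)
{
    /* Scalar prologue: step byte by byte until p is 16-byte aligned. */
    while (((uintptr_t)p & 15u) != 0) {
        if (*p != ' ')
            return p;
        ++p;
    }

    const __m128i space = _mm_set1_epi8(' ');
    for (;; p += 16) {
        /* p is aligned here, so the aligned load is valid. */
        assert(((uintptr_t)p & 15u) == 0);
        const __m128i chunk = _mm_load_si128((const __m128i *)p);

        /* Bit i of mask is set when byte i of the chunk equals ' '. */
        unsigned mask =
            (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, space));
        unsigned not_space = ~mask & 0xFFFFu;

        if (not_space != 0) {
            /* Find the index of the lowest set bit. */
            size_t i = 0;
            while (((not_space >> i) & 1u) == 0)
                ++i;
            return p + i;
        }
    }
}
```

If the alignment prologue were dropped instead (the first option), the load would have to stay as _mm_loadu_si128, since _mm_load_si128 requires a 16-byte-aligned address.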