Commit 53a814a0 authored by William A Rowe Jr's avatar William A Rowe Jr Committed by Adam Cozzette

Treat F5-FF octets as single (invalid) characters

This corresponds to the newest reading of RFC 3629, and results
in the largest possible number of character entities by any
valid parser. This may result in a buffer which is oversized,
but never undersized.

This is after further discussion with acozzette in this PR;
https://github.com/protocolbuffers/protobuf/pull/6844

Signed-off-by: William A Rowe Jr wrowe@pivotal.io
Signed-off-by: Yechiel Kalmenson ykalmenson@pivotal.io
parent 961c0e6b
......@@ -2292,7 +2292,7 @@ static const unsigned char kUTF8LenTbl[256] = {
1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,
2,2,2,2,2,2,2,2, 2,2,2,2,2,2,2,2, 2,2,2,2,2,2,2,2, 2,2,2,2,2,2,2,2,
3,3,3,3,3,3,3,3, 3,3,3,3,3,3,3,3, 4,4,4,4,4,4,4,4, 5,5,5,5,6,6,1,1
3,3,3,3,3,3,3,3, 3,3,3,3,3,3,3,3, 4,4,4,4,4,1,1,1, 1,1,1,1,1,1,1,1
};
// Return length of a single UTF-8 source character
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment