Commits · 5b93ce921f47aaef2cde261d4fcfd0087b06e6f7 · submodule / capnproto

24 May, 2019 1 commit
- Add support for url-safe base64 encoding · 4b06a24e
  Joe Lee authored 5 years ago
  
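As a rough illustration of what "url-safe" means here, the sketch below encodes bytes with either the standard base64 alphabet or the URL-safe alphabet from RFC 4648 section 5. The function name `encodeBase64Sketch` and the choice to drop `=` padding in URL-safe mode are assumptions made for this example; the actual KJ signature and padding behavior may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not KJ's API: encodes bytes as base64, optionally with
// the URL-safe alphabet ('-' and '_' in place of '+' and '/').
// Assumption: URL-safe output omits '=' padding; the real commit may differ.
std::string encodeBase64Sketch(const std::vector<uint8_t>& input, bool urlSafe) {
  static const char STANDARD[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  static const char URL_SAFE[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
  const char* alphabet = urlSafe ? URL_SAFE : STANDARD;

  std::string out;
  std::size_t i = 0;
  // Each full 3-byte group becomes 4 output characters.
  for (; i + 3 <= input.size(); i += 3) {
    uint32_t n = (uint32_t(input[i]) << 16) | (uint32_t(input[i + 1]) << 8) | input[i + 2];
    out += alphabet[(n >> 18) & 63];
    out += alphabet[(n >> 12) & 63];
    out += alphabet[(n >> 6) & 63];
    out += alphabet[n & 63];
  }
  // A trailing group of 1 or 2 bytes becomes 2 or 3 characters (plus padding
  // in standard mode).
  std::size_t rest = input.size() - i;
  if (rest > 0) {
    uint32_t n = uint32_t(input[i]) << 16;
    if (rest == 2) n |= uint32_t(input[i + 1]) << 8;
    out += alphabet[(n >> 18) & 63];
    out += alphabet[(n >> 12) & 63];
    if (rest == 2) out += alphabet[(n >> 6) & 63];
    if (!urlSafe) out.append(3 - rest, '=');
  }
  return out;
}
```

For the two bytes `0xfb 0xff`, this sketch yields `+/8=` in standard mode and `-_8` in URL-safe mode, which is the kind of output that can be placed in a URL without further escaping.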
18 Aug, 2018 1 commit
- Work around GCC 8's new -Wclass-memaccess. · 7db342c0
  Kenton Varda authored 6 years ago
  
05 Aug, 2018 1 commit
- Adjust fallthrough comments to satisfy GCC7's -Wimplicit-fallthrough. · 150da2b0
  Kenton Varda authored 6 years ago
  If we were using C++17, we could use [[fallthrough]] instead... but we are not.
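To illustrate the kind of comment this refers to, here is a small pre-C++17 switch with deliberate fallthrough. The function `countdownSteps` is invented for this example and does not come from the commit. GCC 7's default `-Wimplicit-fallthrough` level accepts a recognized comment such as `// fallthrough` placed immediately before the next `case` label.

```cpp
// Hypothetical example, not taken from the commit: counts how many of the
// cases 3, 2, 1 are executed when starting at case `n`.
int countdownSteps(int n) {
  int steps = 0;
  switch (n) {
    case 3:
      ++steps;
      // fallthrough
    case 2:
      ++steps;
      // fallthrough
    case 1:
      ++steps;
      break;
    default:
      break;
  }
  return steps;
}
```

With C++17 available, each marker comment could be replaced by a `[[fallthrough]];` statement, which the compiler checks rather than pattern-matches.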
03 Apr, 2018 1 commit

Ensure '%' signs get round-tripped in URL path, fragment, userinfo · 25df9749

Harris Hancock authored 6 years ago

Our query string encoding function (encodeWwwForm()) was already doing the right thing.

I changed the comment in encodeUriPath() to clarify that it's intended to implement a URL class which stores its path in percent-decoded form, not either/or. I was wrong before.

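The round-trip problem can be reproduced with a deliberately simplified encoder. Everything in the sketch below is an assumption made for illustration: `encodePathSketch` and `percentDecodeSketch` are hypothetical names, and the reserved set is much smaller than the real one. It only shows why a URL class that stores its path percent-decoded has to escape `%` when encoding.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, simplified path encoder (not KJ's encodeUriPath()). It escapes
// controls, space, non-ASCII bytes, '#' and '?'. Escaping '%' is made optional
// only so the failure mode can be demonstrated.
std::string encodePathSketch(const std::string& decoded, bool escapePercent) {
  static const char HEX_DIGITS[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : decoded) {
    bool escape = c <= 0x20 || c >= 0x7f || c == '#' || c == '?' ||
                  (escapePercent && c == '%');
    if (escape) {
      out += '%';
      out += HEX_DIGITS[c >> 4];
      out += HEX_DIGITS[c & 15];
    } else {
      out += static_cast<char>(c);
    }
  }
  return out;
}

// Returns the value of a hex digit, or -1 if `c` is not one.
int hexDigitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

// Minimal percent-decoder: every well-formed "%XX" becomes the byte it names;
// anything else is copied through unchanged.
std::string percentDecodeSketch(const std::string& encoded) {
  std::string out;
  for (std::size_t i = 0; i < encoded.size(); ++i) {
    if (encoded[i] == '%' && i + 2 < encoded.size()) {
      int hi = hexDigitValue(encoded[i + 1]);
      int lo = hexDigitValue(encoded[i + 2]);
      if (hi >= 0 && lo >= 0) {
        out += static_cast<char>(hi * 16 + lo);
        i += 2;
        continue;
      }
    }
    out += encoded[i];
  }
  return out;
}
```

With the decoded path `50%41`, encoding without escaping `%` and then decoding produces `50A`, so the original is lost. Escaping `%` as `%25` makes the same input survive the round trip.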
29 Mar, 2018 1 commit

Implement URL fragment, path, and userinfo component encode functions · 084f5526

Harris Hancock authored 6 years ago

According to the WHATWG URL spec, each different component of a URL gets its very own percent encode set, which we've been doing wrong this whole time.

In terms of reserved characters, the fragment set is a subset of the path set, which is a subset of the userinfo set, which is a subset of RFC 2396's reserved set.

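The subset relationship can be sketched as one predicate that widens the escaped set step by step. The character lists below follow my reading of the WHATWG URL spec as it stood around the time of this commit; treat them as assumptions, since the spec has changed since and the KJ functions may not match this exactly. `UrlComponent`, `needsPercentEncoding`, and `percentEncodeSketch` are hypothetical names.

```cpp
#include <string>

enum class UrlComponent { FRAGMENT, PATH, USERINFO };

// Hypothetical predicate. Each component escapes everything the previous one
// does, plus a few more characters (assumed sets, see the note above).
bool needsPercentEncoding(unsigned char c, UrlComponent component) {
  // Controls, space, DEL, and non-ASCII bytes are escaped everywhere.
  if (c <= 0x20 || c >= 0x7f) return true;
  char ch = static_cast<char>(c);

  if (std::string("\"<>`").find(ch) != std::string::npos) return true;
  if (component == UrlComponent::FRAGMENT) return false;

  if (std::string("#?{}").find(ch) != std::string::npos) return true;
  if (component == UrlComponent::PATH) return false;

  return std::string("/:;=@[\\]^|").find(ch) != std::string::npos;
}

// Percent-encodes `text` for the given component using upper-case hex.
std::string percentEncodeSketch(const std::string& text, UrlComponent component) {
  static const char HEX_DIGITS[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : text) {
    if (needsPercentEncoding(c, component)) {
      out += '%';
      out += HEX_DIGITS[c >> 4];
      out += HEX_DIGITS[c & 15];
    } else {
      out += static_cast<char>(c);
    }
  }
  return out;
}
```

Under these assumed sets, the input `a b#c/d` is left as `a%20b#c/d` in a fragment, becomes `a%20b%23c/d` in a path, and becomes `a%20b%23c%2Fd` in userinfo.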
22 Mar, 2018 1 commit
- Fix base64 encoding when char is unsigned. · ae9d676c
  Kenton Varda authored 6 years ago
  Tested on qemu-aarch64.
05 Feb, 2018 1 commit

Implement application/x-www-form-urlencoded encode/decode functions · 4982c9e8

Harris Hancock authored 7 years ago

These are almost the same as {encode,decode}UriComponent, differing only in the set of characters they consider reserved, and their treatment of spaces.

I wasn't sure what to name them -- encodeWwwForm() seemed least bad.

For the encode side, I added a completely separate function -- it seemed like more trouble than it was worth trying to integrate the changes into encodeUriComponent(). For the decode side, I integrated the change (plus-to-space) into decodeBinaryUriComponent(), since that function is a bit longer, and the change was trivial.

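The two differences named above, the reserved set and the treatment of spaces, can be sketched as follows. The names `encodeWwwFormSketch` and `decodeWwwFormSketch` are hypothetical, and the safe set (ASCII alphanumerics plus `*`, `-`, `.`, `_`) is my understanding of the application/x-www-form-urlencoded serializer rather than something taken from the commit.

```cpp
#include <cstddef>
#include <string>

// Hypothetical encoder: space becomes '+', safe characters pass through, and
// everything else becomes %XX with upper-case hex.
std::string encodeWwwFormSketch(const std::string& text) {
  static const char HEX_DIGITS[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : text) {
    bool safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                (c >= '0' && c <= '9') ||
                c == '*' || c == '-' || c == '.' || c == '_';
    if (safe) {
      out += static_cast<char>(c);
    } else if (c == ' ') {
      out += '+';
    } else {
      out += '%';
      out += HEX_DIGITS[c >> 4];
      out += HEX_DIGITS[c & 15];
    }
  }
  return out;
}

// Returns the value of a hex digit, or -1 if `c` is not one.
int formHexDigitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

// Hypothetical decoder: the only change relative to a plain percent-decoder is
// the plus-to-space step.
std::string decodeWwwFormSketch(const std::string& encoded) {
  std::string out;
  for (std::size_t i = 0; i < encoded.size(); ++i) {
    if (encoded[i] == '+') {
      out += ' ';
    } else if (encoded[i] == '%' && i + 2 < encoded.size() &&
               formHexDigitValue(encoded[i + 1]) >= 0 &&
               formHexDigitValue(encoded[i + 2]) >= 0) {
      out += static_cast<char>(formHexDigitValue(encoded[i + 1]) * 16 +
                               formHexDigitValue(encoded[i + 2]));
      i += 2;
    } else {
      out += encoded[i];
    }
  }
  return out;
}
```

Because a literal `+` in the input is escaped as `%2B`, the `+` characters in the encoded form can only mean spaces, which is what keeps the decode step unambiguous.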
20 Jan, 2018 1 commit
- Fix build on platforms where char is unsigned. · e49e6e60
  Kenton Varda authored 7 years ago
  
11 Dec, 2017 2 commits

Support encoding to and from wchar_t arrays. · ff9c3321

Kenton Varda authored 7 years ago

Different platforms have different sizes for wchar_t. For example:

* Linux: 32-bit (originally intended as UCS-4, rarely used in practice)
* Windows: 16-bit (originally intended as UCS-2, but now probably treated as UTF-16)
* BeOS: 8-bit (strictly intended to be UTF-8)

For KJ purposes, we'll assume wchar_t arrays use the UTF encoding appropriate to their size, whatever that may be on the target platform.

This is mainly being added because the Win32 API uses wchar_t heavily.

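One way to express "the UTF encoding appropriate to their size" is a compile-time choice keyed on `sizeof(wchar_t)`. This is only a sketch of the idea; `WideEncoding`, `wideEncodingFor`, and `NATIVE_WIDE_ENCODING` are hypothetical names, not part of KJ.

```cpp
#include <cstddef>

enum class WideEncoding { UTF8, UTF16, UTF32 };

// Hypothetical mapping from the size of wchar_t in bytes to the encoding
// assumed for wchar_t arrays.
constexpr WideEncoding wideEncodingFor(std::size_t wcharSize) {
  return wcharSize == 1 ? WideEncoding::UTF8
       : wcharSize == 2 ? WideEncoding::UTF16
                        : WideEncoding::UTF32;
}

static_assert(sizeof(wchar_t) == 1 || sizeof(wchar_t) == 2 || sizeof(wchar_t) == 4,
              "unexpected wchar_t size");

// The encoding assumed for wchar_t arrays on the platform being compiled for.
constexpr WideEncoding NATIVE_WIDE_ENCODING = wideEncodingFor(sizeof(wchar_t));
```

Code that converts to or from `wchar_t` arrays could then dispatch on `NATIVE_WIDE_ENCODING` to the matching UTF-8, UTF-16, or UTF-32 routine.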
Extend Unicode encoders to support 'WTF-8'. · 5483d8f7

Kenton Varda authored 7 years ago

This allows arbitrary char16 arrays to round-trip through UTF-8 without losing information, even if the char16 arrays are not valid UTF-16.

This is necessary e.g. for filesystem manipulation on Windows, where filenames contain 16-bit characters but valid UTF-16 is not enforced.

Invalid UTF-16 represented in UTF-8 is affectionately known as WTF-8: http://simonsapin.github.io/wtf-8/

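A minimal sketch of the encoding direction is shown below, assuming a hypothetical function `encodeWtf8Sketch` (this is not KJ's API). Valid surrogate pairs are combined into one code point as in ordinary UTF-16 to UTF-8 conversion; an unpaired surrogate is written out as the three-byte sequence its value would have, which strict UTF-8 forbids but which keeps the original 16-bit value recoverable.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical WTF-8 encoder for arbitrary 16-bit code unit sequences.
std::string encodeWtf8Sketch(const std::u16string& units) {
  std::string out;
  for (std::size_t i = 0; i < units.size(); ++i) {
    uint32_t cp = units[i];

    // Combine a high surrogate with a following low surrogate, if present.
    bool isHigh = cp >= 0xD800 && cp <= 0xDBFF;
    if (isHigh && i + 1 < units.size() &&
        units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
      cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i + 1] - 0xDC00);
      ++i;
    }

    // Unpaired surrogates reach this point unchanged and take the 3-byte branch.
    if (cp < 0x80) {
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

A lone `0xD800` comes out as the bytes `ED A0 80`, while the valid pair `0xD83D 0xDE00` comes out as the ordinary four-byte UTF-8 sequence `F0 9F 98 80`.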
04 Dec, 2017 2 commits

Remove unnecessary branch in base64 decoder · c137c9fd
Harris Hancock authored 7 years ago

decodeBase64() reports errors required by HTML spec · f3e0ed22

Harris Hancock authored 7 years ago

This change modifies decodeBase64() to report errors as required by the WHATWG HTML spec's atob() JavaScript function. Notably, it reports errors for non-whitespace characters outside of the valid base64 character range ([+/0-9A-Za-z=]), and performs sanity checks on padding and input length.

I took care to keep the algorithm single-pass, and to support streaming via multiple calls of base64_decode_block(), though we don't currently expose that functionality.

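The checks described above can be sketched as a standalone validator. Note the differences from the commit: this version makes several passes and builds a temporary string, whereas the commit keeps the decoder single-pass, and it only reports whether input would be accepted rather than decoding it. The function name `isAcceptedBase64Sketch` is hypothetical, and the rules follow my reading of the "forgiving-base64 decode" steps that atob() relies on.

```cpp
#include <string>

// Hypothetical validator: returns true if `input` would be accepted under
// atob()-style rules (assumed: whitespace ignored, optional padding, strict
// alphabet, no length that leaves a dangling single character).
bool isAcceptedBase64Sketch(const std::string& input) {
  // 1. Drop ASCII whitespace.
  std::string data;
  for (char c : input) {
    if (c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r') continue;
    data += c;
  }

  // 2. If the length is a multiple of 4, strip one or two trailing '='.
  if (data.size() % 4 == 0) {
    for (int i = 0; i < 2 && !data.empty() && data.back() == '='; ++i) {
      data.pop_back();
    }
  }

  // 3. A remainder of 1 can never encode a whole byte.
  if (data.size() % 4 == 1) return false;

  // 4. Anything outside [+/0-9A-Za-z] is an error, including a leftover '='.
  for (char c : data) {
    bool ok = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
              (c >= '0' && c <= '9') || c == '+' || c == '/';
    if (!ok) return false;
  }
  return true;
}
```

Under these rules `Zm8=`, `Zg==`, and unpadded `Zm9` are accepted, while `Zm=9`, `Z===`, and `Zm9v!` are rejected.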
14 Oct, 2017 2 commits

Don't write plainchar on entry in step a · c10572fe

Ed Catmur authored 7 years ago

For the same reason: if we're called on an empty input, the output might not be a writable pointer.
This results in memory corruption and a crash in delete on MSVC.

Don't read past the end of the decode out buffer. · c2fbfc70

Edward Catmur authored 7 years ago

If we finish decoding in step_a state, there is no current output character, so reading *plainchar will either be an uninitialized read or (if the output buffer is minimally sized) a past-the-end read.

Detected by -fsanitize=address.

12 Oct, 2017 2 commits

Revert "Don't read past the end of the base64 decode out buffer." · 5e41df4b
Kenton Varda authored 7 years ago

Don't read past the end of the decode out buffer. · 0771d33b

Edward Catmur authored 7 years ago

If we finish decoding in step_a state, there is no current output character, so reading *plainchar will either be an uninitialized read or (if the output buffer is minimally sized) a past-the-end read.

Detected by -fsanitize=address.

30 May, 2017 1 commit
- The URI standard says to prefer upper-case hex for percent encoding. · 18e3c9f1
  Kenton Varda authored 7 years ago
  
23 May, 2017 2 commits

Improve KJ encoding lib error handling: · df52bf86

Kenton Varda authored 7 years ago

- Rename UtfResult -> EncodingResult
- Make it usable like a Maybe, so that we don't need separate "try" functions.
- Check errors in hex decoding and URI decoding.

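The general shape of a result type that carries both a value and an error flag can be sketched as below. This is a guess at the idea, not the KJ definition: `EncodingResultSketch` and `decodeHexSketch` are hypothetical, and the sketch does not attempt the Maybe-style integration the commit mentions. It only shows how one function can serve both callers who ignore errors and callers who check them.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result wrapper: behaves like the wrapped value, with an extra
// flag saying whether the input contained errors.
template <typename T>
struct EncodingResultSketch : public T {
  EncodingResultSketch(T&& value, bool hadErrors)
      : T(std::move(value)), hadErrors(hadErrors) {}
  const bool hadErrors;
};

// Returns the value of a hex digit, or -1 if `c` is not one.
int resultHexDigitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

// Hypothetical hex decoder: invalid digits are treated as 0 and an odd
// trailing digit is dropped, but both conditions set hadErrors.
EncodingResultSketch<std::vector<uint8_t>> decodeHexSketch(const std::string& text) {
  std::vector<uint8_t> bytes;
  bool hadErrors = text.size() % 2 != 0;
  for (std::size_t i = 0; i + 1 < text.size(); i += 2) {
    int hi = resultHexDigitValue(text[i]);
    int lo = resultHexDigitValue(text[i + 1]);
    if (hi < 0 || lo < 0) {
      hadErrors = true;
      if (hi < 0) hi = 0;
      if (lo < 0) lo = 0;
    }
    bytes.push_back(static_cast<uint8_t>(hi * 16 + lo));
  }
  return EncodingResultSketch<std::vector<uint8_t>>(std::move(bytes), hadErrors);
}
```

A caller that does not care about malformed input can use the result directly as a byte vector, while a stricter caller inspects `hadErrors` first, so no separate "try" variant of the function is needed.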
Add CEscape to encodings. · 03800dfa
Kenton Varda authored 7 years ago

22 May, 2017 1 commit
- Add KJ utility functions to encode/decode blobs in common formats. · f74555b4
  Kenton Varda authored 7 years ago
  In particular: UTF-{8,16,32}, Hex, URI encoding, and Base64