Skip to content
Projects
Groups
Snippets
Help
Loading...
Sign in / Register
Toggle navigation
C
capnproto
Project
Project
Details
Activity
Cycle Analytics
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Charts
Issues
0
Issues
0
List
Board
Labels
Milestones
Merge Requests
0
Merge Requests
0
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Charts
Packages
Packages
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Charts
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
submodule
capnproto
Commits
4f4a9c44
Commit
4f4a9c44
authored
Aug 12, 2018
by
Kenton Varda
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
Document encoding of zero-size struct pointers. Fixes #625.
parent
70fcf7f5
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
with
38 additions
and
17 deletions
+38
-17
encoding.md
doc/encoding.md
+38
-17
No files found.
doc/encoding.md
View file @
4f4a9c44
...
...
@@ -119,6 +119,44 @@ Field offsets are computed by the Cap'n Proto compiler. The precise algorithm i
to describe here, but you need not implement it yourself, as the compiler can produce a compiled
schema format which includes offset information.
#### Default Values
A default struct is always all-zeros. To achieve this, fields in the data section are stored xor'd
with their defined default values. An all-zero pointer is considered "null"; accessor methods
for pointer fields check for null and return a pointer to their default value in this case.
There are several reasons why this is desirable:
*
Cap'n Proto messages are often "packed" with a simple compression algorithm that deflates
zero-value bytes.
*
Newly-allocated structs only need to be zero-initialized, which is fast and requires no knowledge
of the struct type except its size.
*
If a newly-added field is placed in space that was previously padding, messages written by old
binaries that do not know about this field will still have its default value set correctly --
because it is always zero.
#### Zero-sized structs.
As stated above, a pointer whose bits are all zero is considered a null pointer,
*not*
a struct of
zero size. To encode a struct of zero size, set A, C, and D to zero, and set B (the offset) to -1.
**Historical explanation:**
A null pointer is intended to be treated as equivalent to the field's
default value. Early on, it was thought that a zero-sized struct was a suitable synonym for
null, since interpreting an empty struct as any struct type results in a struct whose fields are
all default-valued. So, the pointer encoding was designed such that a zero-sized struct's pointer
would be all-zero, so that it could conveniently be overloaded to mean "null".
However, it turns out there are two important differences between a zero-sized struct and a null
pointer. First, applications often check for null explicitly when implementing optional fields.
Second, an empty struct is technically equivalent to the default value for the struct
*type*
,
whereas a null pointer is equivalent to the default value for the particular
*field*
. These are
not necessarily the same.
It therefore became necessary to find a different encoding for zero-sized structs. Since the
struct has zero size, the pointer's offset can validly point to any location so long as it is
in-bounds. Since an offset of -1 points to the beginning of the pointer itself, it is known to
be in-bounds. So, we use an offset of -1 when the struct has zero size.
### Lists
A list value is encoded as a pointer to a flat array of values.
...
...
@@ -174,23 +212,6 @@ unreasonable implementation burden.) Note that even though struct lists can be d
element size (except C = 1), it is NOT permitted to encode a struct list using any type other than
C = 7 because doing so would interfere with the
[
canonicalization algorithm
](
#canonicalization
)
.
#### Default Values
A default struct is always all-zeros. To achieve this, fields in the data section are stored xor'd
with their defined default values. An all-zero pointer is considered "null" (such a pointer would
otherwise point to a zero-size struct, which might as well be considered null); accessor methods
for pointer fields check for null and return a pointer to their default value in this case.
There are several reasons why this is desirable:
*
Cap'n Proto messages are often "packed" with a simple compression algorithm that deflates
zero-value bytes.
*
Newly-allocated structs only need to be zero-initialized, which is fast and requires no knowledge
of the struct type except its size.
*
If a newly-added field is placed in space that was previously padding, messages written by old
binaries that do not know about this field will still have its default value set correctly --
because it is always zero.
### Inter-Segment Pointers
When a pointer needs to point to a different segment, offsets no longer work. We instead encode
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment