Commit 1590c336 authored by Kenton Varda

More docs.

parent 871d90c6
@@ -69,6 +69,7 @@
</script>
<section id="main_content" class="inner">
{{ content }}
<div style="clear: left;"></div>
</section>
</div>
---
layout: page
---
# Encoding Spec
## NOT FINALIZED
The Cap'n Proto encoding is still evolving. In fact, as of this writing, the format described
by this spec is newer than what is actually implemented.
## 64-bit Words
For the purpose of Cap'n Proto, a "word" is defined as 8 bytes, or 64 bits. Since alignment of
data is important, all objects are aligned to word boundaries, and sizes are usually expressed in
terms of words.
## Messages
The unit of communication in Cap'n Proto is a "message". A message is a tree of objects, with
the root always being a struct.
Physically, messages may be split into several "segments", each of which is a flat blob of bytes.
Typically, a segment must be loaded into a contiguous block of memory before it can be accessed,
so that the relative pointers within the segment can be followed quickly. However, when a message
has multiple segments, it does not matter where those segments are located in memory relative to
each other; inter-segment pointers are encoded differently, as we'll see later.
Ideally, every message would have only one segment. However, there are a few reasons why splitting
a message into multiple segments may be convenient:
* It can be difficult to predict how large a message might be until you start writing it, and you
can't start writing it until you have a segment to write to. If it turns out the segment you
allocated isn't big enough, you can allocate additional segments without the need to relocate the
data you've already written.
* Allocating excessively large blocks of memory can make life difficult for memory allocators,
especially on 32-bit systems with limited address space.
The first word of the first segment of the message is always a pointer pointing to the message's
root struct.
Note that users of Cap'n Proto never need to understand segments; this is all taken care of
automatically by the runtime library.
## Built-in Types
The built-in primitive types are encoded as follows:
* `Void`: Not encoded at all. It has only one possible value and thus carries no information.
* `Bool`: One bit. 1 = true, 0 = false.
* Integers: Encoded in little-endian format. Signed integers use two's complement.
* Floating-points: Encoded in little-endian IEEE-754 format.
Primitive types must always be aligned to a multiple of their size. Note that since the size of
a `Bool` is one bit, this means eight `Bool` values can be encoded in a single byte -- this differs
from C++, where the `bool` type takes a whole byte.
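As a rough illustration of the rules above, here is a minimal Python sketch. The helper name `pack_bools` is made up for this example and is not part of any Cap'n Proto API; the integer and float encodings simply use the standard `struct` module in little-endian mode.

```python
import struct

# Little-endian two's-complement integers and IEEE-754 floats.
int32_bytes = struct.pack("<i", -2)        # 4 bytes, two's complement
uint16_bytes = struct.pack("<H", 0x1234)   # low byte comes first
float64_bytes = struct.pack("<d", 1.0)     # 8-byte IEEE-754 double


def pack_bools(values):
    """Pack booleans one bit each (hypothetical helper).

    The first value lands in the least-significant bit of the first byte.
    """
    out = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)
```

For instance, `pack_bools([True, False, True])` yields the single byte `0x05`.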
The built-in blob types are encoded as follows:
* `Data`: Encoded as a pointer, identical to `List(UInt8)`.
* `Text`: Like `Data`, but the content must be valid UTF-8, the last byte of the content must be
zero, and no other byte of the content can be zero.
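The `Text` constraints can be sketched as follows. Both function names are hypothetical, and the sketch assumes (as the wording above suggests) that the terminating zero byte counts as part of the content.

```python
def encode_text(s):
    """Encode a Python string as Text content: UTF-8 plus one trailing zero byte."""
    content = s.encode("utf-8")
    if b"\x00" in content:
        raise ValueError("Text may not contain embedded zero bytes")
    return content + b"\x00"


def is_valid_text(content):
    """Check the three Text rules on raw content bytes."""
    if not content or content[-1] != 0:
        return False              # last byte must be zero
    if b"\x00" in content[:-1]:
        return False              # no other byte may be zero
    try:
        content.decode("utf-8")   # must be valid UTF-8
    except UnicodeDecodeError:
        return False
    return True
```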
## Enums
Enums are encoded the same as 16-bit integers.
## Lists
A list value is encoded as a pointer to a flat array of values.

    lsb                       list pointer                        msb
    +-+-----------------------------+--+----------------------------+
    |A|              B              |C |             D              |
    +-+-----------------------------+--+----------------------------+

    A (2 bits) = 01, to indicate that this is a list pointer.
    B (30 bits) = Offset, in words, from the start of the pointer to the
        start of the list.  Signed.
    C (3 bits) = Size of each element:
        0 = 0 (e.g. List(Void))
        1 = 1 bit
        2 = 1 byte
        3 = 2 bytes
        4 = 4 bytes
        5 = 8 bytes (non-pointer)
        6 = 8 bytes (pointer)
        7 = composite (see below)
    D (29 bits) = Number of elements in the list, except when C is 7
        (see below).

The pointed-to values are tightly-packed. In particular, `Bool`s are packed bit-by-bit in
little-endian order (the first bit is the least-significant bit of the first byte).
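To make the bit layout concrete, here is a small sketch that packs and unpacks the four fields of a list pointer, treating the pointer as a 64-bit integer with field A in the lowest bits. The function names are invented for this example, and since the format is not finalized, treat it as an illustration of the diagram rather than a reference implementation.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def encode_list_pointer(offset, elem_size, count):
    """Build the 64-bit list pointer word: A=01, then B, C, D from lsb to msb."""
    return (1
            | ((offset & 0x3FFFFFFF) << 2)   # B: 30-bit signed word offset
            | ((elem_size & 0x7) << 32)      # C: 3-bit element size code
            | ((count & 0x1FFFFFFF) << 35))  # D: 29-bit element count


def decode_list_pointer(word):
    """Return (offset, elem_size, count) from a list pointer word."""
    if word & 0x3 != 1:
        raise ValueError("not a list pointer")
    offset = sign_extend((word >> 2) & 0x3FFFFFFF, 30)
    elem_size = (word >> 32) & 0x7
    count = (word >> 35) & 0x1FFFFFFF
    return offset, elem_size, count
```

A pointer built with `encode_list_pointer(-3, 2, 10)` describes ten one-byte elements starting three words before the pointer.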
When C = 7, the elements of the list are fixed-width composite values -- usually, structs. In
this case, the list content is prefixed by a "tag" word that describes each individual element.
The tag has the same layout as a struct pointer, except that the pointer offset (B) instead
indicates the number of elements in the list. Meanwhile, section (D) of the list pointer -- which
normally would store this element count -- instead stores the total number of _words_ in the list
(not counting the tag word). The reason we store a word count in the pointer rather than an element
count is to ensure that the extents of the list's location can always be determined by inspecting
the pointer alone, without having to look at the tag; this may allow more-efficient prefetching in
some use cases. We don't store struct lists as lists of pointers because doing so
would take significantly more space (an extra pointer per element) and may be less cache-friendly.
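As a sketch of how the two counts relate for a list of structs, the hypothetical helper below computes the value stored in section D of the list pointer and the tag word that prefixes the content. It assumes every element occupies exactly its data words plus its pointer words.

```python
def composite_list_words(element_count, data_words, pointer_words):
    """Return (total_words, tag_word) for a composite (C = 7) list.

    total_words goes in section D of the list pointer and excludes the tag.
    The tag is laid out like a struct pointer, with the element count in
    place of the offset (B).
    """
    total_words = element_count * (data_words + pointer_words)
    tag_word = (0                       # A = 00, as in a struct pointer
                | (element_count << 2)  # B: element count instead of offset
                | (data_words << 32)    # C: data section size, in words
                | (pointer_words << 48))  # D: pointer section size, in words
    return total_words, tag_word
```

For example, four structs with two data words and one pointer each occupy 12 words after the tag.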
In the future, we could consider implementing matrices using the "composite" element type, with the
elements being fixed-size lists rather than structs. In this case, the tag would look like a list
pointer rather than a struct pointer. As of this writing, no such feature has been implemented.
## Structs
A struct value is encoded as a pointer to its content. The content is split into two sections:
data and pointers, with the pointer section appearing immediately after the data section. This
split allows structs to be traversed (e.g., copied) without knowing their type.
A struct pointer looks like this:

    lsb                      struct pointer                       msb
    +-+-----------------------------+---------------+---------------+
    |A|              B              |       C       |       D       |
    +-+-----------------------------+---------------+---------------+

    A (2 bits) = 00, to indicate that this is a struct pointer.
    B (30 bits) = Offset, in words, from the start of the pointer to the
        start of the struct's data section.  Signed.
    C (16 bits) = Size of the struct's data section, in words.
    D (16 bits) = Size of the struct's pointer section, in words.

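Reading a struct pointer out of raw bytes might look like the sketch below. The function name is an assumption made for illustration; the only library call is `struct.unpack` with a little-endian 64-bit format.

```python
import struct


def decode_struct_pointer(raw):
    """Decode 8 raw bytes into (offset, data_words, pointer_words)."""
    (word,) = struct.unpack("<Q", raw)       # pointers are little-endian words
    if word & 0x3 != 0:
        raise ValueError("not a struct pointer")
    offset = (word >> 2) & 0x3FFFFFFF        # B: 30 bits, signed
    if offset >= 1 << 29:
        offset -= 1 << 30                    # apply two's-complement sign
    data_words = (word >> 32) & 0xFFFF       # C: data section size
    pointer_words = (word >> 48) & 0xFFFF    # D: pointer section size
    return offset, data_words, pointer_words
```

With those three numbers, a reader can copy the whole struct (data section followed by pointer section) without knowing its type.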
### Field Positioning
Ignoring unions, the layout of fields within the struct is determined by the following algorithm:

    For each field of the struct, ordered by field number {
        If the field is a pointer {
            Add it to the end of the pointer section.
        } else if the data section layout so far includes padding large
                  enough and properly-aligned to hold this field {
            Replace the padding space with the new field, preferring to
            put the field as close to the beginning of the section as
            possible.
        } else {
            Add one word to the end of the data section.
            Place the new field at the beginning of the new word.
            Mark the rest of the new word as padding.
        }
    }

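The no-union algorithm above can be sketched in Python as follows. This is one possible reading of the pseudocode, not the compiler's actual implementation: `layout_fields` is a hypothetical name, field sizes are given in bits (1, 8, 16, 32 or 64), and the data section is tracked as a simple per-bit occupancy list for clarity rather than speed.

```python
def layout_fields(fields):
    """Assign offsets to fields given in field-number order.

    `fields` is a list of (name, kind) where kind is "pointer" or a size in
    bits. Returns (layout, data_words, pointer_count); data offsets are in bits.
    """
    used = []            # one flag per bit of the data section
    pointer_count = 0
    layout = {}
    for name, kind in fields:
        if kind == "pointer":
            layout[name] = ("pointer", pointer_count)
            pointer_count += 1
            continue
        size = kind
        # Scan aligned slots from the start, so the earliest padding wins.
        for offset in range(0, len(used), size):
            if not any(used[offset:offset + size]):
                break
        else:
            # No suitable padding: append a word and use its beginning.
            offset = len(used)
            used.extend([False] * 64)
        used[offset:offset + size] = [True] * size
        layout[name] = ("data", offset)
    return layout, len(used) // 64, pointer_count
```

For example, a 32-bit field followed by an 8-bit field shares one word, with the 8-bit field at bit offset 32; a later 64-bit field forces a second word.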
Keep in mind that `Bool` fields are bit-aligned, so multiple booleans will be packed into a
single byte. As always, little-endian ordering is the standard -- the first boolean will be
located at the least-significant bit of its byte.
When unions are present, add the following logic:

    For each field and union of the struct, ordered by field number {
        If this is a union, not a field {
            Treat it like a 16-bit field, representing the union tag.
            (See no-union logic, above.)
        } else if this field is a member of a union {
            If an earlier member of the union is in the same section as
                    this field and it combined with any following padding
                    is at least as large as the new field {
                Give the new field the same offset, so they overlap.
            } else {
                Assign a new offset to this field as if it were not a union
                member at all.  (See no-union logic, above.)
            }
        } else {
            Treat it as a regular field.  (See no-union logic, above.)
        }
    }

Note that in the worst case, the members of a union could end up using 23 bytes plus one bit (one
pointer plus data section locations of 64, 32, 16, 8, and 1 bits). This is an unfortunate side
effect of the desire to pack fields in the smallest space where they will fit and the need to
maintain backwards-compatibility as fields are added. The worst case should be rare in practice.
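The worst-case figure can be checked with a few lines of arithmetic; the variable names here are only for illustration.

```python
pointer_bits = 64                  # one slot in the pointer section
data_bits = [64, 32, 16, 8, 1]     # one data-section location of each size
total_bits = pointer_bits + sum(data_bits)
whole_bytes, extra_bits = divmod(total_bits, 8)
```

This gives 185 bits in total, i.e. 23 bytes plus one bit.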
### Default Values
A default struct is always all-zeros. To achieve this, fields in the data section are stored xor'd
with their defined default values. An all-zero pointer is considered "null" (since otherwise it
would point at itself, which makes no sense); accessor methods for pointer fields check for null
and return a pointer to their default value in this case.
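The xor trick for data-section fields can be sketched like this. The helper names are hypothetical; the sketch applies the xor to the raw little-endian bit pattern, so it works for signed integers and floats as well as unsigned integers.

```python
import struct


def to_bits(value, fmt):
    """Raw bit pattern of `value` (packed with struct format `fmt`) as an int."""
    return int.from_bytes(struct.pack(fmt, value), "little")


def from_bits(bits, fmt):
    """Inverse of to_bits."""
    size = struct.calcsize(fmt)
    return struct.unpack(fmt, bits.to_bytes(size, "little"))[0]


def encode_field(value, default, fmt):
    """Bits actually stored on the wire: value xor default."""
    return to_bits(value, fmt) ^ to_bits(default, fmt)


def decode_field(stored, default, fmt):
    """Recover the value; all-zero storage decodes to the default."""
    return from_bits(stored ^ to_bits(default, fmt), fmt)
```

A field holding its default always encodes to zero, so `decode_field(0, 42, "<i")` gives back 42 even if the writer never knew the field existed.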
There are several reasons why this is desirable:
* Cap'n Proto messages are often "packed" with a simple compression algorithm that deflates
zero-value bytes.
* Newly-allocated structs only need to be zero-initialized, which is fast and requires no knowledge
of the struct type except its size.
* If a newly-added field is placed in space that was previously padding, messages written by old
binaries that do not know about this field will still have its default value set correctly --
because it is always zero.
## Inter-Segment Pointers
When a pointer needs to point to a different segment, offsets no longer work. We instead encode
the pointer as a "far pointer", which looks like this:

    lsb                        far pointer                        msb
    +-+-----------------------------+-------------------------------+
    |A|              B              |               C               |
    +-+-----------------------------+-------------------------------+

    A (2 bits) = 10 (binary, i.e. decimal 2), to indicate that this is
        a far pointer.
    B (30 bits) = Offset, in words, from the start of the target segment
        to the location of the far-pointer landing-pad within that
        segment.
    C (32 bits) = ID of the target segment.  (Segments are numbered
        sequentially starting from zero.)

The "landing pad" of a far pointer is normally just another pointer, which in turn points to the
actual object.
However, if the "landing pad" pointer is itself another far pointer, then it is interpreted
differently: This far pointer points to the start of the object's _content_, located in some other
segment. The landing pad is itself immediately followed by a tag word. The tag word looks exactly
like an intra-segment pointer to the target object would look, except that the offset is always
zero.
The reason for the convoluted double-far convention is to make it possible to form a new pointer
to an object in a segment that is full. If you can't allocate even one word in the segment where
the target resides, then you will need to allocate a landing pad in some other segment, and use
this double-far approach. This should be exceedingly rare in practice since pointers are normally
set to point to _new_ objects.
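Following a far pointer, including the double-far case, could be sketched as below. All names are invented for this example, segments are modelled as a list of `bytes` objects indexed by segment ID, and the landing-pad offset is assumed to be unsigned because it is measured from the start of the target segment.

```python
import struct


def read_word(segment, index):
    """Read the little-endian 64-bit word at word index `index`."""
    return struct.unpack_from("<Q", segment, index * 8)[0]


def decode_far_pointer(word):
    """Return (segment_id, landing_pad_offset) from a far pointer word."""
    if word & 0x3 != 2:
        raise ValueError("not a far pointer")
    pad_offset = (word >> 2) & 0x3FFFFFFF    # B: words from segment start
    segment_id = (word >> 32) & 0xFFFFFFFF   # C: target segment ID
    return segment_id, pad_offset


def follow_far_pointer(segments, word):
    """Resolve a far pointer to where an ordinary pointer should be evaluated.

    Returns ("single", segment_id, word_index, pointer_word) when the landing
    pad is an ordinary pointer located at (segment_id, word_index), or
    ("double", segment_id, word_index, tag_word) when the landing pad is itself
    a far pointer to the object's content, described by the tag that follows.
    """
    seg_id, pad = decode_far_pointer(word)
    landing = read_word(segments[seg_id], pad)
    if landing & 0x3 != 2:
        return ("single", seg_id, pad, landing)
    content_seg, content_off = decode_far_pointer(landing)
    tag = read_word(segments[seg_id], pad + 1)
    return ("double", content_seg, content_off, tag)
```

In the single case the returned pointer is interpreted relative to its own position in the target segment; in the double case the tag supplies the object's type and size while the second far pointer supplies its location.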
@@ -24,7 +24,7 @@ embedded as pointers. Pointers are offset-based rather than absolute so that mes
position-independent. Integers use little-endian byte order because most CPUs are little-endian,
and even big-endian CPUs usually have instructions for reading little-endian data.
-**_Doesn't that back backwards-compatibility hard?_**
+**_Doesn't that make backwards-compatibility hard?_**
Not at all! New fields are always added to the end of a struct (or replace padding space), so
existing field positions are unchanged. The recipient simply needs to do a bounds check when
@@ -34,7 +34,7 @@ always knows how to arrange them for backwards-compatibility.
**_Won't fixed-width integers, unset optional fields, and padding waste space on the wire?_**
Yes. However, since all these extra bytes are zeros, when bandwidth matters, we can apply an
-extremely fast compression scheme to remove them. Cap'n Proto calls this "packing"; the message,
+extremely fast compression scheme to remove them. Cap'n Proto calls this "packing" the message;
it achieves similar (better, even) message sizes to protobuf encoding, and it's still faster.
When bandwidth really matters, you should apply general-purpose compression, like
@@ -59,10 +59,10 @@ Glad you asked!
process can be just as fast and easy as calling another thread.
* **Arena allocation:** Manipulating Protobuf objects tends to be bogged down by memory
allocation, unless you are very careful about object reuse. Cap'n Proto objects are always
-allocated in an "arena"; or "region"; style, which is faster and promotes cache locality.
+allocated in an "arena" or "region" style, which is faster and promotes cache locality.
* **Tiny generated code:** Protobuf generates dedicated parsing and serialization code for every
message type, and this code tends to be enormous. Cap'n Proto generated code is smaller by an
-order of magnitude or more.
+order of magnitude or more. In fact, usually it's no more than some inline accessor methods!
* **Tiny runtime library:** Due to the simplicity of the Cap'n Proto format, the runtime library
can be much smaller.
@@ -73,16 +73,4 @@ version 2, which is the version that Google released open source. Cap'n Proto is
years of experience working on Protobufs, listening to user feedback, and thinking about how
things could be done better.
-I am no longer employed by Google. Cap'n Proto is not affiliated with Google or any other company.
-**_Tell me about the RPC system._**
-_As of this writing, the RPC system is not yet implemented._
-Cap'n Proto defines a [capability-based](http://en.wikipedia.org/wiki/Capability-based_security)
-RPC protocol. In such a system, any message passed over the wire can itself contain references to
-callable objects. Passing such a reference over the wire implies granting the recipient permission
-to call the referenced object -- until a reference is sent, the recipient has no way of addressing
-it in order to form a request to it, or even knowing that it exists.
-Such a system makes it very easy to define stateful, secure object-oriented protocols.
+I no longer work for Google. Cap'n Proto is not affiliated with Google or any other company.
@@ -15,8 +15,9 @@ many essential features:
* **Stability:** The Cap'n Proto format is still changing. Any data written today probably won't
be understood by future versions. Additionally, the programming interface is still evolving, so
code written today probably won't work with future versions.
-* **Performance:** While already beating the pants off other systems, Cap'n Proto has not yet
-undergone serious profiling and optimization.
+* **Performance:** While Cap'n Proto is inherently fast by design, the implementation has not yet
+undergone serious profiling and optimization. Currently it only beats Protobufs in realistic-ish
+end-to-end benchmarks by, like, 2x-5x. We can do better.
* **RPC:** The RPC protocol has not yet been specified, much less implemented.
* **Support for languages other than C++:** Hasn't been started yet.
@@ -56,8 +57,8 @@ code without instructions. It also supports continuous builds, where it watches
changes (via inotify) and immediately rebuilds as necessary. Instant feedback is key to
productivity, so I really like using Ekam.
-Unfortunately it's very much unfinished. It works (for me), but it is very quirky. It only works
-on Linux, and is best used together with Eclipse.
+Unfortunately it's very much unfinished. It works (for me), but it is quirky and rough around the
+edges. It only works on Linux, and is best used together with Eclipse.
The Cap'n Proto repo includes a script which will attempt to set up Ekam for you.
@@ -65,8 +66,8 @@ The Cap'n Proto repo includes a script which will attempt to set up Ekam for you
cd capnproto/c++
./setup-ekam.sh
-If all goes well, this downloads the Ekam code into `.ekam` and adds some symlinks under src.
-It also imports the [Google Test](https://googletest.googlecode.com) and
+If all goes well, this downloads the Ekam code into a directory called `.ekam` and adds some
+symlinks under src. It also imports the [Google Test](https://googletest.googlecode.com) and
[Protobuf](http://protobuf.googlecode.com) source code, so you can compile tests and benchmarks.
Once Ekam is installed, you can do:
---
layout: page
---
# Other Languages
Currently, Cap'n Proto is implemented only in C++. We'd like to support many more languages in
the future!
If you'd like to own the implementation of Cap'n Proto in some particular language,
[let us know](https://groups.google.com/group/capnproto)!
---
layout: page
---
# RPC Protocol
The Cap'n Proto RPC protocol is not yet defined. See the language spec's
[section on interfaces](language.html#interfaces) for a hint of what it will do.
Here are some misc planned / hoped-for features:
* **Shared memory IPC:** When instructed to communicate over a Unix domain socket, Cap'n Proto may
automatically negotiate to use shared memory, by creating a temporary file and then sending a
file descriptor across the socket. Once messages are being allocated in shared memory, RPCs
can be initiated by merely signaling a [futex](http://man7.org/linux/man-pages/man2/futex.2.html)
(on Linux, at least), which ought to be ridiculously fast.