Yesterday, some engineers at Google released [FlatBuffers](http://google-opensource.blogspot.com/2014/06/flatbuffers-memory-efficient.html), a new serialization protocol and library with similar design principles to Cap'n Proto. Also, a few months back, Real Logic released [Simple Binary Encoding](http://mechanical-sympathy.blogspot.com/2014/05/simple-binary-encoding.html), another protocol and library of this nature.
It seems we now have some friendly rivalry. :)
It's great to see that the concept of `mmap()`-able, zero-copy serialization formats are catching on, and it's wonderful that all are open source under liberal licenses. But as a user, you might be left wondering how all these systems compare. You have a vague idea that all these encodings are "fast", particularly compared to Protobufs or other more-traditional formats. But there is more to a serialization protocol than speed, and you may be wondering what else you should be considering.
The goal of this blog post is to highlight some of the main _qualitative_ differences between these libraries as I see them. Obviously, I am biased, and you should consider that as you read. Hopefully, though, this provides a good starting point for your own investigation of the alternatives.
### Feature Matrix
The following are a set of considerations I think are important. See something I missed? Please [let me know](mailto:kenton@sandstorm.io) and I'll add it. I'd like in particular to invite the SBE and FlatBuffers authors to suggest advantages of their libraries that I may have missed.
I will go into more detail on each item below.
Note: For features which are properties of the implementation rather than the protocol or project, unless otherwise stated, I am judging the C++ implementations.
<tr><td>Authors' preferred use case</td><td>distributed<br>computing</td><td><ahref="https://sandstorm.io">platforms /<br>sandboxing</a></td><td>financial<br>trading</td><td>games</td></tr>
</table>
**Schema Evolution**
All four protocols allow you to add new fields to a schema over time, without breaking backwards-compatibility. New fields will be ignored by old binaries, and new binaries will fill in a default value when reading old data.
**Zero-copy**
The central thesis of all three competitors is that data should be structured the same way in-memory and on the wire, thus avoiding costly encode/decode steps.
Protobufs reperesents the old way of thinking.
**Random-access reads**
Can you traverse the message content in an arbitrary order? Relatedly, can you `mmap()` in a large (say, 2GB) file, then traverse to and read one particular field without causing the entire file to be paged in from disk?
Protobufs does not allow this because the entire file must be parsed upfront before any of the content can be used. Even with a streaming Protobuf parser (which most libraries don't provide), you would at least need to parse all data appearing before the bit you want.
SBE does not allow random access because the message tree is written in preorder with no information that would allow one to skip over an entire sub-tree. While the primitive fields within a single object can be accessed in random order, sub-objects must be traversed strictly in preorder. SBE apparently chose to design around this restriction because sequential memory access is faster than random access, therefore this forces application code to be ordered to be as fast as possible.
Cap'n Proto permits random access via the use of pointers, exactly as in-memory data structures in C normally do. These pointers are not quite native pointers -- they are relative rather than absolute, to allow the message to be loaded at an arbitrary memory location.
FlatBuffers permits random access by having each record store a table of offsets to all of the field positions.
**Safe against malicious input**
Protobufs is carefully designed to be resiliant in the face of all kinds of malicious input, and has undergone a security review by Google's world-class security team. Not only is the Protobuf implementation secure, but the API is explicitly designed to discourage security mistakes in application code. It is considered a security flaw in Protobufs if the interface makes client apps likely to write insecure code.
Cap'n Proto inherits Protocol Buffers' security stance, and is believed to be similarly secure. However, it has not yet undergone security review.
SBE's C++ library does bounds checking as of the resolution of [this bug](https://github.com/real-logic/simple-binary-encoding/issues/130).
FlatBuffers does no bounds checking. When reading a message, you start by giving the library a bare pointer to the start of the message, with no size. FlatBuffers appears to be intended for use as a format for static, trusted data files, not network messages.
**Reflection / generic algorithms**
Protobuf provides a "reflection" interface which allows dynamically iterating over all the fields of a message, getting their names and other metadata, and reading and modifying their values in a particular instance. Cap'n Proto also supports this, calling it the "Dynamic API". SBE and FlatBuffers provide no such API.
Having a reflection/dynamic API opens up a wide range of use cases. You can write reflection-based code which converts the message to/from another format such as JSON -- useful not just for interoperability, but for debugging, because it is human-readable. Another popular use of reflection is writing bindings for scripting languages. For example, Python's Cap'n Proto implementation is simply a wrapper around the C++ dynamic API. Note that you can do all these things with types that are not even known at compile time, by parsing the schemas at runtime.
The down side of reflection is that it is generally very slow (compared to generated code) and can lead to code bloat. Cap'n Proto is designed such that the reflection APIs need not be linked into your app if you do not use them, although this requires statically linking the library to get the benefit.
**Initialization order**
When building a message, depending on how your code is organized, it may be convenient to have flexibility in the order in which you fill in the data. If that flexibility is missing, you may find you have to do extra bookkeeping to store data off to the side until its time comes to be added to the message.
Protocol Buffers is natually completely flexible in terms of initialization order because the mesasge is being built on the heap. There is no reason to impose restrictions. (Although, the C++ Protobuf library heavily encourages top-down building.)
All the zero-copy systems, though, have to use some form of arena allocation to make sure that the message is built in a contiguous block of memory that can be written out all at once. So, things get more complicated.
SBE specifically requires the message tree to be written in preorder (though, as with reads, the primitive fields within a single object can be initialized in arbitrary order).
FlatBuffers requires that you completely finish one object before you can start building the next, because the size of an object depends on its content so the amount of space needed isn't known until it is finalized. This also implies that FlatBuffer messages must be built bottom-up, starting from the leaves.
Cap'n Proto imposes no ordering constraints. The size of an object is known when it is allocated, so more objects can be allocated immediately. Messages are normally built top-down, but bottom-up ordering is supported through the "orphans" API.
**Unknown field retention?**
Say you read in a message, then copy one sub-object of that message over to a sub-object of a new message, then write out the new message. Say that the copied object was created using a newer version of the schema than you have, and so contains fields you don't know about. Do those fields get copied over?
This question is extremely important for any kind of service that acts as a proxy or broker, forwarding messages on to others. It can be inconvenient if you have to update these middlemen every time a particular backend protocol changes, when the middlemen often don't care about the protocol details anyway.
When Protobufs sees an unknown field tag on the wire, it stores the value into the message's `UnknownFieldSet`, which can be copied and written back out later.
Cap'n Proto's wire format was very carefully designed to contain just enough information to make it possible to recursively copy its target from one message to another without knowing the object's schema. This is why Cap'n Proto pointers contain bits to indicate if they point to a struct or a list and how big it is -- seemingly redundant information.
SBE and FlatBuffers do not store any such type information on the wire, and thus it is not possible to copy an object without its schema.
**Object-capability RPC system**
Cap'n Proto features an object-capability RPC system. While this article is not intended to discuss RPC features, there is an important effect on the serialization format: in an object-capability RPC system, references to remote objects must be a first-class type. That is, a struct field's type can be "reference to remote object implementing RPC interface Foo".
Protobufs, SBC, and FlatBuffers do not support this type. Note that it is _not_ sufficient to simply store a string URL, or define some custom struct to represent a reference, because a proper capability-based RPC system must be aware of all references embedded in any message it sends. There are many reasons for this requirement, the most obvious of which is that the system must export the reference or change its permissions to make it available to the receiver.
**Schema language**
Protobufs, Cap'n Proto, and FlatBuffers have custom, concise schema languages.
SBE uses XML schemas, which are verbose.
**Usable as mutable state**
Protobuf generated classes have often been (ab)used as a convenient way to store an application's mutable internal state. There's mostly no problem with modifying a message gradually over time and then serializing it when needed.
This usage pattern does not work well with any zero-copy serialization format because these formats must use arena-style allocation to make sure the message is built in contiguous memory. Arena allocation has the property that you cannot free any object unless you free the entire arena. Therefore, when objects are discarded, the memory ends up leaked until the message as a whole is destoryed. A long-lived message that is modified many times will thus leak memory.
**Padding on wire?**
Does the protocol tend to write a lot of zero-valued padding bytes to the wire?
This is a problem with zero-copy protocols: fixed-width integers tend to have a lot of zeros in the high-order bits, and padding sometimes needs to be inserted for alignment. This padding can easily double or triple the size of a message.
Protocol Buffers avoids padding by encoding integers using variable widths, which is only possible given a separate encoding/decoding step.
SBE and FlatBuffers leave the padding in to achieve zero-copy.
Cap'n Proto normally leaves the padding in, but comes with a built-in option to apply a very fast compression algorithm called "packing" which aims only to deflate zeros. This algorithm tends to achieve similar sizes to Protobufs while still being faster (and _much_ faster than general-purpose compression).
Note that Cap'n Proto's packing algorithm would be appropriate for SBE and FlatBuffers as well. Feel free to steal it. :)
**Unset fields on wire?**
If a field has not been explicitly assigned a value, will it take any space on the wire?
Protobuf encodes tag-value pairs, so it simply skips pairs that have not been set.
Cap'n Proto and SBE position fields at fixed offsets from the start of the struct. The struct is always allocated large enough for all known fields according to the schema. So, unused fields waste space. (But Cap'n Proto's optional packing will tend to compress away this space.)
FlatBuffers uses a separate table of offsets (the vtable) to indicate the position of each field, with zero meaning the field isn't present. So, unset fields take no space on the wire -- although they do take space in the vtable. vtables can apparently be shared between instances where the offsets are all the same, amortizing this cost.
**Pointers on wire?**
Do non-primitive fields require storing a pointer?
Protobufs uses tag-length-value for variable-width fields.
Cap'n Proto uses pointers for variable-width fields, so that the size of the parent object is independent of the size of any children. These pointers take some space on the wire.
SBE requires variable-width fields to be embedded in preorder, which means pointers aren't necessary.
FlatBuffers also uses pointers, even though most objects are variable-width, possibly because the vtables only store 16-bit offsets, limiting the size of any one object.
**Platform Support**
A really huge weakness of Cap'n Proto today is that it doesn't compile in Visual Studio, and therefore effectively doesn't support Windows (unless you count Cygwin, but most people don't).
The problem initially was that Cap'n Proto makes liberal use of C++11 features, and MSVC has lagged behind in implementing them. We considered, but balked at, forking and backporting. It looks like [VS14 CTP1](http://blogs.msdn.com/b/vcblog/archive/2014/06/03/visual-studio-14-ctp.aspx) may finally be far enough to make a port practical, so now it's just a matter of finding the time. But that's a new problem: all my time is currently consumed by [Sandstorm.io](https://sandstorm.io), and while it is a major use of Cap'n Proto and will thus drive further development, it is entirely Linux-based, making it hard for me to justify spending my time on Windows support. What we need is a volunteer!
### Benchmarks?
I do not provide benchmarks. I did not provide them when I launched Protobufs, nor when I launched Cap'n Proto, even though I had some with nice numbers (which you can find in git). And I don't see any reason to start now.
Why? Because they would tell you nothing. I could easily construct a benchmark to make any given library "win", by exploiting the relative tradeoffs each one makes. I can even construct one where Protobufs -- supposedly infinitely slower than the others -- wins.
The fact of the matter is that the relative performance of these libraries depends deeply on the use case. To know which one will be fastest for _your_ project, you really need to benchmark them in _your_ project, end-to-end. No contrived benchmark will give you the answer.
With that said, my intuition is that SBE will probably edge Cap'n Proto and FlatBuffers on performance in the average case, due to its decision to forgo support for random access. Between Cap'n Proto and FlatBuffers, it's harder to say. FlatBuffers' vtable approach seems like it would make access more expensive, though its simpler pointer format may be cheaper to follow. FlatBuffers also appears to do a lot of bookkeeping at encoding time which could get costly (such as de-duping vtables), but I don't know how costly.
For most people, the performance difference is probably small enough that qualitative (feature) differences in the libraries matter more.