Lots of documentation updates for 0.5.

425657bb · Kenton Varda · ab1fc490
Commit 425657bb authored Dec 11, 2013 by Kenton Varda
9 changed files
--- a/doc/_includes/header.html
+++ b/doc/_includes/header.html
@@ -43,6 +43,7 @@
          <li><a href="{{ site.baseurl }}cxxrpc.html">C++ RPC</a></li>
          <li><a href="{{ site.baseurl }}otherlang.html">Other Languages</a></li>
          <li><a href="{{ site.baseurl }}roadmap.html">Road Map</a></li>
+          <li><a href="{{ site.baseurl }}faq.html">FAQ</a></li>
        </ul>
      </section>
      <section id="main_content" class="inner">
--- a/doc/cxx.md
+++ b/doc/cxx.md
 ---
 layout: page
-title: C++ Runtime
+title: C++ Serialization
 ---

 # C++ Serialization

--- a/doc/cxxrpc.md
+++ b/doc/cxxrpc.md
 ---
 layout: page
-title: C++ Runtime
+title: C++ RPC
 ---

 # C++ RPC
@@ -402,3 +402,9 @@ path name.

 For a more complete example, see the
 [calculator server sample](https://github.com/kentonv/capnproto/tree/master/c++/samples/calculator-server.c++).
+
+## Debugging
+
+If you've written a server and you want to connect to it to issue some calls for debugging, perhaps
+interactively, the easiest way to do it is to use [pycapnp](http://jparyani.github.io/pycapnp/).
+The `capnp` tool probably will never add RPC functionality because pycapnp is better.
--- a/doc/faq.md
+++ b/doc/faq.md
+---
+layout: page
+title: FAQ
+---
+
+# FAQ
+
+## Design
+
+### Isn't I/O bandwidth more important than CPU usage?  Is Cap'n Proto barking up the wrong tree?
+
+It depends.  What is your use case?
+
+Are you communicating between two processes on the same machine?  If so, you have unlimited
+bandwidth, and you should be entirely concerned with CPU.
+
+Are you communicating between two machines within the same datacenter?  If so, it's unlikely that
+you will saturate your network connection before your CPU.  Possible, but unlikely.
+
+Are you communicating across the general internet?  In that case, bandwidth is probably your main
+concern.  Luckily, Cap'n Proto lets you choose to enable "packing" in this case, achieving similar
+encoding size to Protocol Buffers while still being faster.  And you can always add extra
+compression on top of that.
+
+### Have you considered building the RPC system on ZeroMQ?
+
+ZeroMQ (and its successor, Nanomsg) is a powerful technology for distributed computing.  Its
+design focuses on scenarios involving lots of stateless, fault-tolerant worker processes
+communicating via various patterns, such as request/response, produce/consume, and
+publish/subscribe.  For big data processing where armies of stateless nodes make sense, pairing
+Cap'n Proto with ZeroMQ would be an excellent choice -- and this is easy to do today, as ZeroMQ
+is entirely serialization-agnostic.
+
+That said, Cap'n Proto RPC takes a very different approach.  Cap'n Proto's model focuses on
+stateful servers interacting in complex, object-oriented ways.  The model is better suited to
+tasks involving applications with many heterogeneous components and interactions between
+mutually-distrusting parties.  Requests and responses can go in any direction.  Objects have
+state, and two calls to the same object had best be implemented on the same machine.  Fault
+tolerance is pushed up the stack, because without a large pool of homogeneous work there's just
+no way to make it transparent at a low level.
+
+Put concretely, you might build a search engine on ZeroMQ, but an online interactive spreadsheet
+editor would be better built on Cap'n Proto RPC.
+
+### Aren't messages that contain pointers a huge security problem?
+
+Not at all.  Cap'n Proto bounds-checks each pointer when it is read and throws an exception or
+returns a safe dummy value (your choice) if the pointer is out-of-bounds.
+
+### So it's not that you've eliminated parsing, you've just moved it to happen lazily?
+
+No.  Compared to Protobuf decoding, the time spent validating pointers while traversing a Cap'n
+Proto message is negligible.
+
+### I think I heard somewhere that capability-based security doesn't work?
+
+This was a popular myth in security circles way back in the '80s and '90s, based on an incomplete
+understanding of how to use capabilities effectively.  Read
+[Capability Myths Demolished](http://srl.cs.jhu.edu/pubs/SRL2003-02.pdf).
+
+## Usage
+
+### How do I make a field "required", like in Protocol Buffers?
+
+You don't.  You may find this surprising, but the "required" keyword in Protocol Buffers turned
+out to be a horrible mistake.
+
+For background, in Protocol Buffers, a field could be marked "required" to indicate that parsing
+should fail if the sender forgot to set the field before sending the message.  Required fields were
+encoded exactly the same as optional ones; the only difference was the extra validation.
+
+The problem is that validation is sometimes more subtle than that.  Sometimes, different
+applications -- or different parts of the same application, or different versions of the same
+application -- place different requirements on the same protocol.  An application may want to
+pass around partially-complete messages internally.  A particular field that used to be required
+might become optional.  A new use case might call for almost exactly the same message type, minus
+one field, at which point it may make more sense to reuse the type than to define a new one.
+
+A field declared required, unfortunately, is required everywhere.  The validation is baked into
+the parser, and there's nothing you can do about it.  Nothing, that is, except change the field
+from "required" to "optional".  But that's where the _real_ problems start.
+
+Imagine a production environment in which two servers, Alice and Bob, exchange messages through a
+message bus infrastructure running on a big corporate network.  The message bus parses each message
+just to examine the envelope and decide how to route it, without paying attention to any other
+content.  Often, messages from various applications are batched together and then split up again
+downstream.
+
+Now, at some point, Alice's developers decide that one of the fields in a deeply-nested message
+commonly sent to Bob has become obsolete.  To clean things up, they decide to remove it, so they
+change the field from "required" to "optional".  The developers aren't idiots, so they realize that
+Bob needs to be updated as well.  They make the changes to Bob, and just to be thorough they
+run an integration test with Alice and Bob running in a test environment.  The test environment
+is always running the latest build of the message bus, but that's irrelevant anyway because the
+message bus doesn't actually care about message contents; it only does routing.  Protocols are
+modified all the time without updating the message bus.
+
+Satisfied with their testing, the devs push a new version of Alice to prod.  Immediately,
+everything breaks.  And by "everything" I don't just mean Alice and Bob.  Completely unrelated
+servers are getting strange errors or failing to receive messages.  The whole data center has
+ground to a halt and the sysadmins are running around with their hair on fire.
+
+What happened?  Well, the message bus running in prod was still an older build from before the
+protocol change.  And even though the message bus doesn't care about message content, it _does_
+need to parse every message just to read the envelope.  And the protobuf parser checks the _entire_
+message for missing required fields.  So when Alice stopped sending that newly-optional field, the
+whole message failed to parse, envelope and all.  And to make matters worse, any other messages
+that happened to be in the same batch _also_ failed to parse, causing errors in seemingly-unrelated
+systems that share the bus.
+
+Things like this have actually happened.  At Google.  Many times.
+
+The right answer is for applications to do validation as-needed in application-level code.  If you
+want to detect when a client fails to set a particular field, give the field an invalid default
+value and then check for that value on the server.  Low-level infrastructure that doesn't care
+about message content should not validate it at all.
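+
+As a rough sketch of that approach (the `Person` struct, its field names, and the choice of `-1`
+as the "unset" marker are all made up for illustration), the schema might look like:
+
+{% highlight capnp %}
+struct Person {
+  name @0 :Text;
+  age @1 :Int32 = -1;  # Hypothetical sentinel: -1 means the client never set the age.
+}
+{% endhighlight %}
+
+A server that needs an age would then reject any message in which `age` still reads `-1`, while
+the message bus in the story above would never look at the field at all.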
+
+Oh, and also, Cap'n Proto doesn't have any parsing step during which to check for required
+fields.  :)
+
+### How do I make a field optional?
+
+Cap'n Proto has no notion of "optional" fields.
+
+A primitive field always takes space on the wire whether you set it or not (although default-valued
+fields will be compressed away if you enable packing).  Such a field can be made semantically
+optional by placing it in a union with a `Void` field:
+
+{% highlight capnp %}
+union {
+  age @0 :Int32;
+  ageUnknown @1 :Void;
+}
+{% endhighlight %}
+
+However, this field still takes space on the wire, and in fact takes an extra 16 bits of space
+for the union tag.  A better approach may be to give the field a bogus default value and interpret
+that value to mean "not present".
+
+Pointer fields are a bit different.  They start out "null", and you can check for nullness using
+the `hasFoo()` accessor.  You could use a null pointer to mean "not present".  Note, though, that
+calling `getFoo()` on a null pointer returns the default value, which is indistinguishable from a
+legitimate value.  The receiver of the message therefore needs to explicitly check `hasFoo()`
+before calling the getter.
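+
+For instance, assuming a made-up schema like the following (the `Person` and `Address` names are
+hypothetical), `homeAddress` is a pointer field and so starts out null:
+
+{% highlight capnp %}
+struct Address {
+  street @0 :Text;
+}
+
+struct Person {
+  name @0 :Text;
+  homeAddress @1 :Address;  # Struct-typed, so encoded as a pointer; null until initialized.
+}
+{% endhighlight %}
+
+Following the `hasFoo()` / `getFoo()` pattern above, a receiver would check `hasHomeAddress()`
+before treating the result of `getHomeAddress()` as real data.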
+
+### How do I resize a list?
+
+Unfortunately, you can't.  You have to know the size of your list upfront, before you initialize
+any of the elements.  This is an annoying side effect of arena allocation, which is a fundamental
+part of Cap'n Proto's design:  in order to avoid making a copy later, all of the pieces of the
+message must be allocated in a tightly-packed segment of memory, with each new piece being added
+to the end.  If a previously-allocated piece is discarded, it leaves a hole, which wastes space.
+Since Cap'n Proto lists are flat arrays, the only way to resize a list would be to discard the
+existing list and allocate a new one, which would thus necessarily waste space.
+
+In theory, a more complicated memory allocation algorithm could attempt to reuse the "holes" left
+behind by discarded message pieces.  However, it would be hard to make sure any new data inserted
+into the space is exactly the right size.  Fragmentation would result.  And the allocator would
+have to do a lot of extra bookkeeping that could be expensive.  This would be sad, as arena
+allocation is supposed to be cheap!
+
+The only solution is to temporarily place your data into some other data structure (an
+`std::vector`, perhaps) until you know how many elements you have, then allocate the list and copy.
+On the bright side, you probably aren't losing much performance this way -- using vectors already
+involves making copies every time the backing array grows.  It's just annoying to code.
+
+Keep in mind that you can use [orphans](cxx.html#orphans) to allocate sub-objects before you have
+a place to put them.  But note that you cannot allocate elements of a struct list as orphans
+and then put them together as a list later, because struct lists are encoded as a flat array of
+struct values, not an array of pointers to struct values.  You can, however, allocate any inner
+objects embedded within those structs as orphans.
+
+## Personal
+
+### Who is paying you?
+
+Nobody.  That is, aside from [Gittip](https://www.gittip.com/kentonv/) contributors (thanks!).
+
+### Can I hire you?
+
+If you would like to purchase my services on a contract basis specifically to work on features you
+need in Cap'n Proto or to help you apply Cap'n Proto within your business, I'd be happy to talk.
+
+If you are simply looking for engineers in general and are interested in me because you are
+impressed with my work, I am flattered.  It is especially nice when such inquiries come from other
+engineers rather than from headhunters.  However, I am not looking for employment at this time,
+unless it is directly advancing Cap'n Proto.
--- a/doc/install.md
+++ b/doc/install.md
@@ -22,9 +22,9 @@ you should keep in mind some caveats:
 * **Performance:** While Cap'n Proto is inherently fast by design, the implementation has not yet
  undergone serious profiling and optimization.  Currently it only beats Protobufs in realistic-ish
  end-to-end benchmarks by around 2x-5x.  We can do better.
-* **RPC:** The RPC protocol has not yet been specified, much less implemented.
-* **Support for languages other than C++:** Work is being done to support languages other than C++,
-  but at this time only the C++ implementation is ready to be used.
+* **RPC:** The RPC implementation is very new (introduced in v0.4 / Dec 2013).  It is missing many
+  features that are essential in real-world use (like timeouts), the interface is still in flux,
+  and it needs a lot of optimization work.

 If you'd like to hack on Cap'n Proto, you should join the
 [discussion group](https://groups.google.com/group/capnproto)!
@@ -38,6 +38,9 @@ The Cap'n Proto tools, including the compiler which takes `.capnp` files and gen
 for them, are written in C++.  Therefore, you must install the C++ package even if your actual
 development language is something else.

+This package is licensed under the
+[BSD 2-Clause License](http://opensource.org/licenses/BSD-2-Clause).
+
 ### GCC 4.7 or Clang 3.2 Needed

 If you are using GCC, you MUST use at least version 4.7 as Cap'n Proto uses recently-implemented
@@ -47,18 +50,21 @@ need to set the environment variable `CXX=g++-4.7` before following the instruct
 If you are using Clang, you must use at least version 3.2.  To use Clang, set the environment
 variable `CXX=clang++` before following any instructions below, otherwise `g++` is used by default.

-This package is officially tested on Linux (GCC 4.7, Clang 3.2), Mac OSX (Clang 3.2), and Cygwin
-(Windows; GCC 4.7), in 32-bit and 64-bit modes.
+This package is officially tested on Linux (GCC 4.7, GCC 4.8, Clang 3.2), Mac OSX (Xcode 5), and
+Cygwin (Windows; GCC 4.8), in 32-bit and 64-bit modes.

-**Mac OSX users:**  Don't miss the [special instructions for OSX](#clang_32_on_mac_osx).
+Mac/Xcode users:  You must use at least Xcode 5, and you must download the Xcode command-line tools
+under Xcode menu > Preferences > Downloads.  Alternatively, compiler builds from
+[Macports](http://www.macports.org/), [Fink](http://www.finkproject.org/), or
+[Homebrew](http://brew.sh/) are reported to work.

 ### Building from a release package

 You may download and install the release version of Cap'n Proto like so:

-<pre><code>curl -O <a href="http://capnproto.org/capnproto-c++-0.2.1.tar.gz">http://capnproto.org/capnproto-c++-0.2.1.tar.gz</a>
-tar zxf capnproto-c++-0.2.1.tar.gz
-cd capnproto-c++-0.2.1
+<pre><code>curl -O <a href="http://capnproto.org/capnproto-c++-0.0.0.tar.gz">http://capnproto.org/capnproto-c++-0.0.0.tar.gz</a>
+tar zxf capnproto-c++-0.0.0.tar.gz
+cd capnproto-c++-0.0.0
 ./configure
 make -j6 check
 sudo make install</code></pre>
@@ -85,46 +91,6 @@ installed (in addition to Git) in order to fetch the Google Test sources (done b
    make -j6 check
    sudo make install

-### Clang 3.2 on Mac OSX
-
-As of this writing, Mac OSX 10.8 with Xcode 4.6 command-line tools is not quite good enough to
-compile Cap'n Proto.  The included version of GCC is ancient.  The included version of Clang --
-which mysteriously advertises itself as version 4.2 -- was actually cut from LLVM SVN somewhere
-between versions 3.1 and 3.2; it is not sufficient to build Cap'n Proto.
-
-There are two options:
-
-1. Use [Macports](http://www.macports.org/), [Fink](http://www.finkproject.org/), or
-   [Homebrew](http://brew.sh/) to get an up-to-date GCC.
-2. Obtain Clang 3.2
-   [directly from the LLVM project](http://llvm.org/releases/download.html).  (Unfortunately,
-   Clang 3.3 apparently does NOT work, because the libc++ headers shipped with XCode contain
-   bugs that Clang 3.3 refuses to compile.)
-
-Option 2 is the one preferred by Cap'n Proto's developers.  Here are step-by-step instructions
-for setting this up:
-
-1. Get the Xcode command-line tools:  Download Xcode from the app store.  Then, open Xcode,
-   go to Xcode menu > Preferences > Downloads, and choose to install "Command Line Tools".
-2. Download the Clang 3.2 binaries and put them somewhere easy to remember:
-
-       curl -O http://llvm.org/releases/3.2/clang+llvm-3.2-x86_64-apple-darwin11.tar.gz
-       tar zxf clang+llvm-3.2-x86_64-apple-darwin11.tar.gz
-       mv clang+llvm-3.2-x86_64-apple-darwin11 ~/clang-3.2
-
-3. We will need to use libc++ (from LLVM) rather than libstdc++ (from GNU) because Xcode's
-   libstdc++ (like its GCC) is too old.  In order for your freshly-downloaded Clang binaries to
-   be able to find it, you'll need to symlink it into the Clang tree:
-
-       ln -s /usr/lib/c++ ~/clang-3.2/lib/c++
-
-You may now follow the instructions below, but make sure to tell `configure` to use your
-newly-downloaded Clang binary:
-
-    ./configure CXX=$HOME/clang-3.2/bin/clang++
-
-Hopefully, Xcode 5.0 will be released soon with a newer Clang, making this extra work unnecessary.
-
 ### Building with Ekam

 Ekam is a build system I wrote a while back that automatically figures out how to build your C++

--- a/doc/language.md
+++ b/doc/language.md
@@ -280,8 +280,8 @@ group" in Cap'n Proto, which was the case that got into the most trouble with Pr

 ### Dynamically-typed Fields

-A struct may have a field with type `Object`.  This field's value can be of any pointer type -- i.e.
-any struct, interface, list, or blob.  This is essentially like a `void*` in C.
+A struct may have a field with type `AnyPointer`.  This field's value can be of any pointer type --
+i.e. any struct, interface, list, or blob.  This is essentially like a `void*` in C.

 ### Enums

@@ -498,7 +498,9 @@ annotation myAnnotation(struct) :Int32 $baz(10);
 const myConst :Int32 = 123 $baz(11);
 {% endhighlight %}

-`Void` annotations can omit the value.  Struct-typed annotations are also allowed.
+`Void` annotations can omit the value.  Struct-typed annotations are also allowed.  Tip:  If
+you want an annotation to have a default value, declare it as a struct with a single field with
+a default value.

 {% highlight capnp %}
 annotation qux(struct, field) :Void;
@@ -511,6 +513,15 @@ struct MyStruct $qux {
 annotation corge(file) :MyStruct;

 $corge(string = "hello", number = 123);
+
+struct Grault {
+  value @0 :Int32 = 123;
+}
+
+annotation grault(file) :Grault;
+
+$grault();  # value defaults to 123
+$grault(value = 456);
 {% endhighlight %}

 ### Unique IDs

--- a/doc/otherlang.md
+++ b/doc/otherlang.md
@@ -17,14 +17,20 @@ maintained by respective authors and have not been reviewed by me

 ##### Works In Progress

+* [C](https://github.com/jmckaskill/c-capnproto) by [@jmckaskill](https://github.com/jmckaskill)
 * [Erlang](http://ecapnp.astekk.se/) by [@kaos](https://github.com/kaos)
+* [Go](https://github.com/jmckaskill/go-capnproto) by [@jmckaskill](https://github.com/jmckaskill)
 * [Ruby](https://github.com/cstrahan/capnp-ruby) by [@cstrahan](https://github.com/cstrahan)
 * [Rust](https://github.com/dwrensha/capnproto-rust) by [@dwrensha](https://github.com/dwrensha)

-##### Inactive
+##### Non-language Projects

-* [C and Go](https://github.com/jmckaskill/go-capnproto) by
-  [@jmckaskill](https://github.com/jmckaskill)
+These are other miscellaneous projects related to Cap'n Proto that are not actually implementations in
+new languages.
+
+* [Common Test Framework](https://github.com/kaos/capnp_test) by [@kaos](https://github.com/kaos)
+* [Vim Syntax Highlighting](https://github.com/cstrahan/vim-capnp) by
+  [@cstrahan](https://github.com/cstrahan)

 ## Contribute Your Own!


--- a/doc/roadmap.md
+++ b/doc/roadmap.md
@@ -5,23 +5,26 @@ title: Road Map

 # Road Map

-Here's what's (hopefully) in store for future versions of Cap'n Proto!
+Here's what's (hopefully) in store for future versions of Cap'n Proto!  Of course, everything here
+is subject to change.

-## Next version: 0.4
+## Next version: 0.5

-These features are planned for version 0.4.  Some may get pushed back depending on how long
-they take.
+* **Shared memory RPC:**  Zero-copy inter-process communication.  Synchronized with futexes.
+* **Persistent capabilities (level 2 RPC):**  Allow system-assisted saving and restoring of
+  capabilities across connections.
+* **Three-way introductions (level 3 RPC):**  Allow RPC interactions between more than two parties,
+  with new connections formed automatically as needed.
+* **Fiber-based concurrency:**  The C++ runtime's event loop concurrency model will be augmented
+  with support for fibers, which are like threads except that context switches happen only at
+  well-defined points (thus avoiding the need for mutex locking).  Fibers essentially provide
+  syntax sugar on top of the event loop model.
+* **Dynamic schema transmission:**  Allow e.g. Python applications to obtain schemas directly from
+  the RPC server so that they need not have a local copy.  Great for interactive debugging.

-* **Generate code for interfaces**
-* **Define RPC protocol**
-* **Implement RPC transports**
-  * **Stream:**  Standard TCP-based RPC.
-  * **Datagram:**  Low-latency UDP-based RPC.
-  * **Shared memory:**  Zero-copy inter-process communication.  Synchronized with futexes.
+## Near future (after 0.5)

-## Near future (after 0.4)
-
-Provisionally, these are probably the things that will be worked on after 0.4.
+Provisionally, these are probably the things that will be worked on after 0.5.

 * **C++98 Backport:**  Fork and backport the key functionality from the C++11 implementation to
  work under C++98/C++03.  This will make Cap'n Proto accessible to MSVC users.  The schema and
@@ -33,10 +36,6 @@ Provisionally, these are probably the things that will be worked on after 0.4.
  nicer interface which encapsulates the type's inner state.
 * **Implement maps:**  Based on encapsulated and parameterized types.

-Note also that after 0.4, Kenton plans to begin devoting some of his time to another project
-built on top of Cap'n Proto.  Experience from this project will help guide future changes to
-Cap'n Proto itelf.
-
 ## Before version 1.0

 These things absolutely must happen before any 1.0 release.  Note that it's not yet decided when

--- a/doc/rpc.md
+++ b/doc/rpc.md
@@ -22,7 +22,7 @@ on its result, i.e. `bar(foo())`.  Or -- as is very common in object-oriented pr
 want to call a method on the result of another call, i.e. `foo().bar()`.  With any traditional RPC
 system, this will require two network round trips.  With Cap'n Proto, it takes only one.  In fact,
 you can chain any number of such calls together -- with diamond dependencies and everything -- and
-Cap'n Proto will collapse them all into one call.
+Cap'n Proto will collapse them all into one round trip.

 By now you can probably imagine how it works:  if you execute `bar(foo())`, the client sends two
 messages to the server, one saying "Please execute foo()", and a second saying "Please execute