---
layout: page
---

# Defining Types

Like Protocol Buffers and Thrift (but unlike JSON or MessagePack), Cap'n Proto messages are
strongly-typed and not self-describing. You must define your message structure in a special
language, then invoke the Cap'n Proto compiler (`capnpc`) to generate source code to manipulate
that message type in your desired language.

For example:

{% highlight capnp %}
# unique file ID, generated by capnpc -i
@0xdbb9ad1f14bf0b36;

struct Person {
  name @0 :Text;
  birthdate @3 :Date;

  email @1 :Text;
  phones @2 :List(PhoneNumber);

  struct PhoneNumber {
    number @0 :Text;
    type @1 :Type;

    enum Type {
      mobile @0;
      home @1;
      work @2;
    }
  }
}

struct Date {
  year @0 :Int16;
  month @1 :UInt8;
  day @2 :UInt8;
}
{% endhighlight %}

Some notes:

* Types come after names. The name is by far the most important thing to see, especially when
  quickly skimming, so we put it up front where it is most visible.  Sorry, C got it wrong.
* The `@N` annotations show how the protocol evolved over time, so that the system can make sure
  to maintain compatibility with older versions. Fields (and enumerants, and interface methods)
  must be numbered consecutively starting from zero in the order in which they were added. In this
  example, it looks like the `birthdate` field was added to the `Person` structure recently -- its
  number is higher than the `email` and `phones` fields. Unlike Protobufs, you cannot skip numbers
  when defining fields -- but there was never any reason to do so anyway.

## Language Reference

### Comments

Comments are indicated by hash signs and extend to the end of the line:

{% highlight capnp %}
# This is a comment.
{% endhighlight %}

Comments meant as documentation should appear _after_ the declaration, either on the same line, or
on a subsequent line. Doc comments for aggregate definitions should appear on the line after the
opening brace.

{% highlight capnp %}
struct Date {
  # A standard Gregorian calendar date.

  year @0 :Int16;
  # The year.  Must include the century.
  # Negative value indicates BC.

  month @1 :UInt8;   # Month number, 1-12.
  day @2 :UInt8;     # Day number, 1-31.
}
{% endhighlight %}

Placing the comment _after_ the declaration rather than before makes the code more readable,
especially when doc comments grow long. You almost always need to see the declaration before you
can start reading the comment.

### Built-in Types

The following types are automatically defined:

* **Void:** `Void`
* **Boolean:** `Bool`
* **Integers:** `Int8`, `Int16`, `Int32`, `Int64`
* **Unsigned integers:** `UInt8`, `UInt16`, `UInt32`, `UInt64`
* **Floating-point:** `Float32`, `Float64`
* **Blobs:** `Text`, `Data`
* **Lists:** `List(T)`

Notes:

* The `Void` type has exactly one possible value, and thus can be encoded in zero bits. It is
  rarely used, but can be useful as a union member.
* `Text` is always UTF-8 encoded and NUL-terminated.
* `Data` is a completely arbitrary sequence of bytes.
* `List` is a parameterized type, where the parameter is the element type. For example,
  `List(Int32)`, `List(Person)`, and `List(List(Text))` are all valid.
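
As a rough illustration of the built-in types in use, the following sketch declares one field of
each kind. The struct and field names (`Sample`, `flag`, and so on) are made up for this example.

{% highlight capnp %}
struct Sample {
  # Hypothetical struct; all names are for illustration only.

  nothing @0 :Void;                # Exactly one possible value; zero bits.
  flag @1 :Bool;
  offset @2 :Int32;
  count @3 :UInt64;
  ratio @4 :Float64;
  label @5 :Text;                  # UTF-8, NUL-terminated.
  payload @6 :Data;                # Arbitrary bytes.
  matrix @7 :List(List(Float32));  # List element types may themselves be lists.
}
{% endhighlight %}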

### Structs

A struct has a set of named, typed fields, numbered consecutively starting from zero.

{% highlight capnp %}
struct Person {
  name @0 :Text;
  email @1 :Text;
}
{% endhighlight %}

Fields can have default values:


{% highlight capnp %}
foo @0 :Int32 = 123;
bar @1 :Text = "blah";
baz @2 :List(Bool) = [ true, false, false, true ];
qux @3 :Person = (name = "Bob", email = "bob@example.com");
corge @4 :Void = void;
{% endhighlight %}


### Unions

A union is two or more fields of a struct which are stored in the same location. Only one of
these fields can be set at a time, and a separate tag is maintained to track which one is
currently set. Unlike in C, unions are not types; they are simply properties of fields. Union
declarations therefore do not look like types.

{% highlight capnp %}
struct Person {
  # ...

  employment @4 union {
    unemployed @5 :Void;
    employer @6 :Company;
    school @7 :School;
    selfEmployed @8 :Void;
    # We assume that a person is only one of these.
  }
}
{% endhighlight %}

Notes:

* Unions and their members are numbered in the same number space as fields of the containing
  struct. Remember that the purpose of the numbers is to indicate the evolution order of the
  struct. The system needs to know when the union and each of its members was declared relative to
  the non-union fields. Also note that no more than one element of the union is allowed to have a
  number less than the union's number, as unionizing two or more pre-existing fields would change
  their layout.

* Notice that we used the "useless" `Void` type here. We don't have any extra information to store
  for the `unemployed` or `selfEmployed` cases, but we still want the union to distinguish these
  states from others.
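
To illustrate the numbering rule, here is a sketch of a union retrofitted around a field that
already existed. It assumes a hypothetical `Shape` struct (not part of the examples above) in which
`radius` was originally an ordinary field numbered `@0`, and the union and its other members were
added later.

{% highlight capnp %}
struct Shape {
  # Hypothetical struct for illustration.

  dimensions @1 union {
    radius @0 :Float64;
    # Existed before the union did, so its number is lower than the union's.
    # At most one member may be like this.

    sideLength @2 :Float64;
    # Added along with the union, so numbered after it.

    unknown @3 :Void;
    # Added later still.
  }
}
{% endhighlight %}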

### Enums

An enum is a type with a small finite set of symbolic values.

{% highlight capnp %}
enum Rfc3092Variable {
  foo @0;
  bar @1;
  baz @2;
  qux @3;
  # ...
}
{% endhighlight %}

Like fields, enumerants must be numbered sequentially starting from zero. In languages where
enums have numeric values, these numbers will be used, but in general Cap'n Proto enums should not
be considered numeric.

### Interfaces

An interface has a collection of methods, each of which takes some parameters and returns a
result. Like struct fields, methods are numbered.

{% highlight capnp %}
interface Directory {
  list @0 () :List(FileInfo);
  create @1 (name :Text) :FileInfo;
  open @2 (name :Text) :FileInfo;
  delete @3 (name :Text) :Void;
  link @4 (name :Text, file :File) :Void;
}

struct FileInfo {
  name @0 :Text;
  size @1 :UInt64;
  file @2 :File;   # A pointer to a File.
}

interface File {
  read @0 (startAt :UInt64 = 0, amount :UInt64 = 0xffffffffffffffff) :Data;
  # Default params = read entire file.

  write @1 (startAt :UInt64, data :Data) :Void;
  truncate @2 (size :UInt64) :Void;
}
{% endhighlight %}

Notice something interesting here: `FileInfo` is a struct, but it contains a `File`, which is an
interface. Structs (and primitive types) are passed over RPC by value, but interfaces are passed by
reference. So when `Directory.open` is called remotely, the content of a `FileInfo` (including
values for `name` and `size`) is transmitted back, but for the `file` field, only the address of
some remote `File` object is sent.

When the address of an object is transmitted, the RPC system automatically ensures that
the recipient gets permission to call the addressed object -- because if the recipient wasn't
meant to have access, the sender shouldn't have sent the reference in the first place. This makes
it very easy to develop secure protocols with Cap'n Proto -- you almost don't need to think about
access control at all. This feature is what makes Cap'n Proto a "capability-based" RPC system -- a
reference to an object inherently represents a "capability" to access it.

### Constants

You can define constants in Cap'n Proto.  These don't affect what is sent on the wire, but they
will be included in the generated code.

{% highlight capnp %}
const pi :Float32 = 3.14159;
const bob :Person = (name = "Bob", email = "bob@example.com");
{% endhighlight %}

### Nesting, Scope, and Aliases

You can nest constant, alias, and type definitions inside structs and interfaces (but not enums).
This has no effect on any definition involved except to define the scope of its name. So in Java
terms, inner classes are always "static". To name a nested type from another scope, separate the
path with `.`s.

{% highlight capnp %}
struct Foo {
  struct Bar {
    #...
  }
  bar @0 :Bar;
}

struct Baz {
  bar @0 :Foo.Bar;
}
{% endhighlight %}

If typing long scopes becomes cumbersome, you can use `using` to declare an alias.

{% highlight capnp %}
struct Qux {
  using Foo.Bar;
  bar @0 :Bar;
}

struct Corge {
  using T = Foo.Bar;
  bar @0 :T;
}
{% endhighlight %}

### Imports

An `import` expression names the scope of some other file:

{% highlight capnp %}
struct Foo {
  # Use type "Baz" defined in bar.capnp.
  baz @0 :import "bar.capnp".Baz;
}
{% endhighlight %}

Of course, typically it's more readable to define an alias:

{% highlight capnp %}
using Bar = import "bar.capnp";

struct Foo {
  # Use type "Baz" defined in bar.capnp.
  baz @0 :Bar.Baz;
}
{% endhighlight %}

Or even:

{% highlight capnp %}
using import "bar.capnp".Baz;

struct Foo {
  baz @0 :Baz;
}
{% endhighlight %}

The above imports specify relative paths.  If the path begins with a `/`, it is absolute -- in
this case, `capnpc` searches for the file in each of the search path directories specified with
`-I`.
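
For instance, an absolute import might look like the following sketch. The path
`/common/bar.capnp` is hypothetical; it would be looked up under each directory passed with `-I`.

{% highlight capnp %}
# Resolved against the -I search path, not relative to this file.
using Bar = import "/common/bar.capnp";

struct Foo {
  baz @0 :Bar.Baz;
}
{% endhighlight %}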

### Annotations

Sometimes you want to attach extra information to parts of your protocol that isn't part of the
Cap'n Proto language.  This information might control details of a particular code generator, or
you might even read it at run time to assist in some kind of dynamic message processing.  For
example, you might create a field annotation which means "hide from the public", and when you send
a message to an external user, you might invoke some code first that iterates over your message and
removes all of these hidden fields.

You may declare annotations and use them like so:

{% highlight capnp %}
# Declare an annotation 'foo' which applies to struct and enum types.
annotation foo(struct, enum) :Text;

# Apply 'foo' to MyType.
struct MyType $foo("bar") {
  # ...
}
{% endhighlight %}

The possible targets for an annotation are: `file`, `struct`, `field`, `union`, `enum`, `enumerant`,
`interface`, `method`, `parameter`, `annotation`, `const`.  You may also specify `*` to cover them
all.

{% highlight capnp %}
# 'baz' can annotate anything!
annotation baz(*) :Int32;

$baz(1);  # Annotate the file.

struct MyStruct $baz(2) {
  myField @0 :Text = "default" $baz(3);
  myUnion @1 union $baz(4) {
    # ...
  }
}

enum MyEnum $baz(5) {
  myEnumerant @0 $baz(6);
}

interface MyInterface $baz(7) {
  myMethod @0 (myParam :Text $baz(9)) :Void $baz(8);
}

annotation myAnnotation(struct) :Int32 $baz(10);
const myConst :Int32 = 123 $baz(11);
{% endhighlight %}

`Void` annotations can omit the value.  Struct-typed annotations are also allowed.

{% highlight capnp %}
annotation qux(struct, field) :Void;

struct MyStruct $qux {
  string @0 :Text $qux;
  number @1 :Int32 $qux;
}

annotation corge(file) :MyStruct;

$corge(string = "hello", number = 123);
{% endhighlight %}
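
Returning to the "hide from the public" idea from the start of this section, such a marker could
be declared as a `Void` field annotation. This is only a sketch: the annotation name `hidden` and
the `Account` struct are invented for illustration, and the code that strips hidden fields at run
time is not shown.

{% highlight capnp %}
annotation hidden(field) :Void;

struct Account {
  # Hypothetical struct for illustration.

  displayName @0 :Text;
  passwordHash @1 :Data $hidden;   # To be removed before sending externally.
}
{% endhighlight %}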

### Unique IDs

A Cap'n Proto file must have a unique 64-bit ID, and each type and annotation defined therein may
also have an ID.  Use `capnpc -i` to generate a new ID randomly.  ID specifications begin with `@`:

{% highlight capnp %}
# file ID
@0xdbb9ad1f14bf0b36;

struct Foo @0x8db435604d0d3723 {
  # ...
}

enum Bar @0xb400f69b5334aab3 {
  # ...
}

interface Baz @0xf7141baba3c12691 {
  # ...
}

annotation qux @0xf8a1bedf44c89f00 (field) :Text;
{% endhighlight %}

If you omit the ID for a type or annotation, one will be assigned automatically.  This default
ID is derived by taking the first 8 bytes of the MD5 hash of the parent scope's ID concatenated
with the declaration's name (where the "parent scope" is the file for top-level declarations, or
the outer type for nested declarations).  You can see the automatically-generated IDs by running
`capnpc -v` on a file.  In general, you would only specify an explicit ID for a declaration if that
declaration has been renamed or moved and you want the ID to stay the same for
backwards-compatibility.
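
As a sketch of the rename case, suppose the struct `Foo` from the example above is renamed to
`Widget` (a hypothetical name). Pinning the ID it had before the rename keeps existing references
to it valid. If the original declaration had no explicit ID, you would presumably copy the one
reported by `capnpc -v` before renaming.

{% highlight capnp %}
struct Widget @0x8db435604d0d3723 {
  # Formerly named "Foo".  The explicit ID is the one the struct had under
  # its old name, so the rename does not change it.

  # ...
}
{% endhighlight %}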

IDs exist to provide a relatively short yet unambiguous way to refer to a type or annotation from
another context.  They may be used for representing schemas, for tagging dynamically-typed fields,
etc.  Most languages prefer instead to define a symbolic global namespace e.g. full of "packages",
but this would have some important disadvantages in the context of Cap'n Proto:

* Programmers often feel the need to change symbolic names and organization in order to make their
  code cleaner.
* It's easy for symbolic names to collide, and these collisions could be hard to detect in a large
  distributed system with many different binaries using different versions of protocols.
* Fully-qualified type names may be large and waste space when transmitted on the wire.

Note that IDs are 64-bit (actually, 63-bit, as the first bit is always 1).  Random collisions
are possible, but unlikely -- there would have to be on the order of a billion types before this
becomes a real concern.  Collisions from misuse (e.g. copying an example without changing the ID)
are much more likely.

## Advanced Topics

### Dynamically-typed Fields

A struct may have a field with type `Object`.  This field's value can be of any pointer type -- i.e.
any struct, interface, list, or blob.  This is essentially like a `void*` in C.
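
A minimal sketch of such a field, using a hypothetical `Envelope` struct. The `typeId` field is
just one possible way for the receiver to learn what the payload contains; it is an assumption of
this example, not a built-in mechanism.

{% highlight capnp %}
struct Envelope {
  # Hypothetical wrapper for illustration.

  typeId @0 :UInt64;
  # Unique ID of the payload's type, so the receiver knows how to
  # interpret it.

  payload @1 :Object;
  # May point to any struct, interface, list, or blob.
}
{% endhighlight %}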

### Inlining Structs

Say you have a small struct which you know will never add new fields.  For efficiency, you may want
instances of this struct to be "inlined" into larger structs where it is used.  This saves eight
bytes of space per usage (the size of a pointer) and may improve cache locality.

To inline a struct, you must first declare that it is fixed-width, and specify the sizes of its
data and pointer sections:

{% highlight capnp %}
struct Point16 fixed(4 bytes) {
  x @0 :UInt16;
  y @1 :UInt16;
}

struct Name fixed(2 pointers) {
  first @0 :Text;
  last @1 :Text;
}

struct TextWithHash fixed(8 bytes, 1 pointers) {
  hash @0 :UInt64;
  text @1 :Text;
}
{% endhighlight %}

The compiler will produce an error if the specified size is too small to hold the defined fields,
so if you are unsure how much space you need, simply declare your struct `fixed()` and the compiler
will tell you.

Once you have a fixed-width struct, you must explicitly declare it `Inline` at the usage site:

{% highlight capnp %}
struct Foo {
  a @0 :Point16;          # NOT inlined
  b @1 :Inline(Point16);  # inlined!
}
{% endhighlight %}

### Inlining Lists and Data

You may also inline fixed-length lists and data.

{% highlight capnp %}
struct Foo {
  sha1Hash @0 :InlineData(20);  # 160-bit fixed-width.

  vertex3 @1 :InlineList(Float32, 3);  # x, y, and z coordinates.

  vertexList @2 :List(InlineList(Float32, 3));
  # Much more efficient than List(List(Float32))!
}
{% endhighlight %}

At this time, there is no `InlineText` because text almost always has variable length.

## Evolving Your Protocol

A protocol can be changed in the following ways without breaking backwards-compatibility:

* New types, constants, and aliases can be added anywhere, since they obviously don't affect the
  encoding of any existing type.
* New fields, values, and methods may be added to structs, enums, and interfaces, respectively,
  with the numbering rules described earlier.
* New parameters may be added to a method.  The new parameters must be added to the end of the
  parameter list and must have default values.
* Any symbolic name can be changed, as long as the ordinal numbers stay the same.
* Type definitions can be moved to different scopes.
* A field of type `List(T)`, where `T` is a primitive type, non-inline blob, or
  non-inline list, may be changed to type `List(U)`, where `U` is a struct type whose `@0` field is
  of type `T`.  This rule is useful when you realize too late that you need to attach some extra
  data to each element of your list.  Without this rule, you would be stuck defining parallel
  lists, which are ugly and error-prone.
* A struct that is not already `fixed` can be made `fixed`.  However, once a struct is declared
  `fixed`, the declaration cannot be removed or changed, as this would change the layout of `Inline`
  uses of the struct.
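
The list-upgrade rule is perhaps easiest to see side by side. In this sketch the names `Scores`
and `Entry` are hypothetical, and the two versions of `Scores` are shown together only for
comparison; a real file would contain just one of them.

{% highlight capnp %}
# Before: a plain list of primitives.
struct Scores {
  values @0 :List(UInt32);
}

# After: each element can carry extra data.  This stays compatible because
# the element struct's @0 field has the original element type.
struct Scores {
  values @0 :List(Entry);
}

struct Entry {
  value @0 :UInt32;
  label @1 :Text;    # New per-element data.
}
{% endhighlight %}

Similarly, a parameter might be appended to the `write` method of the `File` interface shown
earlier. The `sync` parameter is invented for this example.

{% highlight capnp %}
interface File {
  # ...

  write @1 (startAt :UInt64, data :Data, sync :Bool = false) :Void;
  # 'sync' was added later: it comes last and has a default value.
}
{% endhighlight %}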

Any other change should be assumed NOT to be safe.  Also, these rules only apply to the Cap'n Proto
native encoding.  It is sometimes useful to transcode Cap'n Proto types to other formats, like
JSON, which may have different rules (e.g., field names cannot change in JSON).