---
layout: page
title: Schema Language
---

# Schema Language

Like Protocol Buffers and Thrift (but unlike JSON or MessagePack), Cap'n Proto messages are
strongly-typed and not self-describing. You must define your message structure in a special
language, then invoke the Cap'n Proto compiler (`capnp compile`) to generate source code to
manipulate that message type in your desired language.

For example:

{% highlight capnp %}
@0xdbb9ad1f14bf0b36;  # unique file ID, generated by `capnp id`

struct Person {
  name @0 :Text;
  birthdate @3 :Date;

  email @1 :Text;
  phones @2 :List(PhoneNumber);

  struct PhoneNumber {
    number @0 :Text;
    type @1 :Type;

    enum Type {
      mobile @0;
      home @1;
      work @2;
    }
  }
}

struct Date {
  year @0 :Int16;
  month @1 :UInt8;
  day @2 :UInt8;
}
{% endhighlight %}

Some notes:

* Types come after names. The name is by far the most important thing to see, especially when
  quickly skimming, so we put it up front where it is most visible.  Sorry, C got it wrong.
* The `@N` annotations show how the protocol evolved over time, so that the system can make sure
  to maintain compatibility with older versions. Fields (and enumerants, and interface methods)
  must be numbered consecutively starting from zero in the order in which they were added. In this
  example, it looks like the `birthdate` field was added to the `Person` structure recently -- its
  number is higher than the `email` and `phones` fields. Unlike Protobufs, you cannot skip numbers
  when defining fields -- but there was never any reason to do so anyway.
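
To make the numbering rule concrete, here is a sketch of how `Person` might look after one more
revision.  The `nickname` field is hypothetical and not part of the example above; it is shown only
to illustrate that a new field takes the next unused number regardless of where it is placed.

{% highlight capnp %}
struct Person {
  name @0 :Text;
  birthdate @3 :Date;
  nickname @4 :Text;
  # Hypothetical later addition.  It takes the next unused number, 4,
  # even though it appears before `email` in the source.

  email @1 :Text;
  phones @2 :List(PhoneNumber);

  struct PhoneNumber {
    number @0 :Text;
  }
}

struct Date {
  year @0 :Int16;
  month @1 :UInt8;
  day @2 :UInt8;
}
{% endhighlight %}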

## Language Reference

### Comments

Comments are indicated by hash signs and extend to the end of the line:

{% highlight capnp %}
# This is a comment.
{% endhighlight %}

Comments meant as documentation should appear _after_ the declaration, either on the same line, or
on a subsequent line. Doc comments for aggregate definitions should appear on the line after the
opening brace.

{% highlight capnp %}
struct Date {
  # A standard Gregorian calendar date.

  year @0 :Int16;
  # The year.  Must include the century.
  # Negative value indicates BC.

  month @1 :UInt8;   # Month number, 1-12.
  day @2 :UInt8;     # Day number, 1-31.
}
{% endhighlight %}

Placing the comment _after_ the declaration rather than before makes the code more readable,
especially when doc comments grow long. You almost always need to see the declaration before you
can start reading the comment.

### Built-in Types

The following types are automatically defined:

* **Void:** `Void`
* **Boolean:** `Bool`
* **Integers:** `Int8`, `Int16`, `Int32`, `Int64`
* **Unsigned integers:** `UInt8`, `UInt16`, `UInt32`, `UInt64`
* **Floating-point:** `Float32`, `Float64`
* **Blobs:** `Text`, `Data`
* **Lists:** `List(T)`

Notes:

* The `Void` type has exactly one possible value, and thus can be encoded in zero bits. It is
  rarely used, but can be useful as a union member.
* `Text` is always UTF-8 encoded and NUL-terminated.
* `Data` is a completely arbitrary sequence of bytes.
* `List` is a parameterized type, where the parameter is the element type. For example,
  `List(Int32)`, `List(Person)`, and `List(List(Text))` are all valid.
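
As a quick illustration, the following sketch declares one field for most of the built-in types.
The struct and field names are made up for this example.

{% highlight capnp %}
struct Sample {
  nothing @0 :Void;
  enabled @1 :Bool;
  delta @2 :Int32;
  count @3 :UInt64;
  ratio @4 :Float64;
  label @5 :Text;             # UTF-8 text
  payload @6 :Data;           # arbitrary bytes
  rows @7 :List(List(Text));  # lists may nest
}
{% endhighlight %}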

### Structs

A struct has a set of named, typed fields, numbered consecutively starting from zero.

{% highlight capnp %}
struct Person {
  name @0 :Text;
  email @1 :Text;
}
{% endhighlight %}

Fields can have default values:

{% highlight capnp %}
foo @0 :Int32 = 123;
bar @1 :Text = "blah";
baz @2 :List(Bool) = [ true, false, false, true ];
qux @3 :Person = (name = "Bob", email = "bob@example.com");
corge @4 :Void = void;
{% endhighlight %}

### Unions

A union is two or more fields of a struct which are stored in the same location. Only one of
these fields can be set at a time, and a separate tag is maintained to track which one is
currently set. Unlike in C, unions are not types; they are simply properties of fields. Union
declarations therefore do not look like types.

{% highlight capnp %}
struct Person {
  # ...

  employment :union {
    unemployed @4 :Void;
    employer @5 :Company;
    school @6 :School;
    selfEmployed @7 :Void;
    # We assume that a person is only one of these.
  }
}
{% endhighlight %}

Additionally, unions can be unnamed.  Each struct can contain no more than one unnamed union.  Use
unnamed unions in cases where you would struggle to think of an appropriate name for the union,
because the union represents the main body of the struct.

{% highlight capnp %}
struct Shape {
  area @0 :Float64;

  union {
    circle @1 :Float64;      # radius
    square @2 :Float64;      # width
  }
}
{% endhighlight %}

Notes:

* Union members are numbered in the same number space as fields of the containing struct.
  Remember that the purpose of the numbers is to indicate the evolution order of the
  struct. The system needs to know when the union fields were declared relative to the non-union
  fields.

* Notice that we used the "useless" `Void` type here. We don't have any extra information to store
  for the `unemployed` or `selfEmployed` cases, but we still want the union to distinguish these
  states from others.

* By default, when a struct is initialized, the lowest-numbered field in the union is "set".  If
  you do not want any field set by default, simply declare a field called "unset" and make it the
  lowest-numbered field.

* You can move an existing field into a new union without breaking compatibility with existing
  data, as long as all of the other fields in the union are new.  Since the existing field is
  necessarily the lowest-numbered in the union, it will be the union's default field.
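
The "unset" convention from the notes above might look like the following sketch.  The
`Preferences` struct and its members are hypothetical names chosen for illustration.

{% highlight capnp %}
struct Preferences {
  theme :union {
    unset @0 :Void;
    # Lowest-numbered member, so a freshly initialized struct starts here.

    light @1 :Void;
    dark @2 :Void;
    custom @3 :Text;
  }
}
{% endhighlight %}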

**Wait, why aren't unions first-class types?**

Requiring unions to be declared inside a struct, rather than living as free-standing types, has
some important advantages:

* If unions were first-class types, then union members would clearly have to be numbered separately
  from the containing type's fields.  This means that the compiler, when deciding how to position
  the union in its containing struct, would have to conservatively assume that any kind of new
  field might be added to the union in the future.  To support this, all unions would have to
  be allocated as separate objects embedded by pointer, wasting space.

* A free-standing union would be a liability for protocol evolution, because no additional data
  can be attached to it later on.  Consider, for example, a type which represents a parser token.
  This type is naturally a union: it may be a keyword, identifier, numeric literal, quoted string,
  etc.  So the author defines it as a union, and the type is used widely.  Later on, the developer
  wants to attach information to the token indicating its line and column number in the source
  file.  Unfortunately, this is impossible without updating all users of the type, because the new
  information ought to apply to _all_ token instances, not just specific members of the union.  On
  the other hand, if unions must be embedded within structs, it is always possible to add new
  fields to the struct later on.

* When evolving a protocol it is common to discover that some existing field really should have
  been enclosed in a union, because new fields being added are mutually exclusive with it.  With
  Cap'n Proto's unions, it is actually possible to "retroactively unionize" such a field without
  changing its layout.  This allows you to continue being able to read old data without wasting
  space when writing new data.  This is only possible when unions are declared within their
  containing struct.
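
A minimal sketch of "retroactive unionization", assuming a hypothetical `Contact` struct that
originally declared `email @1 :Text` as an ordinary field:

{% highlight capnp %}
struct Contact {
  name @0 :Text;

  union {
    email @1 :Text;
    # Existed before the union was introduced.  As the lowest-numbered
    # member it is the union's default field.

    phone @2 :Text;
    postalAddress @3 :Text;
    # New, mutually exclusive alternatives.
  }
}
{% endhighlight %}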

Cap'n Proto's unconventional approach to unions provides these advantages without any real
downside:  where you would conventionally define a free-standing union type, in Cap'n Proto you
may simply define a struct type that contains only that union (probably unnamed), and you have
achieved the same effect.  Thus, aside from being slightly unintuitive, it is strictly superior.
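
The parser-token scenario described earlier could be sketched as follows.  All names are
hypothetical; the point is that `line` and `column` can be appended to the struct later without
touching the union.

{% highlight capnp %}
struct Token {
  union {
    keyword @0 :Text;
    identifier @1 :Text;
    numericLiteral @2 :Float64;
    quotedString @3 :Text;
  }

  line @4 :UInt32;
  column @5 :UInt32;
  # Added in a later revision.  These apply to every token, whichever
  # union member is set.
}
{% endhighlight %}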

### Groups

A group is a set of fields that are encapsulated in their own scope.

{% highlight capnp %}
struct Person {
  # ...

  # Note:  This is a terrible way to use groups, and meant
  #   only to demonstrate the syntax.
  address :group {
    houseNumber @8 :UInt32;
    street @9 :Text;
    city @10 :Text;
    country @11 :Text;
  }
}
{% endhighlight %}

Interface-wise, the above group behaves as if you had defined a nested struct called `Address` and
then a field `address :Address`.  However, a group is _not_ a separate object from its containing
struct: the fields are numbered in the same space as the containing struct's fields, and are laid
out exactly the same as if they hadn't been grouped at all.  Essentially, a group is just a
namespace.

Groups on their own (as in the above example) are useless, almost as much so as the `Void` type.
They become interesting when used together with unions.

{% highlight capnp %}
struct Shape {
  area @0 :Float64;

  union {
    circle :group {
      radius @1 :Float64;
    }
    rectangle :group {
      width @2 :Float64;
      height @3 :Float64;
    }
  }
}
{% endhighlight %}

There are two main reasons to use groups with unions:

1. They are often more self-documenting.  Notice that `radius` is now a member of `circle`, so
   we don't need a comment to explain that the value of `circle` is its radius.
2. You can add additional members later on, without breaking compatibility.  Notice how we upgraded
   `square` to `rectangle` above, adding a `height` field.  This definition is actually
   wire-compatible with the previous version of the `Shape` example from the "union" section
   (aside from the fact that `height` will always be zero when reading old data -- hey, it's not
   a perfect example).  In real-world use, it is common to realize after the fact that you need to
   add some information to a struct that only applies when one particular union field is set.
   Without the ability to upgrade to a group, you would have to define the new field separately,
   and have it waste space when not relevant.

Note that a named union is actually exactly equivalent to a named group containing an unnamed
union.
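
To illustrate that equivalence, the two hypothetical structs below are meant to declare the same
thing in both spellings.  The struct names exist only so that both forms can appear side by side.

{% highlight capnp %}
struct WithNamedUnion {
  employment :union {
    unemployed @0 :Void;
    employer @1 :Text;
  }
}

struct WithGroupedUnion {
  employment :group {
    union {
      unemployed @0 :Void;
      employer @1 :Text;
    }
  }
}
{% endhighlight %}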

**Wait, weren't groups considered a misfeature in Protobufs?  Why did you do this again?**

They are useful in unions, which Protobufs did not have.  Meanwhile, you cannot have a "repeated
group" in Cap'n Proto, which was the case that got into the most trouble with Protobufs.

### Dynamically-typed Fields

A struct may have a field with type `AnyPointer`.  This field's value can be of any pointer type --
i.e. any struct, interface, list, or blob.  This is essentially like a `void*` in C.
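
A minimal sketch of such a field follows.  The `Envelope` struct is hypothetical, and pairing the
pointer with a `typeId` tag is just one possible convention, not something the language requires.

{% highlight capnp %}
struct Envelope {
  typeId @0 :UInt64;
  # Hypothetical convention: the unique ID of the type stored in `payload`.

  payload @1 :AnyPointer;
  # May point to any struct, interface, list, or blob.
}
{% endhighlight %}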

### Enums

An enum is a type with a small finite set of symbolic values.

{% highlight capnp %}
enum Rfc3092Variable {
  foo @0;
  bar @1;
  baz @2;
  qux @3;
  # ...
}
{% endhighlight %}

Like fields, enumerants must be numbered sequentially starting from zero. In languages where
enums have numeric values, these numbers will be used, but in general Cap'n Proto enums should not
be considered numeric.

### Interfaces

An interface has a collection of methods, each of which takes some parameters and returns some
results.  Like struct fields, methods are numbered.  Interfaces support inheritance, including
multiple inheritance.

{% highlight capnp %}
interface Node {
  isDirectory @0 () -> (result :Bool);
}

interface Directory extends(Node) {
  list @0 () -> (list :List(Entry));
  struct Entry {
    name @0 :Text;
    node @1 :Node;
  }

  create @1 (name :Text) -> (file :File);
  mkdir @2 (name :Text) -> (directory :Directory);
  open @3 (name :Text) -> (node :Node);
  delete @4 (name :Text);
  link @5 (name :Text, node :Node);
}

interface File extends(Node) {
  size @0 () -> (size :UInt64);
  read @1 (startAt :UInt64 = 0, amount :UInt64 = 0xffffffffffffffff)
       -> (data :Data);
  # Default params = read entire file.

  write @2 (startAt :UInt64, data :Data);
  truncate @3 (size :UInt64);
}
{% endhighlight %}

Notice something interesting here: `Node`, `Directory`, and `File` are interfaces, but several
methods take these types as parameters or return them as results.  `Directory.Entry` is a struct,
but it contains a `Node`, which is an interface.  Structs (and primitive types) are passed over RPC
by value, but interfaces are passed by reference. So when `Directory.list` is called remotely, the
content of a `List(Entry)` (including the text of each `name`) is transmitted back, but for the
`node` field, only a reference to some remote `Node` object is sent.

When a reference to an object is transmitted, the RPC system automatically ensures that
the recipient gets permission to call the referenced object -- because if the recipient wasn't
meant to have access, the sender shouldn't have sent the reference in the first place. This makes
it very easy to develop secure protocols with Cap'n Proto -- you almost don't need to think about
access control at all. This feature is what makes Cap'n Proto a "capability-based" RPC system -- a
reference to an object inherently represents a "capability" to access it.
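
As a sketch of how this can shape a protocol, consider the hypothetical interfaces below.  Access
to a `Session` is granted simply by returning a reference to it, so no separate permission check
needs to appear in the schema.

{% highlight capnp %}
interface Session {
  whoAmI @0 () -> (user :Text);
  logout @1 ();
}

interface Login {
  authenticate @0 (user :Text, password :Text) -> (session :Session);
  # Only a caller that authenticates receives a reference to a `Session`.
  # Holding that reference is what grants the ability to call it.
}
{% endhighlight %}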

### Constants

You can define constants in Cap'n Proto.  These don't affect what is sent on the wire, but they
will be included in the generated code, and can be [evaluated using the `capnp`
tool](capnp-tool.html#evaluating_constants).

{% highlight capnp %}
const pi :Float32 = 3.14159;
const bob :Person = (name = "Bob", email = "bob@example.com");
{% endhighlight %}

Additionally, you may refer to a constant inside another value (e.g. another constant, or a default
value of a field).

{% highlight capnp %}
const foo :Int32 = 123;
const bar :Text = "Hello";
const baz :SomeStruct = (id = .foo, message = .bar);
{% endhighlight %}

Note that when substituting a constant into another value, the constant's name must be qualified
with its scope.  E.g. if a constant `qux` is declared nested in a type `Corge`, it would need to
be referenced as `Corge.qux` rather than just `qux`, even when used within the `Corge` scope.
Constants declared at the top-level scope are prefixed just with `.`.  This rule helps to make it
clear that the name refers to a user-defined constant, rather than a literal value (like `true` or
`inf`) or an enum value.
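
The qualification rule might be exercised as in this sketch.  The constant `limit` and the fields
of `Corge` are hypothetical.

{% highlight capnp %}
const limit :UInt32 = 100;

struct Corge {
  const qux :Int32 = 42;

  value @0 :Int32 = Corge.qux;
  # Qualified with `Corge.` even though we are inside `Corge`.

  maxResults @1 :UInt32 = .limit;
  # Top-level constants are prefixed with `.`.
}
{% endhighlight %}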

### Nesting, Scope, and Aliases

You can nest constant, alias, and type definitions inside structs and interfaces (but not enums).
This has no effect on any definition involved except to define the scope of its name. So in Java
terms, inner classes are always "static". To name a nested type from another scope, separate the
path with `.`s.

{% highlight capnp %}
struct Foo {
  struct Bar {
    #...
  }
  bar @0 :Bar;
}

struct Baz {
  bar @0 :Foo.Bar;
}
{% endhighlight %}

If typing long scopes becomes cumbersome, you can use `using` to declare an alias.

{% highlight capnp %}
struct Qux {
  using Foo.Bar;
  bar @0 :Bar;
}

struct Corge {
  using T = Foo.Bar;
  bar @0 :T;
}
{% endhighlight %}

### Imports

An `import` expression names the scope of some other file:

{% highlight capnp %}
struct Foo {
  # Use type "Baz" defined in bar.capnp.
  baz @0 :import "bar.capnp".Baz;
}
{% endhighlight %}

Of course, typically it's more readable to define an alias:

{% highlight capnp %}
using Bar = import "bar.capnp";

struct Foo {
  # Use type "Baz" defined in bar.capnp.
  baz @0 :Bar.Baz;
}
{% endhighlight %}

Or even:

{% highlight capnp %}
using import "bar.capnp".Baz;

struct Foo {
  baz @0 :Baz;
}
{% endhighlight %}

The above imports specify relative paths.  If the path begins with a `/`, it is absolute -- in
this case, the `capnp` tool searches for the file in each of the search path directories specified
with `-I`.
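
An absolute import might look like the sketch below.  The file layout, the `Timestamp` type, and
the directory passed to `-I` are all assumptions made for the example.

{% highlight capnp %}
using Common = import "/shared/common.capnp";
# Absolute path: searched for under each `-I` directory.  For example, with
# `capnp compile -Isrc -ocapnp src/app/request.capnp` the compiler would
# look for src/shared/common.capnp.

struct Request {
  sentAt @0 :Common.Timestamp;
  # `Timestamp` is assumed to be declared in common.capnp.
}
{% endhighlight %}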

### Annotations

Sometimes you want to attach extra information to parts of your protocol that isn't part of the
Cap'n Proto language.  This information might control details of a particular code generator, or
you might even read it at run time to assist in some kind of dynamic message processing.  For
example, you might create a field annotation which means "hide from the public", and when you send
a message to an external user, you might invoke some code first that iterates over your message and
removes all of these hidden fields.

You may declare annotations and use them like so:

{% highlight capnp %}
# Declare an annotation 'foo' which applies to struct and enum types.
annotation foo(struct, enum) :Text;

# Apply 'foo' to MyType.
struct MyType $foo("bar") {
  # ...
}
{% endhighlight %}

The possible targets for an annotation are: `file`, `struct`, `field`, `union`, `enum`, `enumerant`,
`interface`, `method`, `parameter`, `annotation`, `const`.  You may also specify `*` to cover them
all.

{% highlight capnp %}
# 'baz' can annotate anything!
annotation baz(*) :Int32;

$baz(1);  # Annotate the file.

struct MyStruct $baz(2) {
  myField @0 :Text = "default" $baz(3);
  myUnion :union $baz(4) {
    # ...
  }
}

enum MyEnum $baz(5) {
  myEnumerant @0 $baz(6);
}

interface MyInterface $baz(7) {
  myMethod @0 (myParam :Text $baz(9)) -> () $baz(8);
}

annotation myAnnotation(struct) :Int32 $baz(10);
const myConst :Int32 = 123 $baz(11);
{% endhighlight %}

`Void` annotations can omit the value.  Struct-typed annotations are also allowed.  Tip:  If
you want an annotation to have a default value, declare it as a struct with a single field with
a default value.

{% highlight capnp %}
annotation qux(struct, field) :Void;

struct MyStruct $qux {
  string @0 :Text $qux;
  number @1 :Int32 $qux;
}

annotation corge(file) :MyStruct;

$corge(string = "hello", number = 123);

struct Grault {
  value @0 :Int32 = 123;
}

annotation grault(file) :Grault;

$grault();  # value defaults to 123
$grault(value = 456);
{% endhighlight %}
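
The "hide from the public" idea mentioned at the start of this section could be declared roughly
like this.  The annotation name `hidden` and the `Account` struct are hypothetical, and the code
that strips the fields would live in your application, not in the schema.

{% highlight capnp %}
annotation hidden(field) :Void;
# Hypothetical marker meaning "hide from the public".

struct Account {
  displayName @0 :Text;

  passwordHash @1 :Data $hidden;
  # Application code could remove fields carrying `$hidden` before
  # sending the message to an external user.
}
{% endhighlight %}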

### Unique IDs

A Cap'n Proto file must have a unique 64-bit ID, and each type and annotation defined therein may
also have an ID.  Use `capnp id` to generate a new ID randomly.  ID specifications begin with `@`:

{% highlight capnp %}
# file ID
@0xdbb9ad1f14bf0b36;

struct Foo @0x8db435604d0d3723 {
  # ...
}

enum Bar @0xb400f69b5334aab3 {
  # ...
}

interface Baz @0xf7141baba3c12691 {
  # ...
}

annotation qux @0xf8a1bedf44c89f00 (field) :Text;
{% endhighlight %}

If you omit the ID for a type or annotation, one will be assigned automatically.  This default
ID is derived by taking the first 8 bytes of the MD5 hash of the parent scope's ID concatenated
with the declaration's name (where the "parent scope" is the file for top-level declarations, or
the outer type for nested declarations).  You can see the automatically-generated IDs by "compiling"
your file with the `-ocapnp` flag, which echoes the schema back to the terminal annotated with
extra information, e.g. `capnp compile -ocapnp myschema.capnp`.  In general, you would only specify
an explicit ID for a declaration if that declaration has been renamed or moved and you want the ID
to stay the same for backwards-compatibility.
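
For instance, a rename that pins the old ID might look like the following sketch.  The type names
are hypothetical and the ID shown is only a placeholder.

{% highlight capnp %}
struct Client @0x8db435604d0d3723 {
  # Formerly `struct Customer`, declared without an explicit ID.
  # The ID above is a placeholder: use the value that
  # `capnp compile -ocapnp` reported for `Customer` before the rename.

  name @0 :Text;
}
{% endhighlight %}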

IDs exist to provide a relatively short yet unambiguous way to refer to a type or annotation from
another context.  They may be used for representing schemas, for tagging dynamically-typed fields,
etc.  Most languages prefer instead to define a symbolic global namespace e.g. full of "packages",
but this would have some important disadvantages in the context of Cap'n Proto:

* Programmers often feel the need to change symbolic names and organization in order to make their
  code cleaner, but the renamed code should still work with existing encoded data.
* It's easy for symbolic names to collide, and these collisions could be hard to detect in a large
  distributed system with many different binaries using different versions of protocols.
* Fully-qualified type names may be large and waste space when transmitted on the wire.

Note that IDs are 64-bit (actually, 63-bit, as the first bit is always 1).  Random collisions
are possible, but unlikely -- there would have to be on the order of a billion types before this
becomes a real concern.  Collisions from misuse (e.g. copying an example without changing the ID)
are much more likely.

## Evolving Your Protocol

A protocol can be changed in the following ways without breaking backwards-compatibility:

* New types, constants, and aliases can be added anywhere, since they obviously don't affect the
  encoding of any existing type.
* New fields, enumerants, and methods may be added to structs, enums, and interfaces, respectively,
  as long as each new member's number is larger than all previous members.  Similarly, new fields
  may be added to existing groups and unions.
* New parameters may be added to a method.  The new parameters must be added to the end of the
  parameter list and must have default values.
* Members can be re-arranged in the source code, so long as their numbers stay the same.
* Any symbolic name can be changed, as long as the type ID / ordinal numbers stay the same.  Note
  that type declarations have an implicit ID generated based on their name and parent's ID, but
  you can use `capnp compile -ocapnp myschema.capnp` to find out what that number is, and then
  declare it explicitly after your rename.
* Type definitions can be moved to different scopes, as long as the type ID is declared
  explicitly.
* A field of type `List(T)`, where `T` is a primitive type, blob, or list, may be changed to type
  `List(U)`, where `U` is a struct type whose `@0` field is of type `T`.  This rule is useful when
  you realize too late that you need to attach some extra data to each element of your list.
  Without this rule, you would be stuck defining parallel lists, which are ugly and error-prone.
* A field can be moved into a group or a union, as long as the group/union and all other fields
  within it are new.  In other words, a field can be replaced with a group or union containing an
  equivalent field and some new fields.
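
Two of the less obvious rules above, the `List(T)` upgrade and appending a method parameter, can
be sketched as follows.  `Playlist`, `Track`, and `Search` are hypothetical; the comments record
what each declaration is assumed to have looked like in the earlier revision.

{% highlight capnp %}
struct Playlist {
  tracks @0 :List(Track);
  # Originally `tracks @0 :List(Text)`.

  struct Track {
    title @0 :Text;
    # Same type as the old list element, as the rule requires.

    durationSeconds @1 :UInt32;
    # New per-element data.
  }
}

interface Search {
  query @0 (text :Text, limit :UInt32 = 10) -> (results :List(Text));
  # Originally `query @0 (text :Text)`.  `limit` was appended to the end
  # of the parameter list with a default value.
}
{% endhighlight %}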

Any other change should be assumed NOT to be safe.  In particular:

* You cannot change a field, method, or enumerant's number.
* You cannot change a field or method parameter's type or default value, except as described above.
* You cannot change a type's ID.
* You cannot change the name of a type that doesn't have an explicit ID, as the implicit ID is
  generated based in part on the type name.
* You cannot move a type to a different scope or file unless it has an explicit ID, as the implicit
  ID is based in part on the scope's ID.
* You cannot move an existing field into or out of an existing union, nor can you form a new union
  containing more than one existing field.

Also, these rules only apply to the Cap'n Proto native encoding.  It is sometimes useful to
transcode Cap'n Proto types to other formats, like JSON, which may have different rules (e.g.,
field names cannot change in JSON).