Update FAQ

27e8f781 · miloyip · 1a043e62 · 27e8f781
Commit 27e8f781 authored Mar 25, 2015 by miloyip
Hide whitespace changes
Inline Side-by-side

Showing with 90 additions and 7 deletions

faq.md doc/faq.md +90 -7

No files found.
--- a/doc/faq.md
+++ b/doc/faq.md
@@ -42,11 +42,11 @@

 10. How RapidJSON is tested?

-   RapidJSON contains a unit test suite for automatic testing. [Travis](https://travis-ci.org/miloyip/rapidjson/) will compile and run the unit test suite for all modifications. The test process also uses Valgrind to detect memory leaks.
+   RapidJSON contains a unit test suite for automatic testing. [Travis](https://travis-ci.org/miloyip/rapidjson/)(for Linux) and [AppVeyor](https://ci.appveyor.com/project/miloyip/rapidjson/)(for Windows) will compile and run the unit test suite for all modifications. The test process also uses Valgrind (in Linux) to detect memory leaks.

 11. Is RapidJSON well documented?

-   RapidJSON provides [user guide and API documentationn](http://miloyip.github.io/rapidjson/index.html).
+   RapidJSON provides user guide and API documentationn.

 12. Are there alternatives?

@@ -86,11 +86,11 @@

 4. What is *in situ* parsing?

-   *in situ* parsing decodes the JSON strings directly into the input JSON. This is an optimization which can reduce memory consumption and improve performance, but the input JSON will be modified. Check [in-situ parsing](http://miloyip.github.io/rapidjson/md_doc_dom.html#InSituParsing) for details.
+   *in situ* parsing decodes the JSON strings directly into the input JSON. This is an optimization which can reduce memory consumption and improve performance, but the input JSON will be modified. Check [in-situ parsing](doc/dom.md) for details.

 5. When does parsing generate an error?

-   The parser generates an error when the input JSON contains invalid syntax, or a value can not be represented (a number is too big), or the handler of parsers terminate the parsing. Check [parse error](http://miloyip.github.io/rapidjson/md_doc_dom.html#ParseError) for details.
+   The parser generates an error when the input JSON contains invalid syntax, or a value can not be represented (a number is too big), or the handler of parsers terminate the parsing. Check [parse error](doc/dom.md) for details.

 6. What error information is provided? 

@@ -103,44 +103,127 @@
 ## Document/Value (DOM)

 1. What is move semantics? Why?
+
+   Instead of copy semantics, move semantics is used in `Value`. That means, when assigning a source value to a target value, the ownership of source value is moved to the target value.
+
+   Since moving is faster than copying, this design decision force user to aware of the copying overhead.
+
 2. How to copy a value?
+
+   There are two APIs: constructor with allocator, and `CopyFrom()`. See [Deep Copy Value](doc/tutorial.md) for an example.
+
 3. Why do I need to provide the length of string?
+
+   Since C string is null-terminated, the length of string needed to be computed via `strlen()`, with linear runtime complexity. This incurs an unncessary overhead of many operations, if the user alread know the length of string.
+
+   Also, RapidJSON can handle `\u0000` (null character) within a string. If a string contains null characters, `strlen()` cannot return the true length of it. In such case user must provide the length of string explicitly.
+
 4. Why do I need to provide allocator in many DOM manipulation API?
+
+   Since the APIs are member functions of `Value`, we do not want to save an allocator pointer in every `Value`.
+
 5. Does it convert between numerical types?

+   When using `GetInt()`, `GetUint()`, ... conversion may occur. For integer-to-integer conversion, it only convert when it is safe (otherwise it will assert). However, when converting a 64-bit siggned/unsigned integer to double, it will convert but be aware that it may lose precision. A number with fraction, or an integer larger than 64-bit, can only be obtained by `GetDouble()`.
+
 ## Reader/Writer (SAX)

-1. Why not just `printf` a JSON? Why need a `Writer`? 
-2. Why can't I parse a JSON which is just a number?
-3. Can I pause the parsing process and resume it later?
+1. Why don't we just `printf` a JSON? Why do we need a `Writer`? 
+
+   Most importantly, `Writer` will ensure the output JSON is well-formed. Calling SAX events incorrectly (e.g. `StartObject()` pairing with `EndArray()`) will assert. Besides, `Writer` will escapes strings (e.g., `\n`). Finally, the numeric output of `printf()` may not be a valid JSON number, especially in some locale with digit delimiters. And the number-to-string conversion in `Writer` is implemented with very fast algorithms, which outperforms than `printf()` or `iostream`.
+
+2. Can I pause the parsing process and resume it later?
+
+   This is not directly supported in the current version due to performance consideration. However, if the execution environment supports multi-threading, user can parse a JSON in a separate thread, and pause it by blocking in the input stream.

 ## Unicode

 1. Does it support UTF-8, UTF-16 and other format?
+
+   Yes. It fully support UTF-8, UTF-16 (LE/BE), UTF-32 (LE/BE) and ASCII. 
+
 2. Can it validate the encoding?
+
+   Yes, just pass `kParseValidateEncodingFlag` to `Parse()`. If there is invalid encoding in the stream, it wil generate `kParseErrorStringInvalidEncoding` error.
+
 3. What is surrogate pair? Does RapidJSON support it?
+
+   JSON uses UTF-16 encoding when escaping unicode character, e.g. `\u5927` representing Chinese character "big". To handle characters other than those in basic multilingual plane (BMP), UTF-16 encodes those characters with two 16-bit values, which is called UTF-16 surrogate pair. For example, the Emoji character U+1F602 can be encoded as `\uD83D\uDE02` in JSON.
+
+   RapidJSON fully support parsing/generating UTF-16 surrogates. 
+
 4. Can it handle '\u0000' (null character) in JSON string?
+
+   Yes. RapidJSON fully support null character in JSON string. However, user need to be awared of it and using `GetStringLength()` and related APIs to obtain the true length of string.
+
 5. Can I output '\uxxxx' for all non-ASCII character?

+   Yes, use `ASCII<>` as output encoding template parameter in `Writer` can enforce escaping those characters.
+
 ## Stream

 1. I have a big JSON file. Should I load the whole file to memory?
+
+   User can use `FileReadStream` to read the file chunk-by-chunk. But for *in situ* parsing, the whole file must be loaded.
+
 2. Can I parse JSON while it is streamed from network?
+
+   Yes. User can implement a custom stream for this. Please refer to the implementation of `FileReadStream`.
+
 3. I don't know what format will the JSON be. How to handle them?
+
+   You may use `AutoUTFInputStream` which detects the encoding of input stream automatically. However, it will incur some performance overhead.
+
 4. What is BOM? How RapidJSON handle it?
+
+   [Byte order mark (BOM)](http://en.wikipedia.org/wiki/Byte_order_mark) are sometimes reside at the beginning of file/stream to indiciate the UTF encoding type of it.
+
+   RapidJSON's `EncodedInputStream` can detect/consume BOM. `EncodedOutputStream` can optionally write a BOM. See [Encoded Streams](doc/stream.md) for example.
+
 5. Why little/big endian is related?

+   little/big endian of stream is an issue for UTF-16 and UTF-32 streams, but not UTF-8 stream.
+
 ## Performance

 1. Is RapidJSON really fast?
+
+   Yes. It may be the fastest open source JSON library. There is a [benchmark](https://github.com/miloyip/nativejson-benchmark) for evaluating performance of C/C++ JSON libaries.
+
 2. Why is it fast?
+
+   Many design decisions of RapidJSON is aimed at time/space performance. This somehow reduces user-friendliness of APIs. Besides, it also employs low-level optimizations (intrinsics, SIMD) and custom algorithms (custom double-to-string, string-to-double conversions).
+
 3. What is SIMD? How it is applied in RapidJSON?
+
+   [SIMD](http://en.wikipedia.org/wiki/SIMD) instructions can perform parallel computation in modern CPUs. RapidJSON support Intel's SSE2/SSE4.1 to accelerate whitespace skipping. This improves performance of parsing indent formatted JSON.
+
 4. Does it consume a lot of memory?
+
+   The design of RapidJSON aims at reducing memory footprint.
+
+   In the SAX API, `Reader` consumes memory portional to maximum depth of JSON tree, plus maximum length of JSON string.
+
+   In the DOM API, each `Value` consumes exactly 16/24 bytes for 32/64-bit architecture respectively. RapidJSON also uses a special memory allocator to minimize overhead of allocations.
+
 5. What is the purpose of being high performance?

+   Some applications need to process very large JSON files. Some server-side applications need to process huge amount of JSONs. Being high performance can improve both latency and throuput. In a broad sense, it will also save energy.
+
 ## Gossip

 1. Who are the developers of RapidJSON?
+
+   Milo Yip (@miloyip) is the original author of RapidJSON. Many contributors from the world have improved RapidJSON.  Philipp A. Hartmann (@pah) has implemented a lot of improvements, setting up automatic testing and also involves in a lot of discussions for the community. Don Ding (@thebusytypist) implemented the iterative parser. Andrii Senkovych (@jollyroger) completed the CMake migration. Kosta (@Kosta-Github) provided a very neat short-string optimization. Thank you for all other contributors and community members as well.
+
 2. Why do you develop RapidJSON?
+
+   It was just a hobby project initially in 2011. Milo Yip is a game programmer and he just knew about JSON at that time and would like to apply JSON in future projects. As JSON seems very simple he would like to write a header-only and fast library.
+
 3. Why there is a long empty period of development?
+
+   It is basically due to personal issues, such as getting new family members. Also, Milo Yip has spent a lot of spare time on translating "Game Engine Architecture" by Jason Gregory into Chinese.
+
 4. Why did the repository move from Google Code to GitHub?
+
+   This is the trend. And GitHub is much more powerful and convenient.