Commit 9b63f39c authored by Milo Yip's avatar Milo Yip

Merge pull request #280 from miloyip/zh-cn

Localize documents to simplified Chinese
parents f5b021d7 d1a345a6
This source diff could not be displayed because it is too large. You can view the blob instead.
digraph {
compound=true
fontname="Inconsolata, Consolas"
fontsize=10
margin="0,0"
ranksep=0.2
nodesep=0.5
penwidth=0.5
colorscheme=spectral7
node [shape=box, fontname="Inconsolata, Consolas", fontsize=10, penwidth=0.5, style=filled, fillcolor=white]
edge [fontname="Inconsolata, Consolas", fontsize=10, penwidth=0.5]
subgraph cluster1 {
margin="10,10"
labeljust="left"
label = "SAX"
style=filled
fillcolor=6
Reader -> Writer [style=invis]
}
subgraph cluster2 {
margin="10,10"
labeljust="left"
label = "DOM"
style=filled
fillcolor=7
Value
Document
}
Handler [label="<<concept>>\nHandler"]
{
edge [arrowtail=onormal, dir=back]
Value -> Document
Handler -> Document
Handler -> Writer
}
{
edge [arrowhead=vee, style=dashed, constraint=false]
Reader -> Handler [label="calls"]
Value -> Handler [label="calls"]
Document -> Reader [label="uses"]
}
}
\ No newline at end of file
digraph {
rankdir=LR
compound=true
fontname="Inconsolata, Consolas"
fontsize=10
margin="0,0"
ranksep=0.3
nodesep=0.15
penwidth=0.5
colorscheme=spectral7
node [shape=box, fontname="Inconsolata, Consolas", fontsize=10, penwidth=0.5, style=filled, fillcolor=white]
edge [fontname="Inconsolata, Consolas", fontsize=10, penwidth=0.5]
subgraph cluster0 {
style=filled
fillcolor=4
Encoding [label="<<concept>>\nEncoding"]
edge [arrowtail=onormal, dir=back]
Encoding -> { UTF8; UTF16; UTF32; ASCII; AutoUTF }
UTF16 -> { UTF16LE; UTF16BE }
UTF32 -> { UTF32LE; UTF32BE }
}
subgraph cluster1 {
style=filled
fillcolor=5
Stream [label="<<concept>>\nStream"]
InputByteStream [label="<<concept>>\nInputByteStream"]
OutputByteStream [label="<<concept>>\nOutputByteStream"]
edge [arrowtail=onormal, dir=back]
Stream -> {
StringStream; InsituStringStream; StringBuffer;
EncodedInputStream; EncodedOutputStream;
AutoUTFInputStream; AutoUTFOutputStream
InputByteStream; OutputByteStream
}
InputByteStream -> { MemoryStream; FlieReadStream }
OutputByteStream -> { MemoryBuffer; FileWriteStream }
}
subgraph cluster2 {
style=filled
fillcolor=3
Allocator [label="<<concept>>\nAllocator"]
edge [arrowtail=onormal, dir=back]
Allocator -> { CrtAllocator; MemoryPoolAllocator }
}
{
edge [arrowtail=odiamond, arrowhead=vee, dir=both]
EncodedInputStream -> InputByteStream
EncodedOutputStream -> OutputByteStream
AutoUTFInputStream -> InputByteStream
AutoUTFOutputStream -> OutputByteStream
MemoryPoolAllocator -> Allocator [label="base", tailport=s]
}
{
edge [arrowhead=vee, style=dashed]
AutoUTFInputStream -> AutoUTF
AutoUTFOutputStream -> AutoUTF
}
//UTF32LE -> Stream [style=invis]
}
\ No newline at end of file
This diff is collapsed.
# 编码
根据[ECMA-404](http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf)
> (in Introduction) JSON text is a sequence of Unicode code points.
>
> 翻译:JSON文本是Unicode码点的序列。
较早的[RFC4627](http://www.ietf.org/rfc/rfc4627.txt)申明:
> (in §3) JSON text SHALL be encoded in Unicode. The default encoding is UTF-8.
>
> 翻译:JSON文本应该以Unicode编码。缺省的编码为UTF-8。
> (in §6) JSON may be represented using UTF-8, UTF-16, or UTF-32. When JSON is written in UTF-8, JSON is 8bit compatible. When JSON is written in UTF-16 or UTF-32, the binary content-transfer-encoding must be used.
>
> 翻译:JSON可使用UTF-8、UTF-16或UTF-18表示。当JSON以UTF-8写入,该JSON是8位兼容的。当JSON以UTF-16或UTF-32写入,就必须使用二进制的内容传送编码。
RapidJSON支持多种编码。它也能检查JSON的编码,以及在不同编码中进行转码。所有这些功能都是在内部实现,无需使用外部的程序库(如[ICU](http://site.icu-project.org/))。
[TOC]
# Unicode {#Unicode}
根据 [Unicode的官方网站](http://www.unicode.org/standard/translations/t-chinese.html)
>Unicode给每个字符提供了一个唯一的数字,
不论是什么平台、
不论是什么程序、
不论是什么语言。
这些唯一数字称为码点(code point),其范围介乎`0x0``0x10FFFF`之间。
## Unicode转换格式 {#UTF}
存储Unicode码点有多种编码方式。这些称为Unicode转换格式(Unicode Transformation Format, UTF)。RapidJSON支持最常用的UTF,包括:
* UTF-8:8位可变长度编码。它把一个码点映射至1至4个字节。
* UTF-16:16位可变长度编码。它把一个码点映射至1至2个16位编码单元(即2至4个字节)。
* UTF-32:32位固定长度编码。它直接把码点映射至单个32位编码单元(即4字节)。
对于UTF-16及UTF-32来说,字节序(endianness)是有影响的。在内存中,它们通常都是以该计算机的字节序来存储。然而,当要储存在文件中或在网上传输,我们需要指明字节序列的字节序,是小端(little endian, LE)还是大端(big-endian, BE)。
RapidJSON通过`rapidjson/encodings.h`中的struct去提供各种编码:
~~~~~~~~~~cpp
namespace rapidjson {
template<typename CharType = char>
struct UTF8;
template<typename CharType = wchar_t>
struct UTF16;
template<typename CharType = wchar_t>
struct UTF16LE;
template<typename CharType = wchar_t>
struct UTF16BE;
template<typename CharType = unsigned>
struct UTF32;
template<typename CharType = unsigned>
struct UTF32LE;
template<typename CharType = unsigned>
struct UTF32BE;
} // namespace rapidjson
~~~~~~~~~~
对于在内存中的文本,我们正常会使用`UTF8``UTF16``UTF32`。对于处理经过I/O的文本,我们可使用`UTF8``UTF16LE``UTF16BE``UTF32LE``UTF32BE`
当使用DOM风格的API,`GenericValue<Encoding>``GenericDocument<Encoding>`里的`Encoding`模板参数是用于指明内存中存储的JSON字符串使用哪种编码。因此通常我们会在此参数中使用`UTF8``UTF16``UTF32`。如何选择,视乎应用软件所使用的操作系统及其他程序库。例如,Windows API使用UTF-16表示Unicode字符,而多数的Linux发行版本及应用软件则更喜欢UTF-8。
使用UTF-16的DOM声明例子:
~~~~~~~~~~cpp
typedef GenericDocument<UTF16<> > WDocument;
typedef GenericValue<UTF16<> > WValue;
~~~~~~~~~~
可以在[DOM's Encoding](doc/stream.md#Encoding)一节看到更详细的使用例子。
## 字符类型 {#CharacterType}
从之前的声明中可以看到,每个编码都有一个`CharType`模板参数。这可能比较容易混淆,实际上,每个`CharType`存储一个编码单元,而不是一个字符(码点)。如之前所谈及,在UTF-8中一个码点可能会编码成1至4个编码单元。
对于`UTF16(LE|BE)``UTF32(LE|BE)`来说,`CharType`必须分别是一个至少2及4字节的整数类型。
注意C++11新添了`char16_t``char32_t`类型,也可分别用于`UTF16``UTF32`
## AutoUTF {#AutoUTF}
上述所介绍的编码都是在编译期静态挷定的。换句话说,使用者必须知道内存或流之中使用了哪种编码。然而,有时候我们可能需要读写不同编码的文件,而且这些编码需要在运行时才能决定。
`AutoUTF`是为此而设计的编码。它根据输入或输出流来选择使用哪种编码。目前它应该与`EncodedInputStream``EncodedOutputStream`结合使用。
## ASCII {#ASCII}
虽然JSON标准并未提及[ASCII](http://en.wikipedia.org/wiki/ASCII),有时候我们希望写入7位的ASCII JSON,以供未能处理UTF-8的应用程序使用。由于任JSON都可以把Unicode字符表示为`\uXXXX`转义序列,JSON总是可用ASCII来编码。
以下的例子把UTF-8的DOM写成ASCII的JSON:
~~~~~~~~~~cpp
using namespace rapidjson;
Document d; // UTF8<>
// ...
StringBuffer buffer;
Writer<StringBuffer, Document::EncodingType, ASCII<> > writer(buffer);
d.Accept(writer);
std::cout << buffer.GetString();
~~~~~~~~~~
ASCII可用于输入流。当输入流包含大于127的字节,就会导致`kParseErrorStringInvalidEncoding`错误。
ASCII *不能* 用于内存(`Document`的编码,或`Reader`的目标编码),因为它不能表示Unicode码点。
# 校验及转码 {#ValidationTranscoding}
当RapidJSON解析一个JSON时,它能校验输入JSON,判断它是否所标明编码的合法序列。要开启此选项,请把`kParseValidateEncodingFlag`加入`parseFlags`模板参数。
若输入编码和输出编码并不相同,`Reader``Writer`会算把文本转码。在这种情况下,并不需要`kParseValidateEncodingFlag`,因为它必须解码输入序列。若序列不能被解码,它必然是不合法的。
## 转码器 {#Transcoder}
虽然RapidJSON的编码功能是为JSON解析/生成而设计,使用者也可以“滥用”它们来为非JSON字符串转码。
以下的例子把UTF-8字符串转码成UTF-16:
~~~~~~~~~~cpp
#include "rapidjson/encodings.h"
using namespace rapidjson;
const char* s = "..."; // UTF-8 string
StringStream source(s);
GenericStringBuffer<UTF16<> > target;
bool hasError = false;
while (source.Peak() != '\0')
if (!Transcoder::Transcode<UTF8<>, UTF16<> >(source, target)) {
hasError = true;
break;
}
if (!hasError) {
const wchar_t* t = target.GetString();
// ...
}
~~~~~~~~~~
你也可以用`AutoUTF`及对应的流来在运行时设置内源/目的之编码。
...@@ -42,11 +42,11 @@ ...@@ -42,11 +42,11 @@
10. How RapidJSON is tested? 10. How RapidJSON is tested?
RapidJSON contains a unit test suite for automatic testing. [Travis](https://travis-ci.org/miloyip/rapidjson/) will compile and run the unit test suite for all modifications. The test process also uses Valgrind to detect memory leaks. RapidJSON contains a unit test suite for automatic testing. [Travis](https://travis-ci.org/miloyip/rapidjson/)(for Linux) and [AppVeyor](https://ci.appveyor.com/project/miloyip/rapidjson/)(for Windows) will compile and run the unit test suite for all modifications. The test process also uses Valgrind (in Linux) to detect memory leaks.
11. Is RapidJSON well documented? 11. Is RapidJSON well documented?
RapidJSON provides [user guide and API documentationn](http://miloyip.github.io/rapidjson/index.html). RapidJSON provides user guide and API documentationn.
12. Are there alternatives? 12. Are there alternatives?
...@@ -86,11 +86,11 @@ ...@@ -86,11 +86,11 @@
4. What is *in situ* parsing? 4. What is *in situ* parsing?
*in situ* parsing decodes the JSON strings directly into the input JSON. This is an optimization which can reduce memory consumption and improve performance, but the input JSON will be modified. Check [in-situ parsing](http://miloyip.github.io/rapidjson/md_doc_dom.html#InSituParsing) for details. *in situ* parsing decodes the JSON strings directly into the input JSON. This is an optimization which can reduce memory consumption and improve performance, but the input JSON will be modified. Check [in-situ parsing](doc/dom.md) for details.
5. When does parsing generate an error? 5. When does parsing generate an error?
The parser generates an error when the input JSON contains invalid syntax, or a value can not be represented (a number is too big), or the handler of parsers terminate the parsing. Check [parse error](http://miloyip.github.io/rapidjson/md_doc_dom.html#ParseError) for details. The parser generates an error when the input JSON contains invalid syntax, or a value can not be represented (a number is too big), or the handler of parsers terminate the parsing. Check [parse error](doc/dom.md) for details.
6. What error information is provided? 6. What error information is provided?
...@@ -103,44 +103,127 @@ ...@@ -103,44 +103,127 @@
## Document/Value (DOM) ## Document/Value (DOM)
1. What is move semantics? Why? 1. What is move semantics? Why?
Instead of copy semantics, move semantics is used in `Value`. That means, when assigning a source value to a target value, the ownership of source value is moved to the target value.
Since moving is faster than copying, this design decision forces user to aware of the copying overhead.
2. How to copy a value? 2. How to copy a value?
There are two APIs: constructor with allocator, and `CopyFrom()`. See [Deep Copy Value](doc/tutorial.md) for an example.
3. Why do I need to provide the length of string? 3. Why do I need to provide the length of string?
4. Why do I need to provide allocator in many DOM manipulation API?
Since C string is null-terminated, the length of string needs to be computed via `strlen()`, with linear runtime complexity. This incurs an unncessary overhead of many operations, if the user already knows the length of string.
Also, RapidJSON can handle `\u0000` (null character) within a string. If a string contains null characters, `strlen()` cannot return the true length of it. In such case user must provide the length of string explicitly.
4. Why do I need to provide allocator parameter in many DOM manipulation API?
Since the APIs are member functions of `Value`, we do not want to save an allocator pointer in every `Value`.
5. Does it convert between numerical types? 5. Does it convert between numerical types?
When using `GetInt()`, `GetUint()`, ... conversion may occur. For integer-to-integer conversion, it only convert when it is safe (otherwise it will assert). However, when converting a 64-bit signed/unsigned integer to double, it will convert but be aware that it may lose precision. A number with fraction, or an integer larger than 64-bit, can only be obtained by `GetDouble()`.
## Reader/Writer (SAX) ## Reader/Writer (SAX)
1. Why not just `printf` a JSON? Why need a `Writer`? 1. Why don't we just `printf` a JSON? Why do we need a `Writer`?
2. Why can't I parse a JSON which is just a number?
3. Can I pause the parsing process and resume it later? Most importantly, `Writer` will ensure the output JSON is well-formed. Calling SAX events incorrectly (e.g. `StartObject()` pairing with `EndArray()`) will assert. Besides, `Writer` will escapes strings (e.g., `\n`). Finally, the numeric output of `printf()` may not be a valid JSON number, especially in some locale with digit delimiters. And the number-to-string conversion in `Writer` is implemented with very fast algorithms, which outperforms than `printf()` or `iostream`.
2. Can I pause the parsing process and resume it later?
This is not directly supported in the current version due to performance consideration. However, if the execution environment supports multi-threading, user can parse a JSON in a separate thread, and pause it by blocking in the input stream.
## Unicode ## Unicode
1. Does it support UTF-8, UTF-16 and other format? 1. Does it support UTF-8, UTF-16 and other format?
Yes. It fully support UTF-8, UTF-16 (LE/BE), UTF-32 (LE/BE) and ASCII.
2. Can it validate the encoding? 2. Can it validate the encoding?
Yes, just pass `kParseValidateEncodingFlag` to `Parse()`. If there is invalid encoding in the stream, it wil generate `kParseErrorStringInvalidEncoding` error.
3. What is surrogate pair? Does RapidJSON support it? 3. What is surrogate pair? Does RapidJSON support it?
4. Can it handle '\u0000' (null character) in JSON string?
5. Can I output '\uxxxx' for all non-ASCII character? JSON uses UTF-16 encoding when escaping unicode character, e.g. `\u5927` representing Chinese character "big". To handle characters other than those in basic multilingual plane (BMP), UTF-16 encodes those characters with two 16-bit values, which is called UTF-16 surrogate pair. For example, the Emoji character U+1F602 can be encoded as `\uD83D\uDE02` in JSON.
RapidJSON fully support parsing/generating UTF-16 surrogates.
4. Can it handle `\u0000` (null character) in JSON string?
Yes. RapidJSON fully support null character in JSON string. However, user need to be aware of it and using `GetStringLength()` and related APIs to obtain the true length of string.
5. Can I output `\uxxxx` for all non-ASCII character?
Yes, use `ASCII<>` as output encoding template parameter in `Writer` can enforce escaping those characters.
## Stream ## Stream
1. I have a big JSON file. Should I load the whole file to memory? 1. I have a big JSON file. Should I load the whole file to memory?
User can use `FileReadStream` to read the file chunk-by-chunk. But for *in situ* parsing, the whole file must be loaded.
2. Can I parse JSON while it is streamed from network? 2. Can I parse JSON while it is streamed from network?
3. I don't know what format will the JSON be. How to handle them?
Yes. User can implement a custom stream for this. Please refer to the implementation of `FileReadStream`.
3. I don't know what encoding will the JSON be. How to handle them?
You may use `AutoUTFInputStream` which detects the encoding of input stream automatically. However, it will incur some performance overhead.
4. What is BOM? How RapidJSON handle it? 4. What is BOM? How RapidJSON handle it?
[Byte order mark (BOM)](http://en.wikipedia.org/wiki/Byte_order_mark) sometimes reside at the beginning of file/stream to indiciate the UTF encoding type of it.
RapidJSON's `EncodedInputStream` can detect/consume BOM. `EncodedOutputStream` can optionally write a BOM. See [Encoded Streams](doc/stream.md) for example.
5. Why little/big endian is related? 5. Why little/big endian is related?
little/big endian of stream is an issue for UTF-16 and UTF-32 streams, but not UTF-8 stream.
## Performance ## Performance
1. Is RapidJSON really fast? 1. Is RapidJSON really fast?
Yes. It may be the fastest open source JSON library. There is a [benchmark](https://github.com/miloyip/nativejson-benchmark) for evaluating performance of C/C++ JSON libaries.
2. Why is it fast? 2. Why is it fast?
Many design decisions of RapidJSON is aimed at time/space performance. These may reduce user-friendliness of APIs. Besides, it also employs low-level optimizations (intrinsics, SIMD) and special algorithms (custom double-to-string, string-to-double conversions).
3. What is SIMD? How it is applied in RapidJSON? 3. What is SIMD? How it is applied in RapidJSON?
[SIMD](http://en.wikipedia.org/wiki/SIMD) instructions can perform parallel computation in modern CPUs. RapidJSON support Intel's SSE2/SSE4.1 to accelerate whitespace skipping. This improves performance of parsing indent formatted JSON.
4. Does it consume a lot of memory? 4. Does it consume a lot of memory?
The design of RapidJSON aims at reducing memory footprint.
In the SAX API, `Reader` consumes memory portional to maximum depth of JSON tree, plus maximum length of JSON string.
In the DOM API, each `Value` consumes exactly 16/24 bytes for 32/64-bit architecture respectively. RapidJSON also uses a special memory allocator to minimize overhead of allocations.
5. What is the purpose of being high performance? 5. What is the purpose of being high performance?
Some applications need to process very large JSON files. Some server-side applications need to process huge amount of JSONs. Being high performance can improve both latency and throuput. In a broad sense, it will also save energy.
## Gossip ## Gossip
1. Who are the developers of RapidJSON? 1. Who are the developers of RapidJSON?
Milo Yip (@miloyip) is the original author of RapidJSON. Many contributors from the world have improved RapidJSON. Philipp A. Hartmann (@pah) has implemented a lot of improvements, setting up automatic testing and also involves in a lot of discussions for the community. Don Ding (@thebusytypist) implemented the iterative parser. Andrii Senkovych (@jollyroger) completed the CMake migration. Kosta (@Kosta-Github) provided a very neat short-string optimization. Thank you for all other contributors and community members as well.
2. Why do you develop RapidJSON? 2. Why do you develop RapidJSON?
It was just a hobby project initially in 2011. Milo Yip is a game programmer and he just knew about JSON at that time and would like to apply JSON in future projects. As JSON seems very simple he would like to write a header-only and fast library.
3. Why there is a long empty period of development? 3. Why there is a long empty period of development?
It is basically due to personal issues, such as getting new family members. Also, Milo Yip has spent a lot of spare time on translating "Game Engine Architecture" by Jason Gregory into Chinese.
4. Why did the repository move from Google Code to GitHub? 4. Why did the repository move from Google Code to GitHub?
This is the trend. And GitHub is much more powerful and convenient.
This diff is collapsed.
# 特点
## 总体
* 跨平台
* 编译器:Visual Studio、gcc、clang等
* 架构:x86、x64、ARM等
* 操作系统:Windows、Mac OS X、Linux、iOS、Android等
* 容易安装
* 只有头文件的库。只需把头文件复制至你的项目中。
* 独立、最小依赖
* 不需依赖STL、BOOST等。
* 只包含`<cstdio>`, `<cstdlib>`, `<cstring>`, `<inttypes.h>`, `<new>`, `<stdint.h>`
* 没使用C++异常、RTTI
* 高性能
* 使用模版及内联函数去降低函数调用开销。
* 内部经优化的Grisu2及浮点数解析实现。
* 可选的SSE2/SSE4.1支持。
## 符合标准
* RapidJSON应完全符合RFC4627/ECMA-404标准。
* 支持Unicod代理对(surrogate pair)。
* 支持空字符(`"\u0000"`)。
* 例如,可以优雅地解析及处理`["Hello\u0000World"]`。含读写字符串长度的API。
## Unicode
* 支持UTF-8、UTF-16、UTF-32编码,包括小端序和大端序。
* 这些编码用于输入输出流,以及内存中的表示。
* 支持从输入流自动检测编码。
* 内部支持编码的转换。
* 例如,你可以读取一个UTF-8文件,让RapidJSON把JSON字符串转换至UTF-16的DOM。
* 内部支持编码校验。
* 例如,你可以读取一个UTF-8文件,让RapidJSON检查是否所有JSON字符串是合法的UTF-8字节序列。
* 支持自定义的字符类型。
* 预设的字符类型是:UTF-8为`char`,UTF-16为`wchar_t`,UTF32为`uint32_t`
* 支持自定义的编码。
## API风格
* SAX(Simple API for XML)风格API
* 类似于[SAX](http://en.wikipedia.org/wiki/Simple_API_for_XML), RapidJSON提供一个事件循序访问的解析器API(`rapidjson::GenericReader`)。RapidJSON也提供一个生成器API(`rapidjson::Writer`),可以处理相同的事件集合。
* DOM(Document Object Model)风格API
* 类似于HTML/XML的[DOM](http://en.wikipedia.org/wiki/Document_Object_Model),RapidJSON可把JSON解析至一个DOM表示方式(`rapidjson::GenericDocument`),以方便操作。如有需要,可把DOM转换(stringify)回JSON。
* DOM风格API(`rapidjson::GenericDocument`)实际上是由SAX风格API(`rapidjson::GenericReader`)实现的。SAX更快,但有时DOM更易用。用户可根据情况作出选择。
## 解析
* 递归式(预设)及迭代式解析器
* 递归式解析器较快,但在极端情况下可出现堆栈溢出。
* 迭代式解析器使用自定义的堆栈去维持解析状态。
* 支持原位(*in situ*)解析。
* 把JSON字符串的值解析至原JSON之中,然后让DOM指向那些字符串。
* 比常规分析更快:不需字符串的内存分配、不需复制(如字符串不含转义符)、缓存友好。
* 对于JSON数字类型,支持32-bit/64-bit的有号/无号整数,以及`double`
* 错误处理
* 支持详尽的解析错误代号。
* 支持本地化错误信息。
## DOM (Document)
* RapidJSON在类型转换时会检查数值的范围。
* 字符串字面量的优化
* 只储存指针,不作复制
* 优化“短”字符串
*`Value`内储存短字符串,无需额外分配。
* 对UTF-8字符串来说,32位架构下可存储最多11字符,64位下15字符。
* 可选地支持`std::string`(定义`RAPIDJSON_HAS_STDSTRING=1`
## 生成
* 支持`rapidjson::PrettyWriter`去加入换行及缩进。
## 输入输出流
* 支持`rapidjson::GenericStringBuffer`,把输出的JSON储存于字符串内。
* 支持`rapidjson::FileReadStream``rapidjson::FileWriteStream`,使用`FILE`对象作输入输出。
* 支持自定义输入输出流。
## 内存
* 最小化DOM的内存开销。
* 对大部分32/64位机器而言,每个JSON值只占16或20字节(不包含字符串)。
* 支持快速的预设分配器。
* 它是一个堆栈形式的分配器(顺序分配,不容许单独释放,适合解析过程之用)。
* 使用者也可提供一个预分配的缓冲区。(有可能达至无需CRT分配就能解析多个JSON)
* 支持标准CRT(C-runtime)分配器。
* 支持自定义分配器。
## 其他
* 一些C++11的支持(可选)
* 右值引用(rvalue reference)
* `noexcept`修饰符
This diff is collapsed.
# Performance # Performance
The old performance article for RapidJSON 0.1 is provided [here](https://code.google.com/p/rapidjson/wiki/Performance). There is a [native JSON benchmark collection] [1] which evaluates speed, memory usage and code size of various operations among 20 JSON libaries.
The (third-party) performance tests have been removed from this repository
and are now part of a dedicated [native JSON benchmark collection] [1].
This file will be updated with a summary of benchmarking results based on
the above benchmark collection in the future.
[1]: https://github.com/miloyip/nativejson-benchmark [1]: https://github.com/miloyip/nativejson-benchmark
The old performance article for RapidJSON 0.1 is provided [here](https://code.google.com/p/rapidjson/wiki/Performance).
Additionally, you may refer to the following third-party benchmarks. Additionally, you may refer to the following third-party benchmarks.
## Third-party benchmarks ## Third-party benchmarks
......
# 性能
有一个[native JSON benchmark collection][1]项目,能评估20个JSON库在不同操作下的速度、內存用量及代码大小。
[1]: https://github.com/miloyip/nativejson-benchmark
RapidJSON 0.1版本的性能测试文章位于[这里](https://code.google.com/p/rapidjson/wiki/Performance).
此外,你也可以参考以下这些第三方的评测。
## 第三方评测
* [Basic benchmarks for miscellaneous C++ JSON parsers and generators](https://github.com/mloskot/json_benchmark) by Mateusz Loskot (Jun 2013)
* [casablanca](https://casablanca.codeplex.com/)
* [json_spirit](https://github.com/cierelabs/json_spirit)
* [jsoncpp](http://jsoncpp.sourceforge.net/)
* [libjson](http://sourceforge.net/projects/libjson/)
* [rapidjson](https://github.com/miloyip/rapidjson/)
* [QJsonDocument](http://qt-project.org/doc/qt-5.0/qtcore/qjsondocument.html)
* [JSON Parser Benchmarking](http://chadaustin.me/2013/01/json-parser-benchmarking/) by Chad Austin (Jan 2013)
* [sajson](https://github.com/chadaustin/sajson)
* [rapidjson](https://github.com/miloyip/rapidjson/)
* [vjson](https://code.google.com/p/vjson/)
* [YAJL](http://lloyd.github.com/yajl/)
* [Jansson](http://www.digip.org/jansson/)
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
![](doc/logo/rapidjson.png)
Copyright (c) 2011-2014 Milo Yip (miloyip@gmail.com)
[RapidJSON GitHub](https://github.com/miloyip/rapidjson/)
[RapidJSON 文档](http://miloyip.github.io/rapidjson/)
## 简介
RapidJSON是一个C++的JSON解析器及生成器。它的灵感来自[RapidXml](http://rapidxml.sourceforge.net/)
* RapidJSON小而全。它同时支持SAX和DOM风格的API。SAX解析器只有约500行代码。
* RapidJSON快。它的性能可与`strlen()`相比。可支持SSE2/SSE4.1加速。
* RapidJSON独立。它不依赖于BOOST等外部库。它甚至不依赖于STL。
* RapidJSON对内存友好。在大部分32/64位机器上,每个JSON值只占16或20字节(除字符串外)。它预设使用一个快速的内存分配器,令分析器可以紧凑地分配内存。
* RapidJSON对Unicode友好。它支持UTF-8、UTF-16、UTF-32 (大端序/小端序),并内部支持这些编码的检测、校验及转码。例如,RapidJSON可以在分析一个UTF-8文件至DOM时,把当中的JSON字符串转码至UTF-16。它也支持代理对(surrogate pair)及`"\u0000"`(空字符)。
[这里](doc/features.md)可读取更多特点。
JSON(JavaScript Object Notation)是一个轻量的数据交换格式。RapidJSON应该完全遵从RFC7159/ECMA-404。 关于JSON的更多信息可参考:
* [Introducing JSON](http://json.org/)
* [RFC7159: The JavaScript Object Notation (JSON) Data Interchange Format](http://www.ietf.org/rfc/rfc7159.txt)
* [Standard ECMA-404: The JSON Data Interchange Format](http://www.ecma-international.org/publications/standards/Ecma-404.htm)
## 兼容性
RapidJSON是跨平台的。以下是一些曾测试的平台/编译器组合:
* Visual C++ 2008/2010/2013 在 Windows (32/64-bit)
* GNU C++ 3.8.x 在 Cygwin
* Clang 3.4 在 Mac OS X (32/64-bit) 及 iOS
* Clang 3.4 在 Android NDK
用户也可以在他们的平台上生成及执行单元测试。
## 安装
RapidJSON是只有头文件的C++库。只需把`include/rapidjson`目录复制至系统或项目的include目录中。
生成测试及例子的步骤:
1. 执行 `git submodule update --init` 去获取 thirdparty submodules (google test)。
2. 下载 [premake4](http://industriousone.com/premake/download)
3. 复制 premake4 可执行文件至 `rapidjson/build` (或系统路径)。
4. 进入`rapidjson/build/`目录,在Windows下执行`premake.bat`,在Linux或其他平台下执行`premake.sh`
5. 在Windows上,生成位于`rapidjson/build/vs2008/``/vs2010/`内的项目方案.
6. 在其他平台上,在`rapidjson/build/gmake/`目录执行GNU `make`(如 `make -f test.make config=release32``make -f example.make config=debug32`)。
7. 若成功,可执行文件会生成在`rapidjson/bin`目录。
生成[Doxygen](http://doxygen.org)文档的步骤:
1. 下载及安装[Doxygen](http://doxygen.org/download.html)
2. 在顶层目录执行`doxygen build/Doxyfile`
3.`doc/html`浏览文档。
## 用法一览
此简单例子解析一个JSON字符串至一个document (DOM),对DOM作出简单修改,最终把DOM转换(stringify)至JSON字符串。
~~~~~~~~~~cpp
// rapidjson/example/simpledom/simpledom.cpp`
#include "rapidjson/document.h"
#include "rapidjson/writer.h"
#include "rapidjson/stringbuffer.h"
#include <iostream>
using namespace rapidjson;
int main() {
// 1. 把JSON解析至DOM。
const char* json = "{\"project\":\"rapidjson\",\"stars\":10}";
Document d;
d.Parse(json);
// 2. 利用DOM作出修改。
Value& s = d["stars"];
s.SetInt(s.GetInt() + 1);
// 3. 把DOM转换(stringify)成JSON。
StringBuffer buffer;
Writer<StringBuffer> writer(buffer);
d.Accept(writer);
// Output {"project":"rapidjson","stars":11}
std::cout << buffer.GetString() << std::endl;
return 0;
}
~~~~~~~~~~
注意此例子并没有处理潜在错误。
下图展示执行过程。
![simpledom](doc/diagram/simpledom.png)
还有许多[例子](https://github.com/miloyip/rapidjson/tree/master/example)可供参考。
...@@ -57,6 +57,7 @@ doxygen_run() ...@@ -57,6 +57,7 @@ doxygen_run()
{ {
cd "${TRAVIS_BUILD_DIR}"; cd "${TRAVIS_BUILD_DIR}";
doxygen ${TRAVIS_BUILD_DIR}/build/doc/Doxyfile; doxygen ${TRAVIS_BUILD_DIR}/build/doc/Doxyfile;
doxygen ${TRAVIS_BUILD_DIR}/build/doc/Doxyfile.zh-cn;
} }
gh_pages_prepare() gh_pages_prepare()
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment