Commit c9039fd4 authored by gejun

Complete translation of client.md

parent 16fbb258
......@@ -531,9 +531,9 @@ brpc supports the following connection types:
- single connection: all clients in a process share at most one connection to a server, and one connection may carry multiple requests at the same time. Responses do not need to return in the order the requests were sent. This is the default for the baidu_std, hulu_pbrpc and sofa_pbrpc protocols.
| | short connection | pooled connection | single connection |
| ------------------- | ---------------------------------------- | --------------------- | ------------------- |
| long connection | no | yes | yes |
| #connections at server-side (per client) | qps*latency (see [little's law](https://en.wikipedia.org/wiki/Little%27s_law)) | qps*latency | 1 |
| peak qps | poor, also limited by the number of ports on one machine | medium | high |
| latency | 1.5RTT(connect) + 1RTT + processing time | 1RTT + processing time | 1RTT + processing time |
| cpu usage | high, a tcp connect per RPC | medium, a sys write per request | low, batched writes reduce cpu usage under heavy traffic |
......@@ -590,7 +590,7 @@ brpc supports [Streaming RPC](streaming_rpc.md), an application-level connection,
## log_id
set_log_id() sets a 64-bit integral log_id, which is sent to the server along with the request and is usually printed in logs to chain up all services involved in one session. A string-format id must be converted to a 64-bit integer before being set as the log_id.
## Attachment
......@@ -598,23 +598,8 @@ baidu_std and hulu_pbrpc protocols support attachments, a piece of user-defined data that is not
In the http protocol, the attachment corresponds to the [message body](http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html); for example, data to POST should be set in request_attachment().
## Authentication
TODO: Describe how authentication methods are extended.
## Reset
......@@ -624,7 +609,11 @@ B%E9%80%9F%E4%B8%8A%E6%89%8B%E6%89%8B%E5%86%8C.pdf?version=1&modificationDate=14
## Compression
set_request_compress_type() sets the compress type of the request; no compression by default.
NOTE: attachments are not compressed.
To compress an HTTP body, check out [compress request body](http_client#压缩request-body).
Supported compression methods:
......@@ -670,13 +659,13 @@ set_request_compress_type()设置request的压缩方式,默认不压缩。注
### Q: Can brpc use unix domain sockets
No. Same-machine TCP sockets do not go through the network, so they perform only slightly slower than unix domain sockets, making the replacement hardly worthwhile. Some special scenarios where TCP sockets can't be used may need them; support may be added in the future.
### Q: Fail to connect to xx.xx.xx.xx:xxxx, Connection refused
The remote side is usually not listening on the port (the server probably crashed).
### Q: Frequent Connection timedout to another IDC
![img](../images/connection_timedout.png)
......@@ -700,33 +689,29 @@ struct ChannelOptions {
};
```
NOTE: connection timeout is not RPC timeout; an RPC timeout is logged as "Reached timeout=...".
### Q: Why does the synchronous call work fine while the asynchronous one crashes
Check the lifetime of Controller, Response and done carefully. In an asynchronous call, the return of the RPC call is not the end of the whole RPC process, which ends only when done->Run() is entered. So these objects should not be released right after the call; release them inside done->Run(). You generally cannot allocate them on the stack; allocate them on the heap instead. See [asynchronous access](client.md#异步访问) for details.
### Q: How to make sure a request is processed only once
This is not solved at the RPC layer. When a response returns successfully, we know the process succeeded. When it returns with an error, we know it failed. But when no response returns, it may have either failed or succeeded, and if we retry, a process that already succeeded may be executed again. Generally, RPC services with side effects should consider [idempotence](http://en.wikipedia.org/wiki/Idempotence); otherwise retries may apply side effects repeatedly and cause unexpected results. Read-only search services mostly have no side effects and are naturally idempotent, needing no special treatment. Storage services that write must design in versioning or serial-number mechanisms to reject operations that already happened, keeping them idempotent.
### Q: Invalid address=`bns://group.user-persona.dumi.nj03'
```
FATAL 04-07 20:00:03 7778 src/brpc/channel.cpp:123] Invalid address=`bns://group.user-persona.dumi.nj03'. You should use Init(naming_service_name, load_balancer_name, options) to access multiple servers.
```
Accessing a naming service requires the 3-parameter Init(), whose second parameter is load_balancer_name. The 2-parameter Init() used here is treated by the framework as accessing a single server, producing this error.
### Q: Both sides use protobuf, why can't they talk to each other
**protocol != protobuf**. protobuf serializes one package, while one message of a protocol may contain multiple protobuf packages plus extra lengths, checksums, magic numbers and so on. The same serialization format does not make protocols interoperable. brpc's ability to serve multiple protocols with one piece of code comes from converting data of different protocols into a unified programming interface, not from the protobuf layer.
### Q: Why can C++ clients/servers talk to each other, while talking with clients/servers in other languages reports serialization failures
Check whether the C++ side turned on compression (Controller::set_compress_type); RPC frameworks in other languages have not implemented compression yet, so the two sides fail to understand each other.
# Appendix: Basic workflow at the client side
......@@ -737,13 +722,13 @@ FATAL 04-07 20:00:03 7778 src/brpc/channel.cpp:123] Invalid address=`bns://group
1. Create a [bthread_id](https://github.com/brpc/brpc/blob/master/src/bthread/id.h) as the correlation_id of this RPC.
2. Depending on how the Channel was created, choose a downstream server as the destination of this RPC, either from the process-level [SocketMap](https://github.com/brpc/brpc/blob/master/src/brpc/socket_map.h) or from the [LoadBalancer](https://github.com/brpc/brpc/blob/master/src/brpc/load_balancer.h).
3. Choose a [Socket](https://github.com/brpc/brpc/blob/master/src/brpc/socket.h) according to the connection type (single, pooled, short).
4. If authentication is enabled and the current Socket has not been authenticated, the first request enters the authenticating branch, and other requests block until the first request, carrying the authentication data, is written into the Socket. The server only verifies the first request.
5. According to the Channel's protocol, choose the corresponding serialization function to serialize the request into an [IOBuf](https://github.com/brpc/brpc/blob/master/src/butil/iobuf.h).
6. If a timeout is configured, set up a timer. From this point on, avoid using the Controller object, because once the timer is set the timeout may fire at any time -> invoke the user's timeout callback -> the user destructs the Controller inside the callback.
7. The sending preparation ends here. If any step above failed, Channel::HandleSendFailed is called.
8. Write the serialized IOBuf into the Socket, passing in the callback Channel::HandleSocketFailed, which is called when the connection breaks, a write fails, or other errors occur.
9. In a synchronous call, Join the correlation_id; otherwise CallMethod ends here.
10. Send and receive messages over the network.
11. After receiving the response, extract the correlation_id from it and find the corresponding Controller in O(1) time. This lookup needs no global hash table and scales well on multiple cores.
12. Deserialize the response according to the protocol.
13. Call Controller::OnRPCReturned, which may decide to retry based on the error code, or let the RPC end. In an asynchronous call, invoke the user's callback. Finally destroy the correlation_id and wake up joining threads.
......@@ -176,7 +176,7 @@ Servers whose connections are lost are isolated temporarily to prevent them from
| Name | Value | Description | Defined At |
| ------------------------- | ----- | ---------------------------------------- | ----------------------- |
| health_check_interval (R) | 3 | seconds between consecutive health-checkings | src/brpc/socket_map.cpp |
Once a server is connected, it resumes as a server candidate inside LoadBalancer. If a server is removed from NamingService during health-checking, brpc removes it from health-checking as well.
......@@ -448,7 +448,7 @@ Controller.has_backup_request() tells if backup_request was sent.
**brpc tries its best not to retry servers that have already been tried**
Conditions for retrying (AND relations):
- Broken connection.
- Timeout is not reached.
- Has retrying quota. Controller.set_max_retry(0) or ChannelOptions.max_retry = 0 disables retries.
......@@ -534,13 +534,13 @@ The default protocol used by Channel is baidu_std, which is changeable by settin
brpc supports following connection types:
- short connection: Established before each RPC and closed after completion. Since each RPC pays the overhead of establishing a connection, this type is suitable for RPCs launched occasionally rather than frequently. No protocol uses this type by default. Connections in http 1.0 are handled similarly to short connections.
- pooled connection: Pick an unused connection from a pool before each RPC and return it after completion. One connection carries at most one request at the same time. One client may have multiple connections to one server. http and the protocols based on nshead use this type by default.
- single connection: All clients in one process have at most one connection to one server, and one connection may carry multiple requests at the same time. Responses do not need to be received in the order the requests were sent. This type is used by baidu_std, hulu_pbrpc and sofa_pbrpc by default.
| | short connection | pooled connection | single connection |
| ---------------------------------------- | ---------------------------------------- | --------------------------------------- | ---------------------------------------- |
| long connection | no | yes | yes |
| \#connection at server-side (from a client) | qps*latency ([little's law](https://en.wikipedia.org/wiki/Little%27s_law)) | qps*latency | 1 |
| peak qps | bad, and limited by max number of ports | medium | high |
| latency | 1.5RTT(connect) + 1RTT + processing time | 1RTT + processing time | 1RTT + processing time |
| cpu usage | high, tcp connect for each RPC | medium, every request needs a sys write | low, writes can be combined to reduce overhead. |
......@@ -549,7 +549,7 @@ brpc chooses best connection type for the protocol by default, users generally h
- CONNECTION_TYPE_SINGLE or "single" : single connection
- CONNECTION_TYPE_POOLED or "pooled": pooled connection. Max number of connections from one client to one server is limited by -max_connection_pool_size:
| Name | Value | Description | Defined At |
| ---------------------------- | ----- | ---------------------------------------- | ------------------- |
......@@ -582,13 +582,13 @@ Another solution is setting gflag -defer_close_second
| ------------------ | ----- | ---------------------------------------- | ----------------------- |
| defer_close_second | 0 | Defer close of connections for so many seconds even if the connection is not used by anyone. Close immediately for non-positive values | src/brpc/socket_map.cpp |
After setting, a connection is not closed immediately after its last reference is released; instead it is closed after that many seconds. If a channel references the connection again during the wait, the connection goes back to normal. No matter how frequently channels are created, this flag limits the frequency of closing connections. A side effect is that file descriptors are not closed immediately after channels are destroyed; if the flag is set too large, the number of open file descriptors in the process may be large as well.
## Buffer size of connections
-socket_recv_buffer_size sets the receiving buffer size of all connections; -1 by default (not modified)
-socket_send_buffer_size sets the sending buffer size of all connections; -1 by default (not modified)
| Name | Value | Description | Defined At |
| ----------------------- | ----- | ---------------------------------------- | ------------------- |
......@@ -597,49 +597,38 @@ After setting, connection is not closed immediately after last referential coun
## log_id
set_log_id() sets a 64-bit integral log_id, which is sent to the server along with the request and often printed in server logs to chain up all services involved in one session. A string-format log-id must be converted to a 64-bit integer before being set.
## Attachment
baidu_std and hulu_pbrpc support attachments, user-defined data that bypasses protobuf serialization. At the client side, data set into Controller::request_attachment() is received by the server, and response_attachment() contains the attachment sent back by the server. Attachments are not compressed by brpc.
In http, the attachment corresponds to the [message body](http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html); for example, data to POST should be set in request_attachment().
## Authentication
TODO: Describe how authentication methods are extended.
## Reset
This method restores the Controller to the state it had just after construction.
Don't call Reset() during an RPC; the behavior is undefined.
## Compression
set_request_compress_type() sets the compress type of the request; no compression by default.
NOTE: Attachment is not compressed by brpc.
Check out [compress request body](http_client#压缩request-body) to compress http body.
Supported compressions:
- brpc::CompressTypeSnappy : [snappy](http://google.github.io/snappy/), compression and decompression are very fast, but the compression ratio is low.
- brpc::CompressTypeGzip : [gzip](http://en.wikipedia.org/wiki/Gzip), significantly slower than snappy, with a higher compression ratio.
- brpc::CompressTypeZlib : [zlib](http://en.wikipedia.org/wiki/Zlib), 10%~20% faster than gzip but still significantly slower than snappy, with a slightly better compression ratio than gzip.
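Choosing one of these types is a per-request setting on the Controller; a minimal fragment, assuming a brpc::Controller named cntl and following the CompressType names used in this document:

```cpp
// Pick a compress type for this request before CallMethod
// (cntl is the brpc::Controller used for one RPC):
brpc::Controller cntl;
cntl.set_request_compress_type(brpc::CompressTypeSnappy);  // fast, lower ratio
// or brpc::CompressTypeGzip / brpc::CompressTypeZlib for higher ratios
```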
Following table lists performance of different methods compressing and decompressing **data with a lot of duplications**, just for reference.
| Compress method | Compress size(B) | Compress time(us) | Decompress time(us) | Compress throughput(MB/s) | Decompress throughput(MB/s) | Compress ratio |
| --------------- | ---------------- | ----------------- | ------------------- | ------------------------- | --------------------------- | -------------- |
......@@ -656,7 +645,7 @@ set_request_compress_type() sets the compress type of the request; no compression by default. NOTE
| Gzip | 229.7803 | 82.71903 | 135.9995 | 377.7849 | 0.54% | |
| Zlib | 240.7464 | 54.44099 | 129.8046 | 574.0161 | 0.50% | |
Following table lists performance of different methods compressing and decompressing **data with very few duplications**, just for reference.
| Compress method | Compress size(B) | Compress time(us) | Decompress time(us) | Compress throughput(MB/s) | Decompress throughput(MB/s) | Compress ratio |
| --------------- | ---------------- | ----------------- | ------------------- | ------------------------- | --------------------------- | -------------- |
......@@ -675,19 +664,19 @@ set_request_compress_type() sets the compress type of the request; no compression by default. NOTE
# FAQ
### Q: Does brpc support unix domain socket?
No. Since traffic over local TCP sockets bypasses the network, they perform only a little slower than unix domain sockets, making the replacement hardly worthwhile. Some special scenarios where TCP sockets can't be used may require unix domain sockets; support may be added in the future.
### Q: Fail to connect to xx.xx.xx.xx:xxxx, Connection refused
The remote side is not listening on the port (the server probably crashed).
### Q: Frequent Connection timedout to another IDC
![img](../images/connection_timedout.png)
The TCP connection was not established within connect_timeout_ms; increase the connection and RPC timeouts:
```c++
struct ChannelOptions {
......@@ -707,50 +696,46 @@ struct ChannelOptions {
};
```
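A sketch of raising both timeouts; the values are placeholders to tune per deployment, and the fields are the ChannelOptions members shown above:

```cpp
brpc::ChannelOptions options;
options.connect_timeout_ms = 1000;  // TCP connect timeout, raise for cross-IDC links
options.timeout_ms = 3000;          // overall RPC timeout; must cover connect + RTT + processing
```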
NOTE: Connection timeout is not RPC timeout; an RPC timeout is logged as "Reached timeout=...".
### Q: Why does the synchronous call work fine while the asynchronous one crashes
Check the lifetime of Controller, Response and done. In an asynchronous call, the return of CallMethod is not the completion of the RPC, which ends only when done->Run() is entered. So these objects should not be deleted right after CallMethod; they should be deleted inside done->Run(). Generally you should allocate these objects on the heap instead of on the stack. Check out [Asynchronous call](client.md#asynchronous-call) for details.
### Q: How to make requests be processed once and only once
This issue is not solved at the RPC layer. When a response returns successfully, we know the request was processed at the server side. When it returns with an error, we know it was not. But when no response returns, the server may or may not have processed it, and if we retry, the same request may be processed twice at the server side. Generally, RPC services with side effects must consider the [idempotence](http://en.wikipedia.org/wiki/Idempotence) of the service; otherwise retries may apply side effects more than once and cause unexpected behavior. Read-only search services often have no side effects and are naturally idempotent, needing no special treatment. Storage services that write must design versioning or serial-number mechanisms to reject operations that already happened, to stay idempotent.
### Q: Invalid address=`bns://group.user-persona.dumi.nj03'
```
FATAL 04-07 20:00:03 7778 src/brpc/channel.cpp:123] Invalid address=`bns://group.user-persona.dumi.nj03'. You should use Init(naming_service_name, load_balancer_name, options) to access multiple servers.
```
Accessing servers under a naming service needs the Init() with 3 parameters (the second parameter is `load_balancer_name`). The Init() here has 2 parameters and is treated by brpc as accessing a single server, producing the error.
### Q: Both sides use protobuf, why can't they communicate with each other
**protocol != protobuf**. protobuf serializes one package, while a message of a protocol may contain multiple protobuf packages along with extra lengths, checksums and magic numbers. The same serialization format does not make protocols interoperable. The capability offered by brpc to "write code once and serve multiple protocols" is implemented by converting data of different protocols to a unified API, not at the protobuf layer.
### Q: Why C++ clients/servers may fail to talk to clients/servers in other languages
Check whether the C++ side turns on compression (Controller::set_compress_type); RPC implementations in other languages do not support compression yet, which makes the two sides unable to understand each other.
# PS: Workflow at Client-side
![img](../images/client_side.png)
Steps:
1. Create a [bthread_id](https://github.com/brpc/brpc/blob/master/src/bthread/id.h) as correlation_id of current RPC.
2. According to how the Channel is initialized, choose a server from global [SocketMap](https://github.com/brpc/brpc/blob/master/src/brpc/socket_map.h) or [LoadBalancer](https://github.com/brpc/brpc/blob/master/src/brpc/load_balancer.h) as destination of the request.
3. Choose a [Socket](https://github.com/brpc/brpc/blob/master/src/brpc/socket.h) according to connection type (single, pooled, short)
4. If authentication is turned on and the Socket is not authenticated yet, first request enters authenticating branch, other requests block until the branch writes authenticating information into the Socket. Server-side only verifies the first request.
5. According to protocol of the Channel, choose corresponding serialization callback to serialize request into [IOBuf](https://github.com/brpc/brpc/blob/master/src/butil/iobuf.h).
6. If a timeout is set, set up a timer. From this point on, avoid using the Controller, since the timer may fire at any time and call the user's timeout callback, which may delete the Controller.
7. Sending phase is completed. If error occurs at any step, Channel::HandleSendFailed is called.
8. Write IOBuf with serialized data into the Socket and add Channel::HandleSocketFailed into id_wait_list of the Socket. The callback will be called when the write is failed or connection is broken before completion of RPC.
9. In synchronous call, Join correlation_id; otherwise CallMethod() returns.
10. Send/receive messages to/from network.
11. After receiving response, get the correlation_id inside, find out associated Controller within O(1) time. The lookup does not need to lock a global hashmap, and scales well.
12. Parse the response according to the protocol.
13. Call Controller::OnRPCReturned, which may retry an erroneous RPC or complete it. Call the user's done in an asynchronous call. Finally destroy the correlation_id and wake up joining threads.