Commit 669f8b75 authored by zhujiashun's avatar zhujiashun

Complete translation of en/new_protocol.md

parent d1081600
...@@ -102,7 +102,7 @@ typedef void (*ProcessRequest)(InputMessageBase* msg_base); ...@@ -102,7 +102,7 @@ typedef void (*ProcessRequest)(InputMessageBase* msg_base);
### process_response ### process_response
```c++ ```c++
typedef void (*ProcessResponse)(InputMessageBase* msg); typedef void (*ProcessResponse)(InputMessageBase* msg_base);
``` ```
处理client端parse返回的消息,client端必须实现。可能会在和parse()不同的线程中运行。多个process_response可能同时运行。 处理client端parse返回的消息,client端必须实现。可能会在和parse()不同的线程中运行。多个process_response可能同时运行。
...@@ -150,64 +150,3 @@ if (RegisterProtocol(PROTOCOL_HTTP, http_protocol) != 0) { ...@@ -150,64 +150,3 @@ if (RegisterProtocol(PROTOCOL_HTTP, http_protocol) != 0) {
exit(1); exit(1);
} }
``` ```
## r34386引入的不兼容
为了进一步简化protocol的实现逻辑,r34386是一个不兼容改动,主要集中在下面几点:
- ProcessXXX必须在处理结束时调用msg_base->Destroy()。在之前的版本中,这是由框架完成的。这个改动帮助我们隐藏处理EOF的代码(很晦涩),还可以在未来支持更异步的处理(退出ProcessXXX不意味着处理结束)。为了确保所有的退出分支都会调用msg_base->Destroy(),可以使用定义在[destroying_ptr.h](https://github.com/brpc/brpc/blob/master/src/brpc/destroying_ptr.h)中的DestroyingPtr<>,可能像这样:
```c++
void ProcessXXXRequest(InputMessageBase* msg_base) {
DestroyingPtr<MostCommonMessage> msg(static_cast<MostCommonMessage*>(msg_base));
...
}
```
- 具体请参考[其他协议](https://github.com/brpc/brpc/blob/master/src/brpc/policy/baidu_rpc_protocol.cpp)的实现。
- InputMessageBase::socket_id()被移除,而通过socket()可以直接访问到对应Socket的指针。ProcessXXX函数中Address Socket的代码可以移除。
ProcessXXXRequest开头的修改一般是这样:
```c++
void ProcessXXXRequest(InputMessageBase* msg_base) {
const int64_t start_parse_us = butil::cpuwide_time_us();
- MostCommonMessage* msg = static_cast<MostCommonMessage*>(msg_base);
+ DestroyingPtr<MostCommonMessage> msg(static_cast<MostCommonMessage*>(msg_base));
+ SocketUniquePtr socket(msg->ReleaseSocket());
const Server* server = static_cast<const Server*>(msg_base->arg());
ScopedNonServiceError non_service_error(server);
- const SocketId sock = msg_base->socket_id();
- SocketUniquePtr socket;
- if (Socket::Address(sock, &socket) != 0) {
- RPC_VLOG << "Fail to address client=" << sock;
- return;
- }
- if (socket->CheckEOF()) {
- // Received an EOF event
- return;
- }
```
ProcessXXXResponse开头的修改一般是这样:
```c++
void ProcessRpcResponse(InputMessageBase* msg_base) {
const int64_t start_parse_us = butil::cpuwide_time_us();
- MostCommonMessage* msg = static_cast<MostCommonMessage*>(msg_base);
- CheckEOFGuard eof_guard(msg->socket_id());
+ DestroyingPtr<MostCommonMessage> msg(static_cast<MostCommonMessage*>(msg_base));
...
- // After a successful fight, EOF will no longer interrupt the
- // following code. As a result, we can release `eof_guard'
- eof_guard.check();
```
check_eof_guard.h被移除,所以对这个文件的include也得移除:
```c++
- #include "brpc/details/check_eof_guard.h"
```
- AddClientSideHandler被移除,用如下方法代替:
```
- if (AddClientSideHandler(handler) != 0) {
+ if (get_or_new_client_side_messenger()->AddHandler(handler) != 0) {
```
...@@ -4,13 +4,13 @@ brpc server supports all protocols in the same port, and it makes deployment and ...@@ -4,13 +4,13 @@ brpc server supports all protocols in the same port, and it makes deployment and
- First-class protocol: Special characters are marked in front of the protocol data, for example, the data of protocol [baidu_std](baidu_std.md) and hulu_pbrpc begins with 'PRPC' and 'HULU' respectively. Parser just check first four characters to know whether the protocol is matched. This class of protocol is checked first and all these protocols can share one TCP connection. - First-class protocol: Special characters are marked in front of the protocol data, for example, the data of protocol [baidu_std](baidu_std.md) and hulu_pbrpc begins with 'PRPC' and 'HULU' respectively. Parser just check first four characters to know whether the protocol is matched. This class of protocol is checked first and all these protocols can share one TCP connection.
- Second-class protocol: Some complex protocols without special marked characters can only be detected after several input data are parsed. Currently only HTTP is classified into this category. - Second-class protocol: Some complex protocols without special marked characters can only be detected after several input data are parsed. Currently only HTTP is classified into this category.
- Third-class protocol: Special characters are in the middle of the protocol data, such as the magic number of nshead protocol is the 25th-28th characters. It is complex to handle this case because without reading first 28 bytes, we cannot determine whether the protocol is nshead. If it is tried before http, http messages less than 28 bytes may not be parsed, since the parser consider it as an uncomplete nshead message. - Third-class protocol: Special characters are in the middle of the protocol data, such as the magic number of nshead protocol is the 25th-28th characters. It is complex to handle this case because without reading first 28 bytes, we cannot determine whether the protocol is nshead. If it is tried before http, http messages less than 28 bytes may not be parsed, since the parser consider it as an incomplete nshead message.
Considering that there will be only one protocol in most connections, we record the result of last selection so that it will be tried first when further data comes. It reduces the overhead of matching protocols to nearly zero for long connections. Although the process of matching protocols will be run everytime for short connections, the bottleneck of short connections is not in here and this method is still fast enough. If there are lots of new protocols added into brpc in the future, we may consider some heuristic methods to match protocols. Considering that there will be only one protocol in most connections, we record the result of last selection so that it will be tried first when further data comes. It reduces the overhead of matching protocols to nearly zero for long connections. Although the process of matching protocols will be run every time for short connections, the bottleneck of short connections is not in here and this method is still fast enough. If there are lots of new protocols added into brpc in the future, we may consider some heuristic methods to match protocols.
# Multi-protocol support in the client side # Multi-protocol support in the client side
Unlike the server side that protocols must be dynamically determined based on the data on the connection, the client side as the originator, naturally know their own protocol format. As long as the protocol data is sent through connection pool or short connection, which means it has exclusive usage of that connection, then the protocol can have any complex (or bad) format. Since the client will record the protocol when sending the data, it will use that recored protocol to parse the data without any matching overhead when responses come back. There is no magic number in some protocols like memcache, redis, it is hard to distinguish them in the server side, but it is no problem in the client side. Unlike the server side that protocols must be dynamically determined based on the data on the connection, the client side as the originator, naturally know their own protocol format. As long as the protocol data is sent through connection pool or short connection, which means it has exclusive usage of that connection, then the protocol can have any complex (or bad) format. Since the client will record the protocol when sending the data, it will use that recorded protocol to parse the data without any matching overhead when responses come back. There is no magic number in some protocols like memcache, redis, it is hard to distinguish them in the server side, but it has no problem in the client side.
# Support new protocols # Support new protocols
...@@ -52,26 +52,27 @@ enum ProtocolType { ...@@ -52,26 +52,27 @@ enum ProtocolType {
``` ```
## Implement Callbacks ## Implement Callbacks
All callbacks are defined in struct Protocol, which is defined in [protocol.h](https://github.com/brpc/brpc/blob/master/src/brpc/protocol.h). Among all these callbacks, `parse` is a callback that must be implmented. Besides, `process_request` must be implemented in the server side and `serialize_request`, `pack_request`, `process_response` must be implemented in the client side. All callbacks are defined in struct Protocol, which is defined in [protocol.h](https://github.com/brpc/brpc/blob/master/src/brpc/protocol.h). Among all these callbacks, `parse` is a callback that must be implemented. Besides, `process_request` must be implemented in the server side and `serialize_request`, `pack_request`, `process_response` must be implemented in the client side.
It is difficult to implement callbacks of the protocol. These codes is not like the codes that ordinary users use which has good prompts and protections. You have to figure it out how to handle similar code in other protocols and implement your own protocol, then send it to us to do code review. It is difficult to implement callbacks of the protocol. These codes are not like the codes that ordinary users use which has good prompts and protections. You have to figure it out how to handle similar code in other protocols and implement your own protocol, then send it to us to do code review.
### parse ### parse
```c++ ```c++
typedef ParseResult (*Parse)(butil::IOBuf* source, Socket *socket, bool read_eof, const void *arg); typedef ParseResult (*Parse)(butil::IOBuf* source, Socket *socket, bool read_eof, const void *arg);
``` ```
用于把消息从source上切割下来,client端和server端使用同一个parse函数。返回的消息会被递给process_request(server端)或process_response(client端)。 This function is used to cut messages from source. Client side and server side must share the same parse function. The returned message will be passed to `process_request`(server side) or `process_response`(client side).
参数:source是读取到的二进制内容,socket是对应的连接,read_eof为true表示连接已被对端关闭,arg在server端是对应server的指针,在client端是NULL。 Argument: source is the binary content from remote side, socket is the corresponding connection, read_eof is true iff the connection is closed by remote, arg is a pointer to the corresponding server in server client and NULL in client side.
ParseResult可能是错误,也可能包含一个切割下来的message,可能的值有: ParseResult could be an error or a cut message, its possible value contains:
- PARSE_ERROR_TRY_OTHERS :不是这个协议,框架会尝试下一个协议。source不能被消费。 - PARSE_ERROR_TRY_OTHERS :不是这个协议,框架会尝试下一个协议。source不能被消费。
- PARSE_ERROR_NOT_ENOUGH_DATA : 到目前为止数据内容不违反协议,但不构成完整的消息。等到连接上有新数据时,新数据会被append入source并重新调用parse。如果不确定数据是否一定属于这个协议,source不应被消费,如果确定数据属于这个协议,也可以把source的内容转移到内部的状态中去。比如http协议解析中即使source不包含一个完整的http消息,它也会被http parser消费掉,以避免下一次重复解析。 - PARSE_ERROR_TRY_OTHERS: current protocol is not matched, the framework would try next protocol. The data in source cannot be comsumed.
- PARSE_ERROR_TOO_BIG_DATA : 消息太大,拒绝掉以保护server,连接会被关闭。 - PARSE_ERROR_NOT_ENOUGH_DATA: the input data hasn't violated the current protocol yet, but the whole message cannot be detected as well. When there is new data from connection, new data will be appended to source and parse function is called again. If we can determine that data fits current protocol, the content of source can also be transferred to the internal state of protocol. For example, if source doesn't contain a whole http message, it will be consumed by http parser to avoid repeated parsing.
- PARSE_ERROR_NO_RESOURCE : 内部错误,比如资源分配失败。连接会被关闭。 - PARSE_ERROR_TOO_BIG_DATA: message size is too big, the connection will be closed to protect server.
- PARSE_ERROR_ABSOLUTELY_WRONG : 应该是这个协议(比如magic number匹配了),但是格式不符合预期。连接会被关闭。 - PARSE_ERROR_NO_RESOURCE: internal error, such as resource allocation failure. Connections will be closed.
- PARSE_ERROR_ABSOLUTELY_WRONG: it is supposed to be some protocols(magic number is matched), but the format is not as expected. Connection will be closed.
### serialize_request ### serialize_request
```c++ ```c++
...@@ -79,7 +80,7 @@ typedef bool (*SerializeRequest)(butil::IOBuf* request_buf, ...@@ -79,7 +80,7 @@ typedef bool (*SerializeRequest)(butil::IOBuf* request_buf,
Controller* cntl, Controller* cntl,
const google::protobuf::Message* request); const google::protobuf::Message* request);
``` ```
把request序列化进request_buf,client端必须实现。发生在pack_request之前,一次RPC中只会调用一次。cntl包含某些协议(比如http)需要的信息。成功返回true,否则false。 This function is used to serialize request into request_buf that client must implement. It happens before a RPC call and will only be called once. Necessary information needed by some protocols(such as http) is contained in cntl. Return true if succeed, otherwise false.
### pack_request ### pack_request
```c++ ```c++
...@@ -90,54 +91,51 @@ typedef int (*PackRequest)(butil::IOBuf* msg, ...@@ -90,54 +91,51 @@ typedef int (*PackRequest)(butil::IOBuf* msg,
const butil::IOBuf& request_buf, const butil::IOBuf& request_buf,
const Authenticator* auth); const Authenticator* auth);
``` ```
把request_buf打包入msg,每次向server发送消息前(包括重试)都会调用。当auth不为空时,需要打包认证信息。成功返回0,否则-1。 This function is used to pack request_buf into msg, which is called every time before sending messages to server(including retrying). When auth is not NULL, authentication information is also needed to be packed. Return 0 if succeed, otherwise -1.
### process_request ### process_request
```c++ ```c++
typedef void (*ProcessRequest)(InputMessageBase* msg_base); typedef void (*ProcessRequest)(InputMessageBase* msg_base);
``` ```
处理server端parse返回的消息,server端必须实现。可能会在和parse()不同的线程中运行。多个process_request可能同时运行。 This function is used to parse request messages in the server side that server must implement. It and parse() may run in different threads. Multiple process_request may run simultaneously. After the processing is done, msg_base->Destroy() must be called. In order to prevent forgetting calling Destroy, consider using DestroyingPtr<>.
在r34386后必须在处理结束时调用msg_base->Destroy(),为了防止漏调,考虑使用DestroyingPtr<>
### process_response ### process_response
```c++ ```c++
typedef void (*ProcessResponse)(InputMessageBase* msg); typedef void (*ProcessResponse)(InputMessageBase* msg);
``` ```
处理client端parse返回的消息,client端必须实现。可能会在和parse()不同的线程中运行。多个process_response可能同时运行。 This function is used to parse message response in client side that client must implement. It and parse() may run in different threads. Multiple process_request may run simultaneously. After the processing is done, msg_base->Destroy() must be called. In order to prevent forgetting calling Destroy, consider using DestroyingPtr<>.
在r34386后必须在处理结束时调用msg_base->Destroy(),为了防止漏调,考虑使用DestroyingPtr<>
### verify ### verify
```c++ ```c++
typedef bool (*Verify)(const InputMessageBase* msg); typedef bool (*Verify)(const InputMessageBase* msg);
``` ```
处理连接的认证,只会对连接上的第一个消息调用,需要支持认证的server端必须实现,不需要认证或仅支持client端的协议可填NULL。成功返回true,否则false。 This function is used to authenticate connections, it is called when the first message is received. It is must be implemented by servers that need authentication, otherwise the function pointer can be NULL. Return true if succeed, otherwise false.
### parse_server_address ### parse_server_address
```c++ ```c++
typedef bool (*ParseServerAddress)(butil::EndPoint* out, const char* server_addr_and_port); typedef bool (*ParseServerAddress)(butil::EndPoint* out, const char* server_addr_and_port);
``` ```
把server_addr_and_port(Channel.Init的一个参数)转化为butil::EndPoint,可选。一些协议对server地址的表达和理解可能是不同的。 This function converts server_addr_and_port(an argument of Channel.Init) to butil::EndPoint, which is optional. Some protocols may differ in the expression and understanding of server addresses.
### get_method_name ### get_method_name
```c++ ```c++
typedef const std::string& (*GetMethodName)(const google::protobuf::MethodDescriptor* method, typedef const std::string& (*GetMethodName)(const google::protobuf::MethodDescriptor* method,
const Controller*); const Controller*);
``` ```
定制method name,可选。 This function is used to customize method name, which is optional.
### supported_connection_type ### supported_connection_type
标记支持的连接方式。如果支持所有连接方式,设为CONNECTION_TYPE_ALL。如果只支持连接池和短连接,设为CONNECTION_TYPE_POOLED_AND_SHORT。 Used to mark the supported connection method. If all connection methods are supported, this value should set to CONNECTION_TYPE_ALL. If connection pools and short connections are supported, this value should set to CONNECTION_TYPE_POOLED_AND_SHORT.
### name ### name
协议的名称,会出现在各种配置和显示中,越简短越好,必须是字符串常量。 The name of the protocol, which appears in the various configurations and displays, should be as short as possible and must be a string constant.
## Register to global
## 注册到全局 RegisterProtocol should be called to [register implemented protocol](https://github.com/brpc/brpc/blob/master/src/brpc/global.cpp) to brpc, just like:
实现好的协议要调用RegisterProtocol[注册到全局](https://github.com/brpc/brpc/blob/master/src/brpc/global.cpp),以便brpc发现。就像这样:
```c++ ```c++
Protocol http_protocol = { ParseHttpMessage, Protocol http_protocol = { ParseHttpMessage,
SerializeHttpRequest, PackHttpRequest, SerializeHttpRequest, PackHttpRequest,
......
...@@ -52,8 +52,7 @@ public: ...@@ -52,8 +52,7 @@ public:
// remove these logs in performance-sensitive servers. // remove these logs in performance-sensitive servers.
// You should also noticed that these logs are different from what // You should also noticed that these logs are different from what
// we wrote in other projects: they use << instead of printf-style // we wrote in other projects: they use << instead of printf-style
// functions. But don't worry, these logs are fully compatible with // functions.
// comlog. You can mix them with comlog or ullog functions freely.
// The noflush prevents the log from being flushed immediately. // The noflush prevents the log from being flushed immediately.
LOG(INFO) << "Received request[index=" << request->index() LOG(INFO) << "Received request[index=" << request->index()
<< "] from " << cntl->remote_side() << "] from " << cntl->remote_side()
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment