Commit 5448712c authored by gejun

Reviewed status.md

parent d5f38be3
[English version](../en/status.md)
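[/status](http://brpc.baidu.com:8765/status) shows the main statistics of services. The data comes from the same source as /vars, but is regrouped by service for easier viewing. [/status](http://brpc.baidu.com:8765/status) shows the main statistics of services. The data comes from the same source as /vars, but is regrouped by service for easier viewing.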
![img](../images/status.png) ![img](../images/status.png)
Meanings of the fields above: Meanings of the fields above:
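- **non_service_error**: "non" modifies "service_error", which refers to the errors listed under each service; all other errors are counted as non_service_error. A client disconnecting during service processing, so that the response cannot be written back, counts as non_service_error. A broken connection from the service to a back-end is part of the service's internal logic: as long as the service eventually returns a response, even an erroneous one, the error is counted under that service rather than non_service_error - **non_service_error**: number of errors raised outside the processing of services. For example, a client closing the connection so that the server cannot write the response back counts as *non_service_error*, because service processing has already ended. By contrast, errors in accessing back-end services during processing are not *non_service_error*. Even if the response written out represents an error, that error is counted under the corresponding service rather than *non_service_error*
- **connection_count**: number of connections sending requests to this server, not including [outward connections](http://brpc.baidu.com:8765/vars/rpc_channel_connection_count). - **connection_count**: number of connections sending requests to this server. Outward connections, which are recorded at /vars/rpc_channel_connection_count, are not included.
- **example.EchoService**: full name of the service, including the namespace - **example.EchoService**: full name of the service, including the package name in proto
- **Echo (EchoRequest) returns (EchoResponse)**: method signature. A service can contain multiple methods. Click the links on request/response to view the corresponding protobuf messages. - **Echo (EchoRequest) returns (EchoResponse)**: method signature. A service can contain multiple methods. Click the links on request/response to view the corresponding protobuf messages.
- **count**: total number of requests processed successfully. - **count**: total number of requests processed successfully.
- **error**: total number of failed requests. - **error**: total number of failed requests.
- **latency**: on the web page, from right to left, the average latency over the past 60 seconds, 60 minutes, 24 hours and 30 days. On the text page, the average latency within 10 seconds (controlled by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)). - **latency**: on html, *from right to left*, the average latency over the past 60 seconds, 60 minutes, 24 hours and 30 days. On plain text, the average latency within 10 seconds (controlled by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)).
- **latency_percentiles**: the 50%, 90%, 99% and 99.9% percentiles of latency. The statistics window is 10 seconds by default (controlled by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)). Curves are shown on the web page. - **latency_percentiles**: the 50%, 90%, 99% and 99.9% percentiles of latency. The statistics window is 10 seconds by default (controlled by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)). Curves are shown on html.
- **latency_cdf**: another presentation of the percentiles, similar to a histogram; only viewable on the web page. - **latency_cdf**: shows percentiles as a [CDF](https://en.wikipedia.org/wiki/Cumulative_distribution_function); only viewable on html.
- **max_latency**: on the web page, from right to left, the max latency over the past 60 seconds, 60 minutes, 24 hours and 30 days. On the text page, the max latency within 10 seconds (controlled by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)). - **max_latency**: on html, *from right to left*, the max latency over the past 60 seconds, 60 minutes, 24 hours and 30 days. On plain text, the max latency within 10 seconds (controlled by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)).
- **qps**: on the web page, from right to left, the average qps over the past 60 seconds, 60 minutes, 24 hours and 30 days. On the text page, the average qps within 10 seconds (controlled by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)). - **qps**: on html, from right to left, the average qps (Queries Per Second) over the past 60 seconds, 60 minutes, 24 hours and 30 days. On plain text, the average qps within 10 seconds (controlled by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)).
- **processing**: number of requests being processed. If it stays non-zero (especially after the load drops to 0), consider whether the program has a bug - **processing**: number of requests being processed. If this metric stays non-zero after the load drops to 0, the server very likely has a bug, such as forgetting to call done or being stuck in some processing step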
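Users can customize the description on the /status page by letting the corresponding Service implement [brpc::Describable](https://github.com/brpc/brpc/blob/master/src/brpc/describable.h). Users can customize the description on the /status page by letting the corresponding Service implement [brpc::Describable](https://github.com/brpc/brpc/blob/master/src/brpc/describable.h).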
......
[/status](http://brpc.baidu.com:8765/status) shows primary statistics of services. They shares the same sources with [/vars](../cn/vars.md) , except that they are grouped by services. [中文版](../cn/status.md)
[/status](http://brpc.baidu.com:8765/status) shows primary statistics of services inside the server. The data sources are same with [/vars](vars.md), but stats are grouped differently.
![img](../images/status.png) ![img](../images/status.png)
Meaning of the fields above: Meanings of the fields above:
- **non_service_error**: the count of errors except the ones raised by the services. For example, it is a *non_service_error* that the client closes connection when the service is processing requests since no response could be written back, while it counts in the *service error* when the connection established internally in the service is broken and the service fails to get reponse from the remote side. - **non_service_error**: number of errors raised outside processing code of the service. For example, the error that server can't write response back due to a broken connection which had been closed by the client, is a *non_service_error* because the service processing already ends. As a contrast, failing to access back-end servers during the processing is an error of the service, not a *non_service_error*. Even if the response written out successfully stands for failure, the error is counted into the service rather than *non_service_error*.
- **connection_count**: The number of connections to the server, excluded the ones connected to remote. - **connection_count**: number of connections to the server from clients, not including number of outward connections which are displayed at /vars/rpc_channel_connection_count.
- **example.EchoService**: Full name of the service, including the package name - **example.EchoService**: Full name of the service, including the package name defined in proto.
- **Echo (EchoRequest) returns (EchoResponse)**: Signature of the method, a services can have multiple methods, click request/response and you can check out the corresponding protobuf message. - **Echo (EchoRequest) returns (EchoResponse)**: Signature of the method. A service can have multiple methods. Click links on request/response to see schemes of the protobuf messages.
- **count**: Number of requests that are succesfully processed. - **count**: Number of requests that are succesfully processed.
- **error**: Number of requests that meet failure. - **error**: Number of requests that are failed to process.
- **latency**: On the web page it shows average latency in the recent *60s/60m/24h/30d* from *right to left*. On plain text page it is the average latency in recent 10s(by default, specified by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval). - **latency**: average latency in recent *60s/60m/24h/30d* from *right to left* on html, average latency in recent 10s(by default, specified by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)) on plain texts.
- **latency_percentiles**: The percentail of latency at 50%, 90%, 99%, 99.9% in 10 seconds(specified by[-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)),it shows adtional historical values on the web page. - **latency_percentiles**: 50%, 90%, 99%, 99.9% percentiles of latency in 10 seconds(specified by[-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)). Curves with historical values are shown on html.
- **latency_cdf**: Anther view of percentiles like histogram,only available on web page. - **latency_cdf**: shows percentiles as [CDF](https://en.wikipedia.org/wiki/Cumulative_distribution_function), only available on html.
- **max_latency**: On the web page it shows the max latency in the recent *60s/60m/24h/30d* from *right to left*. On plain text page it is the max latency in recent 10s(by default, specified by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval). - **max_latency**: max latency in recent *60s/60m/24h/30d* from *right to left* on html, max latency in recent 10s(by default, specified by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)) on plain texts.
- **qps**: On the web page it shows the qps in the recent *60s/60m/24h/30d* from *right to left*. On plain text page it is the qps in recent 10s(by default, specified by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval). - **qps**: QPS(Queries Per Second) in recent *60s/60m/24h/30d* from *right to left* on html. QPS in recent 10s(by default, specified by [-bvar_dump_interval](http://brpc.baidu.com:8765/flags/bvar_dump_interval)) on plain texts.
- **processing**: The number of requests that is being processed by the service. If this is - **processing**: Number of requests being processed by the service. If this counter can't hit zero when the traffic to the service becomes zero, the server probably has bugs, such as forgetting to call done->Run() or stuck on some processing steps.
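The **processing** entry mentions forgetting to call `done->Run()` as a typical cause of a counter that never returns to zero. The following is a minimal, self-contained sketch of that failure mode. It is a simplified model and not brpc's implementation: `MethodGauge`, `Done`, `DoneGuard`, `BuggyHandler`, `GuardedHandler` and `ProcessingAfter` are hypothetical names introduced only for illustration. `DoneGuard` is written in the spirit of brpc's `brpc::ClosureGuard`, which runs the closure when it goes out of scope.

```c++
// Simplified model of a per-method "processing" gauge (assumption: one gauge
// per method, incremented when a request starts, decremented when done runs).
struct MethodGauge {
    int processing = 0;
};

// Hypothetical stand-in for the `done` closure handed to a service method.
class Done {
public:
    explicit Done(MethodGauge* gauge) : gauge_(gauge) { ++gauge_->processing; }
    // Marks the request as finished; running twice has no further effect.
    void Run() {
        if (!ran_) {
            ran_ = true;
            --gauge_->processing;
        }
    }
private:
    MethodGauge* gauge_;
    bool ran_ = false;
};

// Hypothetical RAII helper: runs `done` when the scope is left on any path.
class DoneGuard {
public:
    explicit DoneGuard(Done* done) : done_(done) {}
    ~DoneGuard() { if (done_) done_->Run(); }
    DoneGuard(const DoneGuard&) = delete;
    DoneGuard& operator=(const DoneGuard&) = delete;
private:
    Done* done_;
};

// Buggy handler: the early return skips done->Run(), so the request
// is counted as "processing" forever.
void BuggyHandler(bool bad_input, Done* done) {
    if (bad_input) return;  // bug: done is never run on this path
    done->Run();
}

// Guarded handler: the guard runs done on every return path.
void GuardedHandler(bool bad_input, Done* done) {
    DoneGuard guard(done);
    if (bad_input) return;
    // normal processing would go here
}

// Runs one request through `handler` and reports the gauge afterwards.
int ProcessingAfter(void (*handler)(bool, Done*), bool bad_input) {
    MethodGauge gauge;
    Done done(&gauge);
    handler(bad_input, &done);
    return gauge.processing;
}
```

In this model, `ProcessingAfter(BuggyHandler, true)` leaves the gauge non-zero although no request is in flight, which is the symptom the **processing** field is meant to expose, while the guarded variant returns to zero on both paths.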
You can extends your servcies with [brpc::Describable](https://github.com/brpc/brpc/blob/master/src/brpc/describable.h) to customize /status page. Users may customize descriptions on /status by letting the service implement [brpc::Describable](https://github.com/brpc/brpc/blob/master/src/brpc/describable.h).
```c++ ```c++
class MyService : public XXXService, public brpc::Describable { class MyService : public XXXService, public brpc::Describable {
...@@ -30,6 +32,6 @@ public: ...@@ -30,6 +32,6 @@ public:
}; };
``` ```
An example: For example:
![img](../images/status_2.png) ![img](../images/status_2.png)