Commit 8199994e authored by Ge Jun

Polish bvar.md

parent 9d0717f9
...@@ -10,23 +10,7 @@ ...@@ -10,23 +10,7 @@
![img](../images/bvar_perf.png) ![img](../images/bvar_perf.png)
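# Monitoring bvar # Adding new bvar
The following figure shows how bvar is monitored:
![img](../images/bvar_flow.png)
Where:
- APP stands for the user's service, which uses the bvar API to define and monitor various metrics.
- bvar periodically writes the monitored items into files under the $PWD/monitor/ directory (denoted by "log"). Unlike an ordinary log, the bvar export overwrites the file rather than appending to it.
- The monitoring system (denoted by noah) collects the exported files, aggregates them globally, and plots curves.
APPs are recommended to meet the following monitoring requirements:
- **Error**: the number of errors that may occur in the system
- **Latency**: the latency (average and percentiles) of each RPC interface the system exposes, and the latency of each RPC interface of each back-end the system depends on
- **QPS**: the QPS of each RPC interface the system exposes, and the QPS of each RPC interface of each back-end the system depends on
To add bvar in C++, see [Quick introduction](bvar_c++.md#quick-introduction). By default bvar collects some process-level and system-level variables, prefixed with process\_, system\_, and so on, for example: To add bvar in C++, see [Quick introduction](bvar_c++.md#quick-introduction). By default bvar collects some process-level and system-level variables, prefixed with process\_, system\_, and so on, for example: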
...@@ -70,7 +54,8 @@ iobuf_block_memory : 729088 ...@@ -70,7 +54,8 @@ iobuf_block_memory : 729088
iobuf_newbigview_second : 10 iobuf_newbigview_second : 10
``` ```
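Turn on bvar's [dump feature](bvar_c++.md#export-all-variables) to export all bvars to files, in the same format as above: each line is a "name:value" pair. After enabling dump, check whether there is data under monitor/, for example: # Monitoring bvar
Turn on bvar's [dump feature](bvar_c++.md#export-all-variables) to export all bvars to files, in the same format as above: each line is a "name:value" pair. After enabling dump, check whether there is data under the monitor/ directory, for example: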
``` ```
$ ls monitor/ $ ls monitor/
...@@ -83,13 +68,13 @@ process_time_system : 0.380942 ...@@ -83,13 +68,13 @@ process_time_system : 0.380942
process_time_user : 0.741887 process_time_user : 0.741887
process_username : "gejun" process_username : "gejun"
``` ```
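Each export overwrites the previous file, unlike an ordinary log, which appends new data at the end.
The monitoring system periodically aggregates the data exported by individual machines and serves on-demand queries. Taking noah inside Baidu as an example, variables defined by bvar appear among noah's metrics, which users can select to view historical curves. The monitoring system should periodically collect the data exported by each machine and aggregate it. Taking noah inside Baidu as an example, variables defined by bvar appear among noah's metrics, which users can select to view historical curves.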
![img](../images/bvar_noah2.png) ![img](../images/bvar_noah2.png)
![img](../images/bvar_noah3.png) ![img](../images/bvar_noah3.png)
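# Exporting bvar to other monitoring system formats # Export to Prometheus
The other monitoring system format bvar currently supports is [Prometheus](https://prometheus.io). Set the path of Prometheus's scraping url to `/brpc_metrics`; for example, if the brpc server runs on port 8080 of the local machine, configure the scraping url as `127.0.0.1:8080/brpc_metrics`. Set the path of the [Prometheus](https://prometheus.io) scraping url to `/brpc_metrics`; for example, if the brpc server runs on port 8080 of the local machine, configure the scraping url as `127.0.0.1:8080/brpc_metrics`.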
...@@ -10,23 +10,7 @@ Following graph compares overhead of bvar, atomics, static UbMonitor, dynamic Ub ...@@ -10,23 +10,7 @@ Following graph compares overhead of bvar, atomics, static UbMonitor, dynamic Ub
![img](../images/bvar_perf.png) ![img](../images/bvar_perf.png)
# Monitor bvar # Adding new bvar
Following graph demonstrates how bvars in applications are monitored.
![img](../images/bvar_flow.png)
In which:
- APP means user's application which uses bvar API to record all sorts of metrics.
- bvar periodically prints exposed bvars into a file (represented by "log") under directory $PWD/monitor/ . The "log" is different from an ordinary log that it's overwritten by newer content rather than concatenated.
- The monitoring system collects dumped files (represented by noah), aggregates the data inside and plots curves.
The APP is recommended to meet following requirements:
- **Error**: Number of every kind of error that may occur.
- **Latency**: latencies(average/percentile) of each public RPC interface, latencies of each RPC to back-end servers.
- **QPS**: QPS of each public RPC interface, QPS of each RPC to back-end servers.
Read [Quick introduction](bvar_c++.md#quick-introduction) to know how to add bvar in C++. bvar already provides stats of many process-level and system-level variables by default, which are prefixed with `process_` and `system_`, such as: Read [Quick introduction](bvar_c++.md#quick-introduction) to know how to add bvar in C++. bvar already provides stats of many process-level and system-level variables by default, which are prefixed with `process_` and `system_`, such as:
...@@ -69,7 +53,9 @@ iobuf_block_count_hit_tls_threshold : 0 ...@@ -69,7 +53,9 @@ iobuf_block_count_hit_tls_threshold : 0
iobuf_block_memory : 729088 iobuf_block_memory : 729088
iobuf_newbigview_second : 10 iobuf_newbigview_second : 10
``` ```
New exported files overwrite previous files, which is different from regular logs which append new data.
# Monitoring bvar
Turn on [dump feature](bvar_c++.md#export-all-variables) of bvar to export all exposed bvars to files, which are formatted just like above texts: each line is a pair of "name" and "value". Check if there're data under $PWD/monitor/ after enabling dump, for example: Turn on [dump feature](bvar_c++.md#export-all-variables) of bvar to export all exposed bvars to files, which are formatted just like above texts: each line is a pair of "name" and "value". Check if there're data under $PWD/monitor/ after enabling dump, for example:
``` ```
...@@ -84,12 +70,12 @@ process_time_user : 0.741887 ...@@ -84,12 +70,12 @@ process_time_user : 0.741887
process_username : "gejun" process_username : "gejun"
``` ```
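As an illustration of the "name : value" format shown above, here is a minimal Python sketch of how a collector might read one dumped file into a dictionary. It is not part of bvar or brpc: the function name `parse_bvar_dump` and the parsing rules (split on the first " : ", strip surrounding double quotes, try integer then float) are assumptions based only on the sample lines in this document.

```python
def parse_bvar_dump(text):
    """Parse dumped bvar text ('name : value' per line) into a dict.

    Hypothetical helper for illustration; the format rules are inferred
    from the sample output above, not from a bvar specification.
    """
    result = {}
    for line in text.splitlines():
        # Split on the first " : " only, so values may contain colons.
        name, sep, value = line.partition(" : ")
        if not sep:
            continue  # not a name/value pair, skip it
        name = name.strip()
        value = value.strip()
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            result[name] = value[1:-1]  # quoted string value
            continue
        # Try integer first, then float; fall back to the raw string.
        for convert in (int, float):
            try:
                result[name] = convert(value)
                break
            except ValueError:
                pass
        else:
            result[name] = value
    return result


sample = (
    "process_time_system : 0.380942\n"
    "process_time_user : 0.741887\n"
    'process_username : "gejun"\n'
)
parsed = parse_bvar_dump(sample)
```

Because each dump overwrites the previous file, a collector built on such a helper would re-read the whole file on every collection cycle rather than tailing it.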
The monitoring system should combine data on every single machine periodically and provide on-demand queries. Take the "noah" system inside Baidu as an example, variables defined by bvar appear as metrics in noah, which can be checked by users to view historical curves. The monitoring system should combine data on every single machine periodically and merge them together to provide on-demand queries. Take the "noah" system inside Baidu as an example, variables defined by bvar appear as metrics in noah, which can be checked by users to view historical curves.
![img](../images/bvar_noah2.png) ![img](../images/bvar_noah2.png)
![img](../images/bvar_noah3.png) ![img](../images/bvar_noah3.png)
# Dump to the format of other monitoring system # Export to Prometheus
Currently monitoring system supported by bvar is [Prometheus](https://prometheus.io). All you need to do is to set the path in scraping target url to `/brpc_metrics`. For example, if brpc server is running on localhost:8080, the scraping target should be `127.0.0.1:8080/brpc_metrics`. To export to [Prometheus](https://prometheus.io), set the path in scraping target url to `/brpc_metrics`. For example, if brpc server is running on localhost:8080, the scraping target should be `127.0.0.1:8080/brpc_metrics`.