Example of an HTTP client: [example/http_c++](https://github.com/brpc/brpc/blob/master/example/http_c++/http_client.cpp)
# Create Channel
To access an HTTP service with `brpc::Channel`, `ChannelOptions.protocol` must be set to `PROTOCOL_HTTP`.
Once the HTTP protocol is set, the first parameter of `Channel::Init` can be any valid URL. *Note*: only the host and port of the URL are used here, which saves users from parsing the URL themselves. The other parts of the URL passed to `Channel::Init` are discarded.
HTTP has nothing to do with protobuf, so every parameter of `CallMethod` is NULL except `Controller` and `done`. `done` can be used to issue the RPC asynchronously.
`cntl.response_attachment()` is the response body, whose type is `butil::IOBuf`. Note that converting an `IOBuf` to `std::string` with `to_string()` allocates memory and copies all the content. If performance comes first, work with the `IOBuf` directly rather than with contiguous memory.
# POST
The default HTTP method is GET. You can set the method to POST if needed, in which case the POST data should be appended to `request_attachment()`, a [butil::IOBuf](https://github.com/brpc/brpc/blob/master/src/butil/iobuf.h) that accepts `std::string` or `char*`.
If you need to print a lot of data, we suggest using `butil::IOBufBuilder`, which has the same interface as `std::ostringstream`. Printing many objects with `butil::IOBufBuilder` is both simpler and more efficient.
A natural question: why pass the URL a second time (via `set_uri`) instead of reusing the URL given to `Channel::Init()`?
In simple cases this is indeed redundant, but in more complex scenarios the two URLs serve different purposes:
- Accessing multiple servers under a BNS node. In this case `Channel::Init` accepts the BNS node name, and the value of `set_uri()` is the whole URL including the host (such as `www.foo.com/index.html?name=value`). As a result, all servers under the BNS node will see `Host: www.foo.com`. `set_uri()` also accepts a URL with the path only, such as `/index.html?name=value`, in which case the RPC framework automatically fills the `Host` header with the target server's IP and port. For example, an HTTP server at 10.46.188.39:8989 will see `Host: 10.46.188.39:8989`.
- Accessing the target server via an HTTP proxy. In this case `Channel::Init` takes the address of the proxy server, while `set_uri()` takes the URL of the target server.
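The `Host` rule from the first case can be sketched in a few lines. This is only an illustration of the behavior described above, not brpc's code: `HostHeaderFor` is a hypothetical helper, and its handling of schemes and ports is an assumption.

```cpp
#include <string>

// Hypothetical helper, not part of brpc: returns the value a server would
// see in the `Host` header for a given request URI and target server address.
std::string HostHeaderFor(const std::string& uri,
                          const std::string& server_ip, int server_port) {
    std::string rest = uri;
    // Strip an optional scheme such as "http://". The scheme only counts if
    // "://" appears before the first '/', so query strings are left alone.
    const std::string::size_type scheme_end = rest.find("://");
    if (scheme_end != std::string::npos && scheme_end < rest.find('/')) {
        rest = rest.substr(scheme_end + 3);
    }
    // A URI that starts with '/' carries only a path: fall back to ip:port.
    if (rest.empty() || rest[0] == '/') {
        return server_ip + ":" + std::to_string(server_port);
    }
    // Otherwise the host is everything before the first '/', '?' or '#'.
    return rest.substr(0, rest.find_first_of("/?#"));
}
```

With the values used above, `HostHeaderFor("www.foo.com/index.html?name=value", "10.46.188.39", 8989)` yields `www.foo.com`, while a path-only URI yields `10.46.188.39:8989`.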
# Basic Usage
We use `http request` as the example (`http response` works the same way). Here are some basic operations:
Access an HTTP header named `Foo`
```c++
const std::string* value = cntl->http_request().GetHeader("Foo"); // NULL if the header does not exist
```
Set an HTTP header named `Foo`
```c++
cntl->http_request().SetHeader("Foo", "value");
```
Access a query named `Foo`
```c++
const std::string* value = cntl->http_request().uri().GetQuery("Foo"); // NULL if the query does not exist
```
- The field_name of a header is case-insensitive according to the [standard](http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2). The framework honors that while leaving the case unchanged.
- If multiple headers share the same field_name, then according to the [standard](http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2) their values are merged together, separated by commas (,). Users should decide how to use the merged value according to their own needs.
- Queries are separated by "&", while key and value are separated by "=". The value may be omitted. For example, `key1=value1&key2&key3=value3` is a valid query string, and the value for `key2` is an empty string.
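The query rules above can be made concrete with a small parser. This is a minimal sketch under stated assumptions: `ParseQuery` is a hypothetical helper written with the standard library only, it is not the parser behind `uri().GetQuery()`, and it does not handle percent-encoding.

```cpp
#include <map>
#include <string>

// Hypothetical helper, not brpc's URI class: splits a query string into
// key/value pairs. Pairs are separated by '&', key and value by '='.
std::map<std::string, std::string> ParseQuery(const std::string& query) {
    std::map<std::string, std::string> result;
    std::string::size_type begin = 0;
    while (begin <= query.size()) {
        std::string::size_type end = query.find('&', begin);
        if (end == std::string::npos) end = query.size();
        const std::string pair = query.substr(begin, end - begin);
        if (!pair.empty()) {
            const std::string::size_type eq = pair.find('=');
            if (eq == std::string::npos) {
                result[pair] = "";  // key without a value maps to an empty string
            } else {
                result[pair.substr(0, eq)] = pair.substr(eq + 1);
            }
        }
        begin = end + 1;
    }
    return result;
}
```

For `key1=value1&key2&key3=value3` the map holds three keys, and `key2` is present with an empty value rather than missing.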
# Debugging the HTTP Client
Turn on [-http_verbose](http://brpc.baidu.com:8765/flags/http_verbose) so that the framework prints each request and response to stderr. Note that this should only be used for testing and debugging, not in production.
# HTTP Error Handling
When the server returns a non-2xx HTTP status code, the HTTP request is considered failed and the corresponding ErrorCode is set:
- All errors are unified as `EHTTP`. If `cntl->ErrorCode()` is `EHTTP`, you can check `cntl->http_response().status_code()` to get the more specific HTTP error. Meanwhile, the HTTP body is placed inside `cntl->response_attachment()`, where you can look for an error body such as HTML or JSON.
# Compress Request Body
Call `Controller::set_request_compress_type(brpc::COMPRESS_TYPE_GZIP)` and then the framework will use gzip to compress HTTP body and set `Content-Encoding` to gzip.
# Decompress Response Body
For generality, brpc does not decompress the response body automatically. You can do it yourself, as the code is not complicated:
```c++
...
// Now cntl->response_attachment() contains the decompressed data
```
# Continuous Download
When downloading a large file, the client normally has to wait until the whole file has been loaded into memory before the RPC finishes. To limit memory growth and avoid holding RPC resources for a long time, brpc allows the client to end the RPC first and then continue reading the rest of the file. Note that this is not HTTP chunked mode, since brpc always supports parsing chunked bodies. It is a solution that lets users deal with extremely large bodies.
Basic usage:
1. Implement ProgressiveReader:
```c++
#include <brpc/progressive_reader.h>
...
class ProgressiveReader {
public:
    // Called when one part was read.
    // Error returned is treated as *permanent* and the socket where the
    // data was read will be closed.
    // A temporary error may be handled by blocking this function, which
    ...
```
`OnReadOnePart` is called each time a piece of data is read. `OnEndOfMessage` is called when the data ends or the connection breaks. Please read the comments carefully before implementing.
2. Call `cntl.response_will_be_read_progressively();` before the RPC so that brpc knows to end the RPC after reading the header part.
3. Call `cntl.ReadProgressiveAttachmentBy(new MyProgressiveReader);` after the RPC so that your own `MyProgressiveReader` object receives the data. You may delete this object inside `OnEndOfMessage`.
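The callback flow of the three steps can be illustrated without brpc. In the sketch below, `ReaderInterface`, `CountingReader` and `FeedParts` are all hypothetical stand-ins: the real interface reports errors through a status object, which is replaced here by plain `int`/`bool` so that the example compiles on its own.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the reader interface (assumption: simplified signatures).
class ReaderInterface {
public:
    virtual ~ReaderInterface() {}
    // Returns 0 on success; a non-zero value is treated as a permanent error.
    virtual int OnReadOnePart(const void* data, size_t length) = 0;
    // Called exactly once; ok is true if the body was fully consumed.
    virtual void OnEndOfMessage(bool ok) = 0;
};

// Counts bytes instead of buffering them, so memory use stays flat no
// matter how large the body is.
class CountingReader : public ReaderInterface {
public:
    CountingReader(size_t* total, bool* finished_ok)
        : total_(total), finished_ok_(finished_ok) {}
    int OnReadOnePart(const void* /*data*/, size_t length) override {
        *total_ += length;
        return 0;
    }
    void OnEndOfMessage(bool ok) override {
        *finished_ok_ = ok;
        delete this;  // the reader releases itself once the message ends
    }
private:
    size_t* total_;
    bool* finished_ok_;
};

// Hypothetical driver playing the role of the framework: feeds the body
// part by part, then signals the end of the message.
void FeedParts(ReaderInterface* reader, const std::vector<std::string>& parts) {
    for (const std::string& part : parts) {
        if (reader->OnReadOnePart(part.data(), part.size()) != 0) {
            reader->OnEndOfMessage(false);
            return;
        }
    }
    reader->OnEndOfMessage(true);
}
```

Because the reader deletes itself in `OnEndOfMessage`, the caller must not touch it after handing it over, which mirrors the ownership hint in step 3.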
# Continuous Upload
Currently the POST data must be sent as a whole, so large POST bodies are not supported.
# Access Server with Authentication
Generate `auth_data` according to the server's authentication method and then set it in the `Authorization` header. This is the same as adding the option `-H "Authorization: <auth_data>"` when using curl.
brpc uses [butil::IOBuf](https://github.com/brpc/brpc/blob/master/src/butil/iobuf.h) as the data structure for attachment storage and HTTP bodies. It is a non-contiguous, zero-copy buffer that has proven to perform excellently in other projects. The interface of `IOBuf` is similar to that of `std::string`, but not the same.
If you have used `BufHandle` in Kylin before, you will notice how much more convenient `IOBuf` is: the former had hardly any encapsulation and exposed its internal structure directly to the user. The user had to handle the reference count carefully, which was very error-prone and led to lots of bugs.
# What IOBuf can do:
- Default construction doesn't involve copying.
- Explicit copying doesn't change the source IOBuf. Only the management structure of the IOBuf is copied, not the data.
- Append another IOBuf without copying.
- Append a string, which involves copying.
- Read from and write into a file descriptor.
- Convert to protobuf and vice versa.
- `IOBufBuilder` lets you use an IOBuf as a `std::ostream`.
# What IOBuf can't do:
- Serve as a general-purpose storage structure. An IOBuf should not be kept alive for long, otherwise a single IOBuf object may pin multiple memory blocks (8K each).
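The copy and no-copy distinctions in the two lists can be modelled with a toy buffer built on reference-counted blocks. `ToyBuf` below is an assumption made for illustration only. It is not `butil::IOBuf` and ignores block sizes, partial blocks and thread safety.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy model of a non-contiguous buffer (NOT butil::IOBuf): a list of
// references to immutable, reference-counted blocks.
class ToyBuf {
public:
    // Appending a string copies its bytes into a freshly allocated block.
    void append(const std::string& s) {
        if (!s.empty()) blocks_.push_back(std::make_shared<const std::string>(s));
    }
    // Appending another buffer copies block references only, never the bytes.
    void append(const ToyBuf& other) {
        const std::vector<Block> refs = other.blocks_;  // safe for self-append
        blocks_.insert(blocks_.end(), refs.begin(), refs.end());
    }
    size_t size() const {
        size_t n = 0;
        for (const Block& b : blocks_) n += b->size();
        return n;
    }
    // Flattening into contiguous memory allocates and copies every byte.
    std::string to_string() const {
        std::string out;
        out.reserve(size());
        for (const Block& b : blocks_) out += *b;
        return out;
    }
    // Exposed only so that block sharing can be observed.
    size_t block_count() const { return blocks_.size(); }
    const char* block_data(size_t i) const { return blocks_[i]->data(); }
private:
    typedef std::shared_ptr<const std::string> Block;
    std::vector<Block> blocks_;
};
```

In this model a block is freed only when the last buffer referencing it goes away, which is why a long-lived buffer can keep otherwise unused blocks pinned.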
# Slice
Slice 16 bytes from IOBuf:
```c++
source_buf.cutn(&heading_iobuf, 16); // cuts all bytes of source_buf when its length < 16
```
Remove 16 bytes:
```c++
source_buf.pop_front(16); // Empty source_buf when its length < 16
```
# Concatenate
Append to another IOBuf:
```c++
buf.append(another_buf);  // no data copy
```
Append std::string
```c++
buf.append(str);  // copy data of str into buf
```
# Parse
Parse protobuf from IOBuf
```c++
IOBufAsZeroCopyInputStream wrapper(&iobuf);
pb_message.ParseFromZeroCopyStream(&wrapper);
```
Parse IOBuf as user-defined structure
```c++
IOBufAsZeroCopyInputStream wrapper(&iobuf);
CodedInputStream coded_stream(&wrapper);
coded_stream.ReadLittleEndian32(&value);
...
```
# Serialize
Serialize protobuf into IOBuf
```c++
IOBufAsZeroCopyOutputStream wrapper(&iobuf);
pb_message.SerializeToZeroCopyStream(&wrapper);
```
Append printable data into IOBuf
```c++
IOBufBuilder os;
os << "anything can be sent to std::ostream";
os.buf();  // IOBuf
```
# Print
```c++
std::cout << iobuf;
std::string str = iobuf.to_string();
```
# Performance
IOBuf has excellent performance in general aspects:
```c++
LOG_IF(NOTICE, n > 10) << "This log will only be printed when n > 10";
PLOG(FATAL) << "Fail to call function setting errno";
VLOG(1) << "verbose log tier 1";
CHECK_GT(1, 2) << "1 can't be greater than 2";
LOG_EVERY_SECOND(INFO) << "High-frequent logs";
LOG_EVERY_N(ERROR, 10) << "High-frequent logs";
LOG_FIRST_N(INFO, 20) << "Logs that prints for at most 20 times";
LOG_ONCE(WARNING) << "Logs that only prints once";
```
# DESCRIPTION
Streaming log is the best choice for printing complex objects or template objects. With `printf` and `%s`, the user must first convert all the fields of an object to a string, and most objects are complicated. This is inconvenient (numbers cannot simply be appended) and needs lots of temporary memory (for the intermediate strings). The solution in C++ is to send the log as a stream to a `std::ostream` object. For example, in order to print object A, we need to implement the following interface:
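A minimal sketch of such an interface is given here. The type `A`, its fields and the `ToString` helper are hypothetical and exist only for illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical type used only for illustration.
struct A {
    int id;
    std::string name;
};

// Prints `a` to `os` and returns `os` so that calls can be chained.
std::ostream& operator<<(std::ostream& os, const A& a) {
    return os << "A{id=" << a.id << ", name=" << a.name << "}";
}

// Illustration helper: renders any streamable object to a string.
template <typename T>
std::string ToString(const T& value) {
    std::ostringstream oss;
    oss << value;
    return oss.str();
}
```

Note that the number and the string are streamed as they are, with no format specifiers and no intermediate conversion written by the user.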
The signature of the function means: print object `a` to `os` and then return `os`. Returning `os` lets the binary operator `<<`, which is left-associative, be chained. As a result, `os << a << b << c;` means `operator<<(operator<<(operator<<(os, a), b), c);`. `operator<<` therefore has to return a reference for this process to work, which is also called chaining. In languages that don't support operator overloading, you will see a more tedious form, such as `os.print(a).print(b).print(c)`.
You should also use chaining in your own implementation of `operator<<`. In fact, printing a complex object is like a depth-first traversal of a tree: call `operator<<` on each child node, then each child node invokes it on its own children, and so forth. For example, object A has two member variables: B and C. Printing A becomes the process of putting B and C into the ostream:
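The traversal can be sketched with three hypothetical types. The field names and the output format below are assumptions chosen for the example.

```cpp
#include <ostream>
#include <sstream>

// Hypothetical types: A is made of a B and a C.
struct B { int x; };
struct C { double y; };
struct A { B b; C c; };

// Leaf nodes print their own fields.
std::ostream& operator<<(std::ostream& os, const B& b) {
    return os << "B{x=" << b.x << "}";
}
std::ostream& operator<<(std::ostream& os, const C& c) {
    return os << "C{y=" << c.y << "}";
}

// Printing A forwards to the operators of its members, one level down the
// tree, and keeps the chain going by returning os.
std::ostream& operator<<(std::ostream& os, const A& a) {
    return os << "A{b=" << a.b << ", c=" << a.c << "}";
}
```

Each operator only knows about its direct members, so adding a field to `B` changes the output of `A` without touching `A`'s operator.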
This way we don't need to allocate temporary memory since objects are directly passed into the ostream object. Of course, the memory management of ostream itself is another topic.
Now the whole printing process is connected through ostream. The most common ostream objects are `std::cout` and `std::cerr`, so objects that implement the above function can be sent directly to `std::cout` and `std::cerr`. In other words, if a log stream also inherits from ostream, these objects can be written into the log as well. Streaming log is such a log stream: it inherits from `std::ostream` so that objects can be sent into the log. In the current implementation, logs are recorded in a thread-local buffer, which is flushed to the screen or to `logging::LogSink` after a complete log record. Of course, the implementation is thread-safe.
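The idea of a log stream that is itself an ostream can be sketched as follows. `ToyLogStream` is a hypothetical class, not the actual implementation: it buffers one record per object instead of using a thread-local buffer, and its sink is a plain vector of strings.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Holds the buffer in a base class so that it is constructed before the
// std::ostream base that points at it.
struct BufferHolder {
    std::stringbuf buffer;
};

// Toy log stream: collects one record and hands it to the sink when the
// record is complete, which here means when the stream is destroyed.
class ToyLogStream : private BufferHolder, public std::ostream {
public:
    explicit ToyLogStream(std::vector<std::string>* sink)
        : std::ostream(&buffer), sink_(sink) {}
    ~ToyLogStream() { sink_->push_back(buffer.str()); }
private:
    std::vector<std::string>* sink_;
};
```

Since `ToyLogStream` is a `std::ostream`, any type with an `operator<<` for `std::ostream` can be written to it unchanged, and nothing reaches the sink until the record is complete.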
## LOG
If you have ever used glog before, you should find it easy to get started. The log macros are the same as in glog. For example, to print a FATAL log (Note that there is no `std::endl`):
| streaming log | glog | Use case |
| --- | --- | --- |
| FATAL | FATAL (coredump) | Fatal error. Since most fatal logs inside baidu are not actually fatal, FATAL does not trigger a coredump directly as in glog, unless you turn on [-crash_on_fatal_log](http://brpc.baidu.com:8765/flags/crash_on_fatal_log) |
| ERROR | ERROR | Non-fatal error. |
| WARNING | WARNING | Unusual branches. |
| NOTICE | - | Generally you should not use NOTICE, as it is intended for important business logs. Make sure to check with other developers before using it. glog doesn't have NOTICE. |
| INFO, TRACE | INFO | Important side effects, such as opening or closing resources. |
| VLOG(n) | INFO | Detailed logs that support multiple layers. |
| DEBUG | INFOVLOG(1) (NDEBUG) | Just for compatibility. Prints logs only when `NDEBUG` is not defined. See DLOG/DPLOG/DVLOG for reference. |
## PLOG
PLOG differs from LOG in that it appends the error information to the end of the log, much like `%m` in `printf`. In a POSIX environment, the error code is `errno`.
```c++
int fd = open("foo.conf", O_RDONLY);   // foo.conf does not exist, errno was set to ENOENT
if (fd < 0) {
    PLOG(FATAL) << "Fail to open foo.conf";    // "Fail to open foo.conf: No such file or directory"
    return -1;
}
```
## noflush
If you don't want to flush the log at once, append `noflush`. It's commonly used inside a loop:
The first two LOG(TRACE) statements do not flush the log to the screen. Their content is recorded in the thread-local buffer. The third LOG(TRACE) flushes all of it to the screen. If there are 3 elements inside items and we don't append `noflush`, the result would be:
```
TRACE: ... Items:
TRACE: ... item1
TRACE: ... item2
TRACE: ... item3
```
After we add `noflush`:
```
TRACE: ... Items: item1 item2 item3
```
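The buffering behind the two outputs above can be modelled with a toy record class. Everything in this sketch (`ToyRecord`, `noflush_demo`, `LogItems`) is hypothetical and only imitates the described behavior: content accumulates in a thread-local buffer and is emitted at the end of a statement unless the no-flush marker was streamed.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct NoFlushTag {};
const NoFlushTag noflush_demo = {};  // hypothetical stand-in for noflush

class ToyRecord {
public:
    explicit ToyRecord(std::vector<std::string>* out) : out_(out), flush_(true) {}
    // End of the logging statement: emit the buffer unless flushing was
    // suppressed by the marker.
    ~ToyRecord() {
        if (flush_) {
            out_->push_back(pending());
            pending().clear();
        }
    }
    template <typename T>
    ToyRecord& operator<<(const T& value) {
        std::ostringstream oss;
        oss << value;
        pending() += oss.str();
        return *this;
    }
    ToyRecord& operator<<(const NoFlushTag&) {
        flush_ = false;
        return *this;
    }
private:
    // One pending buffer per thread, shared by consecutive statements.
    static std::string& pending() {
        thread_local std::string buffer;
        return buffer;
    }
    std::vector<std::string>* out_;
    bool flush_;
};

// Mirrors the loop pattern: a header, one statement per item, a final flush.
std::vector<std::string> LogItems(const std::vector<std::string>& items) {
    std::vector<std::string> out;
    ToyRecord(&out) << "Items:" << noflush_demo;
    for (const std::string& item : items) {
        ToyRecord(&out) << ' ' << item << noflush_demo;
    }
    ToyRecord(&out) << "";  // no marker: everything buffered so far is emitted
    return out;
}
```

With three items the result is a single record, `Items: item1 item2 item3`, matching the second output above.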
The `noflush` feature also supports bthread, so we can push many logs from the server's bthreads without printing them (using `noflush`) and flush the whole log at the end of the RPC. Note that you should not use `noflush` when implementing an asynchronous method, since the underlying bthread changes there, which makes `noflush` ineffective.
## LOG_IF
`LOG_IF(log_level, condition)` prints only when condition is true. It's the same as `if (condition) { LOG() << ...; }` with shorter code:
```c++
LOG_IF(NOTICE, n > 10) << "This log will only be printed when n > 10";
```
## XXX_EVERY_SECOND
XXX stands for LOG, LOG_IF, PLOG, SYSLOG, VLOG, DLOG, and so on. These logging macros print at most once per second. You can use them to check the running status inside hot paths. The first call to the macro prints the log immediately, and each call costs about 30ns more than a normal LOG (caused by gettimeofday).
```c++
LOG_EVERY_SECOND(INFO) << "High-frequent logs";
```
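The once-per-second rule can be sketched as a small per-call-site state machine. `EverySecond` is a hypothetical class: it uses `std::chrono::steady_clock` rather than gettimeofday, takes the current time as a parameter so that it can be tested, and is not thread-safe.

```cpp
#include <chrono>

// Toy rate limiter, one instance per call site (assumption: single thread).
class EverySecond {
public:
    typedef std::chrono::steady_clock Clock;
    // Returns true if a log should be printed at time `now`.
    bool ShouldLog(Clock::time_point now) {
        // The first call always prints; later calls print only if at least
        // one second has passed since the last printed log.
        if (!printed_ || now - last_ >= std::chrono::seconds(1)) {
            printed_ = true;
            last_ = now;
            return true;
        }
        return false;
    }
private:
    bool printed_ = false;
    Clock::time_point last_{};
};
```

A call site would typically pass `EverySecond::Clock::now()`. Reading the clock is the extra cost on every call, whether or not the log is printed.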
## XXX_EVERY_N
XXX stands for LOG, LOG_IF, PLOG, SYSLOG, VLOG, DLOG, and so on. These logging macros print once every N calls. You can use them to check the running status inside hot paths. The first call to the macro prints the log immediately, and each call costs one additional atomic operation (relaxed order) compared to a normal LOG. These macros are thread-safe: counting remains accurate across multiple threads, which is not the case in glog.
```c++
LOG_EVERY_N(ERROR, 10) << "High-frequent logs";
```
## XXX_FIRST_N
XXX stands for LOG, LOG_IF, PLOG, SYSLOG, VLOG, DLOG, and so on. These logging macros print at most N times. Before the N-th print, each call costs one additional atomic operation (relaxed order) compared to a normal LOG. After that the extra cost is zero.
```c++
LOG_FIRST_N(ERROR, 20) << "Logs that prints for at most 20 times";
```
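The counting behind XXX_EVERY_N and XXX_FIRST_N can be sketched with one relaxed atomic counter per call site. `EveryN` and `FirstN` are hypothetical classes, not the macros' actual expansion. In particular, the cheap path of `FirstN` is modelled here as a relaxed load, which is an assumption.

```cpp
#include <atomic>

// Toy counterpart of XXX_EVERY_N: true on calls 1, N+1, 2N+1, ...
// (n must be greater than 0).
class EveryN {
public:
    explicit EveryN(unsigned n) : n_(n), count_(0) {}
    bool ShouldLog() {
        // fetch_add returns the value before the increment. Relaxed ordering
        // is enough because no other data is synchronized through the counter.
        return count_.fetch_add(1, std::memory_order_relaxed) % n_ == 0;
    }
private:
    const unsigned n_;
    std::atomic<unsigned> count_;
};

// Toy counterpart of XXX_FIRST_N: true for the first N calls only.
class FirstN {
public:
    explicit FirstN(unsigned n) : n_(n), count_(0) {}
    bool ShouldLog() {
        // Once the budget is used up, only a read is needed, no write.
        if (count_.load(std::memory_order_relaxed) >= n_) return false;
        return count_.fetch_add(1, std::memory_order_relaxed) < n_;
    }
private:
    const unsigned n_;
    std::atomic<unsigned> count_;
};
```

Because each call takes a distinct counter value from `fetch_add`, the number of printed logs stays exact even when several threads hit the same call site.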
## XXX_ONCE
XXX stands for LOG, LOG_IF, PLOG, SYSLOG, VLOG, DLOG, and so on. These logging macros print at most once. It's the same as `XXX_FIRST_N(..., 1)`.
```c++
LOG_ONCE(ERROR) << "Logs that only prints once";
```
## VLOG
VLOG(verbose_level) is a detailed log that supports multiple layers. It uses two gflags, *--verbose* and *--verbose_module*, to control which layers are printed (note that glog uses *--v* and *--vmodule*). A log is printed only when `--verbose` >= `verbose_level`:
```c++
VLOG(0) << "verbose log tier 0";
VLOG(1) << "verbose log tier 1";
VLOG(2) << "verbose log tier 2";
```
When `--verbose=1`, the first two logs are printed while the last is not. A module is a file name or file path without the extension, and a value in `--verbose_module` overrides `--verbose`. For example:
```bash
--verbose=1 --verbose_module="channel=2,server=3"    # VLOG level <= 2 in module channel, <= 3 in module server, <= 1 elsewhere
```
You can set `--verbose` and `--verbose_module` through `google::SetCommandLineOption` dynamically.
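How the two flags combine can be sketched with a small resolver. This is a hypothetical model, not the actual matching code: it assumes that a module pattern matches a file when the file path without its extension ends with the pattern at a path-component boundary.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical resolver: returns the verbose level that applies to `file`.
int EffectiveVerbose(const std::string& file, int verbose,
                     const std::string& verbose_module) {
    // Drop the extension: "src/brpc/channel.cpp" -> "src/brpc/channel".
    std::string stem = file;
    const std::string::size_type slash = stem.rfind('/');
    const std::string::size_type dot = stem.rfind('.');
    if (dot != std::string::npos && (slash == std::string::npos || dot > slash)) {
        stem.erase(dot);
    }
    // Walk the comma-separated "module=level" entries.
    std::string::size_type begin = 0;
    while (begin < verbose_module.size()) {
        std::string::size_type end = verbose_module.find(',', begin);
        if (end == std::string::npos) end = verbose_module.size();
        const std::string entry = verbose_module.substr(begin, end - begin);
        const std::string::size_type eq = entry.find('=');
        if (eq != std::string::npos) {
            const std::string pattern = entry.substr(0, eq);
            const bool fits = !pattern.empty() && stem.size() >= pattern.size() &&
                stem.compare(stem.size() - pattern.size(), pattern.size(), pattern) == 0;
            const bool at_boundary = fits && (stem.size() == pattern.size() ||
                stem[stem.size() - pattern.size() - 1] == '/');
            if (at_boundary) {
                return std::atoi(entry.c_str() + eq + 1);  // module level wins
            }
        }
        begin = end + 1;
    }
    return verbose;  // no module matched: fall back to --verbose
}

// A VLOG(level) statement in `file` prints when level <= the effective level.
bool VlogEnabled(int level, const std::string& file, int verbose,
                 const std::string& verbose_module) {
    return level <= EffectiveVerbose(file, verbose, verbose_module);
}
```

With `--verbose=1 --verbose_module="channel=2,server=3"`, this model enables `VLOG(2)` in `channel.cpp` but not in a file that matches no module.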
VLOG has another form, VLOG2, which allows the user to specify a virtual path:
```c++
// public/foo/bar.cpp
VLOG2("a/b/c", 2) << "being filtered by a/b/c rather than public/foo/bar";
```
> VLOG and VLOG2 also have corresponding VLOG_IF and VLOG2_IF.
## DLOG
All log macros have debug versions, starting with D, such as DLOG, DVLOG. When NDEBUG is defined, these logs will not be printed.
**Do not put important side effects inside the log streams beginning with D.**
*Not printed* means that even the parameters are not evaluated. If the parameters have side effects, those side effects will not happen when NDEBUG is defined. For example, in `DLOG(FATAL) << foo();`, `foo()` may modify a dictionary or do something else essential, but it will not be called at all when NDEBUG is defined.
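The non-evaluation can be demonstrated with a toy macro. `TOY_DLOG` is hypothetical and takes a runtime flag instead of depending on NDEBUG, so that both cases can be observed in one build.

```cpp
#include <iostream>

// Toy macro (not the real DLOG): when `enabled` is false, control never
// reaches the stream expression, so its operands are not evaluated.
#define TOY_DLOG(enabled) \
    if (!(enabled)) {} else std::cerr

int g_calls = 0;
int foo() { return ++g_calls; }  // a function with a side effect

// Returns how many times foo() ran while executing one logging statement.
int CallsAfterLogging(bool debug_enabled) {
    g_calls = 0;
    TOY_DLOG(debug_enabled) << "foo() returned " << foo() << '\n';
    return g_calls;
}
```

`CallsAfterLogging(false)` returns 0: the call to `foo()` is skipped entirely, which is exactly why essential work must not live inside such a statement.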
## CHECK
Another important variation of logging is `CHECK(expression)`. When the expression evaluates to false, a fatal log is printed. It is similar to `ASSERT` in gtest and has other forms such as CHECK_EQ, CHECK_GT, and so on. When a check fails, the message streamed after it is printed.
```c++
CHECK_LT(1, 2) << "This is definitely true, this log will never be seen";
CHECK_GT(1, 2) << "1 can't be greater than 2";
```
Running the above code, you should see a fatal log and the call stack:
```
FATAL: ... Check failed: 1 > 2 (1 vs 2). 1 can't be greater than 2
```
You **should** use `CHECK_XX` for arithmetic conditions so that you get more detailed information when a check fails.
```c++
int x = 1;
int y = 2;
CHECK_GT(x, y);  // Check failed: x > y (1 vs 2).
CHECK(x > y);    // Check failed: x > y.
```
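Why the two-argument form can report more is easy to see in a toy version. `TOY_CHECK_GT` and `TOY_CHECK` are hypothetical macros that return the failure message as a string instead of logging it. The two-argument form receives both operands separately and can print their values, whereas the one-argument form only sees a boolean.

```cpp
#include <sstream>
#include <string>

// Builds the failure message, or returns an empty string when lhs > rhs.
template <typename L, typename R>
std::string CheckGtMessage(const char* expr, const L& lhs, const R& rhs) {
    if (lhs > rhs) return "";
    std::ostringstream oss;
    oss << "Check failed: " << expr << " (" << lhs << " vs " << rhs << "). ";
    return oss.str();
}

// Both operands are passed separately, so their values can be reported.
#define TOY_CHECK_GT(a, b) CheckGtMessage(#a " > " #b, (a), (b))

// Only the truth value survives here, the operand values are lost.
#define TOY_CHECK(cond) \
    ((cond) ? std::string() : std::string("Check failed: " #cond ". "))
```

For `x = 1` and `y = 2`, the first macro produces a message containing `(1 vs 2)`, while the second can only repeat the expression text.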
Like DLOG, you should NOT include important side effects inside DCHECK.
## LogSink
The default destination of streaming log is the screen. You can change it through `logging::SetLogSink`. Users can inherit from LogSink and implement their own output logic. We provide an internal LogSink as an example:
### StringSink
StringSink inherits from both LogSink and string. It stores the log content inside the string and is mainly intended for unit tests. The following case shows a classic usage of StringSink:
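The pattern can be sketched without the logging library. All names below (`ToyLogSink`, `ToyStringSink`, `SetToySink`, `ToyLog`) are hypothetical stand-ins, and the sink interface is reduced to a single string argument for brevity.

```cpp
#include <string>

// Stand-in for the sink interface (assumption: simplified signature).
class ToyLogSink {
public:
    virtual ~ToyLogSink() {}
    virtual bool OnLogMessage(const std::string& content) = 0;
};

// Both a sink and a string: every message is appended to the string part,
// so a test can search it with the usual std::string methods.
class ToyStringSink : public ToyLogSink, public std::string {
public:
    bool OnLogMessage(const std::string& content) override {
        append(content);
        append("\n");
        return true;
    }
};

// Hypothetical global sink. The setter returns the previous sink so that a
// test can restore it afterwards.
ToyLogSink* g_sink = nullptr;
ToyLogSink* SetToySink(ToyLogSink* sink) {
    ToyLogSink* old = g_sink;
    g_sink = sink;
    return old;
}
void ToyLog(const std::string& content) {
    if (g_sink != nullptr) g_sink->OnLogMessage(content);
}
```

A test installs the sink, runs the code under test, restores the old sink, and then asserts on the captured text with `find`.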