http_service.md 21.5 KB
Newer Older
gejun's avatar
gejun committed
1
[中文版](../cn/http_service.md)
2

gejun's avatar
gejun committed
3
This document talks about ordinary HTTP services rather than protobuf services accessible via HTTP. HTTP services in brpc have to declare interfaces with empty request and response in a .proto file. This requirement keeps all service declarations inside proto files rather than scattering in code, configurations, and proto files. Check [http_server.cpp](https://github.com/brpc/brpc/blob/master/example/http_c++/http_server.cpp) for an example.
4

gejun's avatar
gejun committed
5
# URL types
6

gejun's avatar
gejun committed
7
## /ServiceName/MethodName as the prefix
8

gejun's avatar
gejun committed
9 10 11 12 13 14 15 16 17 18
Define a service named `ServiceName`(not including the package name), with a method named `MethodName` and empty request/response, the service will provide http service on `/ServiceName/MethodName` by default.

The reason that request and response can be empty is that the HTTP data is in Controller:

- Header of the http request is in Controller.http_request() and the body is in Controller.request_attachment().
- Header of the http response is in Controller.http_response() and the body is in Controller.response_attachment().

Implementation steps:

1. Add the service declaration in a proto file.
19 20 21 22 23 24 25 26 27 28 29 30

```protobuf
option cc_generic_services = true;
 
message HttpRequest { };
message HttpResponse { };
 
service HttpService {
      rpc Echo(HttpRequest) returns (HttpResponse);
};
```

gejun's avatar
gejun committed
31
2. Implement the service by inheriting the base class generated in .pb.h, which is same as protobuf services.
32 33 34 35 36 37 38 39 40 41 42 43

```c++
class HttpServiceImpl : public HttpService {
public:
    ...
    virtual void Echo(google::protobuf::RpcController* cntl_base,
                      const HttpRequest* /*request*/,
                      HttpResponse* /*response*/,
                      google::protobuf::Closure* done) {
        brpc::ClosureGuard done_guard(done);
        brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
 
gejun's avatar
gejun committed
44
        // body is plain text
45 46
        cntl->http_response().set_content_type("text/plain");
       
gejun's avatar
gejun committed
47
        // Use printed query string and body as the response.
48 49 50 51 52 53 54 55 56 57 58 59
        butil::IOBufBuilder os;
        os << "queries:";
        for (brpc::URI::QueryIterator it = cntl->http_request().uri().QueryBegin();
                it != cntl->http_request().uri().QueryEnd(); ++it) {
            os << ' ' << it->first << '=' << it->second;
        }
        os << "\nbody: " << cntl->request_attachment() << '\n';
        os.move_to(cntl->response_attachment());
    }
};
```

gejun's avatar
gejun committed
60
3. After adding the implemented instance into the server, the service is accessible via following URLs (Note that the path after `/HttpService/Echo ` is filled into `cntl->http_request().unresolved_path()`, which is always normalized):
61 62 63 64 65 66 67 68 69

| URL                        | Protobuf Method  | cntl->http_request().uri().path() | cntl->http_request().unresolved_path() |
| -------------------------- | ---------------- | --------------------------------- | -------------------------------------- |
| /HttpService/Echo          | HttpService.Echo | "/HttpService/Echo"               | ""                                     |
| /HttpService/Echo/Foo      | HttpService.Echo | "/HttpService/Echo/Foo"           | "Foo"                                  |
| /HttpService/Echo/Foo/Bar  | HttpService.Echo | "/HttpService/Echo/Foo/Bar"       | "Foo/Bar"                              |
| /HttpService//Echo///Foo// | HttpService.Echo | "/HttpService//Echo///Foo//"      | "Foo"                                  |
| /HttpService               | No such method   |                                   |                                        |

gejun's avatar
gejun committed
70
## /ServiceName as the prefix
71

gejun's avatar
gejun committed
72
HTTP services to manage resources may need this kind of URL, such as `/FileService/foobar.txt` represents `./foobar.txt` and `/FileService/app/data/boot.cfg` represents `./app/data/boot.cfg`.
73

gejun's avatar
gejun committed
74
Implementation steps:
75

gejun's avatar
gejun committed
76
1. Use `FileService` as the service name and `default_method` as the method name in the proto file.
77 78 79

```protobuf
option cc_generic_services = true;
gejun's avatar
gejun committed
80

81 82
message HttpRequest { };
message HttpResponse { };
gejun's avatar
gejun committed
83

84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106
service FileService {
      rpc default_method(HttpRequest) returns (HttpResponse);
}
```

2. Implement the service.

```c++
class FileServiceImpl: public FileService {
public:
    ...
    virtual void default_method(google::protobuf::RpcController* cntl_base,
                                const HttpRequest* /*request*/,
                                HttpResponse* /*response*/,
                                google::protobuf::Closure* done) {
        brpc::ClosureGuard done_guard(done);
        brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
        cntl->response_attachment().append("Getting file: ");
        cntl->response_attachment().append(cntl->http_request().unresolved_path());
    }
};
```

gejun's avatar
gejun committed
107
3. After adding the implemented instance into the server, the service is accessible via following URLs (the path after `/FileService` is filled in `cntl->http_request().unresolved_path()`, which is always normalized):
108 109 110 111 112 113 114 115

| URL                             | Protobuf Method            | cntl->http_request().uri().path() | cntl->http_request().unresolved_path() |
| ------------------------------- | -------------------------- | --------------------------------- | -------------------------------------- |
| /FileService                    | FileService.default_method | "/FileService"                    | ""                                     |
| /FileService/123.txt            | FileService.default_method | "/FileService/123.txt"            | "123.txt"                              |
| /FileService/mydir/123.txt      | FileService.default_method | "/FileService/mydir/123.txt"      | "mydir/123.txt"                        |
| /FileService//mydir///123.txt// | FileService.default_method | "/FileService//mydir///123.txt//" | "mydir/123.txt"                        |

gejun's avatar
gejun committed
116
## Restful URL
117

gejun's avatar
gejun committed
118
brpc supports specifying a URL for each method in a service. The API is as follows:
119 120 121

```c++
// If `restful_mappings' is non-empty, the method in service can
gejun's avatar
gejun committed
122
// be accessed by the specified URL rather than /ServiceName/MethodName.
123
// Mapping rules: "PATH1 => NAME1, PATH2 => NAME2 ..."
gejun's avatar
gejun committed
124
// where `PATH' is a valid HTTP path and `NAME' is the method name.
125 126 127 128 129
int AddService(google::protobuf::Service* service,
               ServiceOwnership ownership,
               butil::StringPiece restful_mappings);
```

gejun's avatar
gejun committed
130
`QueueService` defined below contains several HTTP methods. If the service is added into the server normally, it's accessible via URLs like `/QueueService/start` and ` /QueueService/stop`.
131 132 133 134 135 136 137 138 139 140

```protobuf
service QueueService {
    rpc start(HttpRequest) returns (HttpResponse);
    rpc stop(HttpRequest) returns (HttpResponse);
    rpc get_stats(HttpRequest) returns (HttpResponse);
    rpc download_data(HttpRequest) returns (HttpResponse);
};
```

gejun's avatar
gejun committed
141
By specifying the 3rd parameter `restful_mappings` to `AddService`, the URL can be customized:
142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162

```c++
if (server.AddService(&queue_svc,
                      brpc::SERVER_DOESNT_OWN_SERVICE,
                      "/v1/queue/start   => start,"
                      "/v1/queue/stop    => stop,"
                      "/v1/queue/stats/* => get_stats") != 0) {
    LOG(ERROR) << "Fail to add queue_svc";
    return -1;
}
 
if (server.AddService(&queue_svc,
                      brpc::SERVER_DOESNT_OWN_SERVICE,
                      "/v1/*/start   => start,"
                      "/v1/*/stop    => stop,"
                      "*.data        => download_data") != 0) {
    LOG(ERROR) << "Fail to add queue_svc";
    return -1;
}
```

gejun's avatar
gejun committed
163
There are 3 mappings separated by comma in the 3rd parameter (which is a string spanning 3 lines) to the `AddService`. Each mapping tells brpc to call the method at right side of the arrow if the left side matches the URL. The asterisk in `/v1/queue/stats/*` matches any string.
164

gejun's avatar
gejun committed
165
More about mapping rules:
166

gejun's avatar
gejun committed
167 168 169 170 171 172 173
- Multiple paths can be mapped to a same method.
- Both HTTP and protobuf services are supported.
- Un-mapped methods are still accessible via `/ServiceName/MethodName`. Mapped methods are **not** accessible via `/ServiceName/MethodName` anymore.
- `==>` and ` ===>` are both OK, namely extra spaces at the beginning or the end, extra slashes, extra commas at the end, are all accepted.
- Pattern `PATH` and `PATH/*` can coexist.
- Support suffix matching: characters can appear after the asterisk.
- At most one asterisk is allowed in a path.
174

gejun's avatar
gejun committed
175
The path after asterisk can be obtained by `cntl.http_request().unresolved_path()`, which is always normalized, namely no slashes at the beginning or the end, and no repeated slashes in the middle. For example:
176 177 178 179 180 181 182

![img](../images/restful_1.png)

or:

![img](../images/restful_2.png)

gejun's avatar
gejun committed
183
in which unresolved_path are both `foo/bar`. The extra slashes at the left, the right, or the middle are removed.
184 185 186

Note that `cntl.http_request().uri().path()` is not ensured to be normalized, which is `"//v1//queue//stats//foo///bar//////"` and `"//vars///foo////bar/////"` respectively in the above example.

gejun's avatar
gejun committed
187
The built-in service page of `/status` shows customized URLs after the methods, in form of `@URL1 @URL2` ...
188 189 190 191 192 193 194

![img](../images/restful_3.png)

# HTTP Parameters

## HTTP headers

gejun's avatar
gejun committed
195 196 197
HTTP headers are a series of key/value pairs, some of them are defined by the HTTP specification, while others are free to use.

Query strings are also key/value pairs. Differences between HTTP headers and query strings:
198

gejun's avatar
gejun committed
199 200
* Although operations on HTTP headers are accurately defined by the http specification, but http headers cannot be modified directly from an address bar, they are often used for passing parameters of a protocol or framework.
* Query strings is part of the URL and **often** in form of `key1=value1&key2=value2&...`, which is easy to read and modify. They're often used for passing application-level parameters. However format of query strings is not defined in HTTP spec, just a convention.
201 202 203 204 205 206 207 208 209 210 211 212 213

```c++
// Get value for header "User-Agent" (case insensitive)
const std::string* user_agent_str = cntl->http_request().GetHeader("User-Agent");
if (user_agent_str != NULL) {  // has the header
    LOG(TRACE) << "User-Agent is " << *user_agent_str;
}
...
 
// Add a header "Accept-encoding: gzip" (case insensitive)
cntl->http_response().SetHeader("Accept-encoding", "gzip");
// Overwrite the previous header "Accept-encoding: deflate"
cntl->http_response().SetHeader("Accept-encoding", "deflate");
gejun's avatar
gejun committed
214
// Append value to the previous header so that it becomes
215 216 217 218 219 220
// "Accept-encoding: deflate,gzip" (values separated by comma)
cntl->http_response().AppendHeader("Accept-encoding", "gzip");
```

## Content-Type

gejun's avatar
gejun committed
221
`Content-type` is a frequently used header for storing type of the HTTP body, and specially processed in brpc and accessible by `cntl->http_request().content_type()` . As a correspondence, `cntl->GetHeader("Content-Type")` returns nothing.
222 223 224 225 226 227 228 229 230 231 232

```c++
// Get Content-Type
if (cntl->http_request().content_type() == "application/json") {
    ...
}
...
// Set Content-Type
cntl->http_response().set_content_type("text/html");
```

gejun's avatar
gejun committed
233
If the RPC fails (`Controller` has been `SetFailed`), the framework overwrites `Content-Type` with `text/plain` and sets the response body with `Controller::ErrorText()`.
234 235 236

## Status Code

gejun's avatar
gejun committed
237
Status code is a special field in HTTP response to store processing result of the http request. Possible values are defined in [http_status_code.h](https://github.com/brpc/brpc/blob/master/src/brpc/http_status_code.h).
238 239 240 241 242 243 244 245 246 247 248 249

```c++
// Get Status Code
if (cntl->http_response().status_code() == brpc::HTTP_STATUS_NOT_FOUND) {
    LOG(FATAL) << "FAILED: " << controller.http_response().reason_phrase();
}
...
// Set Status code
cntl->http_response().set_status_code(brpc::HTTP_STATUS_INTERNAL_SERVER_ERROR);
cntl->http_response().set_status_code(brpc::HTTP_STATUS_INTERNAL_SERVER_ERROR, "My explanation of the error...");
```

gejun's avatar
gejun committed
250
For example, following code implements redirection with status code 302:
251 252 253 254 255 256 257 258 259 260

```c++
cntl->http_response().set_status_code(brpc::HTTP_STATUS_FOUND);
cntl->http_response().SetHeader("Location", "http://bj.bs.bae.baidu.com/family/image001(4979).jpg");
```

![img](../images/302.png)

## Query String

gejun's avatar
gejun committed
261
As mentioned in above [HTTP headers](#http-headers), query strings are interpreted in common convention, whose form is `key1=value1&key2=value2&…`. Keys without values are acceptable as well and accessible by `GetQuery` which returns an empty string. Such keys are often used as boolean flags. Full API are defined in [uri.h](https://github.com/brpc/brpc/blob/master/src/brpc/uri.h).
262 263 264 265 266 267

```c++
const std::string* time_value = cntl->http_request().uri().GetQuery("time");
if (time_value != NULL) {  // the query string is present
    LOG(TRACE) << "time = " << *time_value;
}
gejun's avatar
gejun committed
268

269 270 271 272
...
cntl->http_request().uri().SetQuery("time", "2015/1/2");
```

gejun's avatar
gejun committed
273
# Debugging
274 275 276

Turn on [-http_verbose](http://brpc.baidu.com:8765/flags/http_verbose) to print contents of all http requests and responses to stderr. Note that this should only be used for debugging rather than online services.

gejun's avatar
gejun committed
277 278 279 280 281
# Compress the response body

HTTP services often compress http bodies to reduce transmission latency of web pages and speed up the presentations to end users.

Call `Controller::set_response_compress_type(brpc::COMPRESS_TYPE_GZIP)` to **try to** compress the http body with gzip. "Try to" means the compression may not happen in following conditions:
282

gejun's avatar
gejun committed
283
* The request does not set `Accept-encoding` or the value does not contain "gzip". For example, curl does not support compression without option `--compressed`, in which case the server always returns uncompressed results.
284

gejun's avatar
gejun committed
285
* Body size is less than the bytes specified by -http_body_compress_threshold (512 by default). gzip is not a very fast compression algorithm. When the body is small, the delay added by compression may be larger than the time saved by network transmission. No compression when the body is relatively small is probably a better choice.
286

gejun's avatar
gejun committed
287 288 289
  | Name                         | Value | Description                              | Defined At                            |
  | ---------------------------- | ----- | ---------------------------------------- | ------------------------------------- |
  | http_body_compress_threshold | 512   | Not compress http body when it's less than so many bytes. | src/brpc/policy/http_rpc_protocol.cpp |
290

gejun's avatar
gejun committed
291 292 293
# Decompress the request body

Due to generality, brpc does not decompress request bodies automatically, but users can do the job by themselves as follows:
294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311

```c++
#include <brpc/policy/gzip_compress.h>
...
const std::string* encoding = cntl->http_request().GetHeader("Content-Encoding");
if (encoding != NULL && *encoding == "gzip") {
    butil::IOBuf uncompressed;
    if (!brpc::policy::GzipDecompress(cntl->request_attachment(), &uncompressed)) {
        LOG(ERROR) << "Fail to un-gzip request body";
        return;
    }
    cntl->request_attachment().swap(uncompressed);
}
// cntl->request_attachment() contains the data after decompression
```

# Turn on HTTPS

gejun's avatar
gejun committed
312
Update openssl to the latest version before turning on HTTPS, since older versions of openssl may have severe security problems and support less encryption algorithms, which is against with the purpose of using SSL. Setup `ServerOptions.ssl_options` to turn on HTTPS.
313 314 315 316 317 318 319

```c++
// Certificate structure
struct CertInfo {
    // Certificate in PEM format.
    // Note that CN and alt subjects will be extracted from the certificate,
    // and will be used as hostnames. Requests to this hostname (provided SNI
gejun's avatar
gejun committed
320
    // extension supported) will be encrypted using this certifcate.
321 322 323 324 325 326
    // Supported both file path and raw string
    std::string certificate;

    // Private key in PEM format.
    // Supported both file path and raw string based on prefix:
    std::string private_key;
gejun's avatar
gejun committed
327

328 329 330 331 332 333 334 335 336 337
    // Additional hostnames besides those inside the certificate. Wildcards
    // are supported but it can only appear once at the beginning (i.e. *.xxx.com).
    std::vector<std::string> sni_filters;
};

struct SSLOptions {
    // Default certificate which will be loaded into server. Requests
    // without hostname or whose hostname doesn't have a corresponding
    // certificate will use this certificate. MUST be set to enable SSL.
    CertInfo default_cert;
gejun's avatar
gejun committed
338

339 340 341 342 343 344 345 346 347 348 349 350 351 352 353
    // Additional certificates which will be loaded into server. These
    // provide extra bindings between hostnames and certificates so that
    // we can choose different certificates according to different hostnames.
    // See `CertInfo' for detail.
    std::vector<CertInfo> certs;

    // When set, requests without hostname or whose hostname can't be found in
    // any of the cerficates above will be dropped. Otherwise, `default_cert'
    // will be used.
    // Default: false
    bool strict_sni;
 
    // ... Other options
};
```
gejun's avatar
gejun committed
354
Other options include: cipher suites (recommend using `ECDHE-RSA-AES256-GCM-SHA384` which is the default suite used by chrome, and one of the safest suites. The drawback is more CPU cost), session reuse and so on. Read [server.h](https://github.com/brpc/brpc/blob/master/src/brpc/server.h) for more information.
355

gejun's avatar
gejun committed
356
After turning on HTTPS, the service is still accessible by HTTP from the same port. The server identifies whether the request is HTTP or HTTPS automatically, and tell the result to users by `Controller::is_ssl()`. As you can see, the HTTPS in brpc is more like supporting an additional protocol, rather than providing an encrypted communication channel.
357 358 359

# Performance

gejun's avatar
gejun committed
360
Productions without extreme performance requirements tend to use HTTP protocol, especially mobile products. Thus we put great emphasis on implementation qualities of HTTP. To be more specific:
361

gejun's avatar
gejun committed
362 363 364 365
- Use [http parser](https://github.com/brpc/brpc/blob/master/src/brpc/details/http_parser.h) of node.js to parse http messages, which is a lightweight, well-written, and extensively used implementation.
- Use [rapidjson](https://github.com/miloyip/rapidjson) to parse json, which is a json library focuses on performance.
- In the worst case, the time complexity of parsing http requests is still O(N), where N is byte size of the request. As a contrast, parsing code that requires the http request to be complete, may cost O(N^2) time in the worst case. This feature is very helpful since many HTTP requests are large.
- Processing HTTP messages from different clients is highly concurrent, even a pretty complicated http message does not block responding other clients. It's difficult to achieve this for other RPC implementations and http servers often based on [single-threaded reactor](threading_overview.md#single-threaded-reactor).
366 367 368

# Progressive sending

gejun's avatar
gejun committed
369 370 371 372 373 374 375 376 377 378
brpc server is capable of sending large or infinite sized body, in following steps:

1. Call `Controller::CreateProgressiveAttachment()` to create a body that can be written progressively. The returned `ProgressiveAttachment` object should be managed by `intrusive_ptr`
  ```c++
  #include <brpc/progressive_attachment.h>
  ...
  butil::intrusive_ptr<brpc::ProgressiveAttachment> pa (cntl->CreateProgressiveAttachment());
  ```

2. Call `ProgressiveAttachment::Write()` to send the data.
379

gejun's avatar
gejun committed
380 381 382 383
   * If the write occurs before running of the server-side done, the sent data is cached until the done is called.
   * If the write occurs after running of the server-side done, the sent data is written out in chunked mode immediately.

3. After usage, destruct all `butil::intrusive_ptr<brpc::ProgressiveAttachment>` to release related resources.
384 385 386

# Progressive receiving

gejun's avatar
gejun committed
387
Currently brpc server doesn't support calling the service callback once header part in the http request is parsed. In other words, brpc server is not suitable for receiving large or infinite sized body.
388 389 390

# FAQ

gejun's avatar
gejun committed
391 392 393
### Q: The nginx before brpc encounters final fail

The error is caused by that brpc server closes the http connection directly without sending a response.
394

gejun's avatar
gejun committed
395
brpc server supports a variety of protocols on the same port. When a request is failed to be parsed in HTTP, it's hard to tell that the request is definitely in HTTP. If the request is very likely to be one, the server sends HTTP 400 errors and closes the connection. However, if the error is caused HTTP method(at the beginning) or ill-formed serialization (may be caused by bugs at the HTTP client), the server still closes the connection without sending a response, which leads to "final fail" at nginx.
396

gejun's avatar
gejun committed
397
Solution: When using Nginx to forward traffic, set `$HTTP_method` to allowed HTTP methods or simply specify the HTTP method in `proxy_method`.
398 399 400

### Q: Does brpc support http chunked mode

gejun's avatar
gejun committed
401
Yes.
402

gejun's avatar
gejun committed
403
### Q: Why do HTTP requests containing BASE64 encoded query string fail to parse sometimes?
404

gejun's avatar
gejun committed
405
According to the [HTTP specification](http://tools.ietf.org/html/rfc3986#section-2.2), following characters need to be encoded with `%`.
406 407 408 409 410 411 412 413 414 415

```
       reserved    = gen-delims / sub-delims

       gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

       sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                   / "*" / "+" / "," / ";" / "="
```

gejun's avatar
gejun committed
416
Base64 encoded string may end with `=` which is a reserved character (take `?wi=NDgwMDB8dGVzdA==&anothorkey=anothervalue` as an example). The strings may be parsed successfully, or may be not, depending on the implementation which should not be assumed in principle.
417

gejun's avatar
gejun committed
418
One solution is to remove the trailing `=` which does not affect the [Base64 decoding](http://en.wikipedia.org/wiki/Base64#Padding). Another method is to [percent-encode](https://en.wikipedia.org/wiki/Percent-encoding) the URI, and do percent-decoding before Base64 decoding.