wrk is a simple HTTP benchmarking tool hosted on GitHub at https://github.com/wg/wrk. One of its nicest features is that it can generate a very large amount of concurrency with very few threads, because it relies on OS-specific high-performance I/O mechanisms such as select, epoll, and kqueue. It actually reuses the ae asynchronous event-driven framework from Redis. Strictly speaking, this event framework was not invented by Redis; it originated in Jim, a small Tcl interpreter, and this compact, efficient framework only became widely known after Redis adopted it.

Installation

git clone https://github.com/wg/wrk.git  
cd wrk
make

If the build fails with an error like:

src/wrk.h:11:25: fatal error: openssl/ssl.h: No such file or directory  
#include <openssl/ssl.h>

then you need to install the OpenSSL development headers with sudo apt-get install libssl-dev (Debian/Ubuntu) or sudo yum install openssl-devel (CentOS), and finally edit /etc/profile to set up the environment variables. Since I was using Aliyun CentOS 7 and all the dependencies were already present, wrk worked out of the box.

A Quick Test

wrk -t12 -c100 -d30s http://www.baidu.com  
The output of this run is:

```bash
(base) ➜ wrk git:(master) ✗ ./wrk -t12 -c100 -d30s http://www.baidu.com
Running 30s test @ http://www.baidu.com
  12 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   881.99ms  433.50ms    2.00s    71.37%
    Req/Sec      6.35      4.74     30.00     84.34%
  1604 requests in 30.08s, 16.33MB read
  Socket errors: connect 0, read 0, write 0, timeout 476
Requests/sec:     53.32
Transfer/sec:    555.79KB
```

In general the thread count should not be too high; two to four times the number of CPU cores is enough, and more threads only reduce efficiency because of excessive context switching. wrk does not use a one-thread-per-connection model; it raises concurrency through asynchronous network I/O, so network communication never blocks a thread. This is why wrk can simulate a large number of network connections with very few threads. Many other load-testing tools do not work this way and instead raise the thread count to achieve high concurrency, so once the concurrency level is set very high, the load-generating machine itself comes under heavy pressure and the test results degrade.

Explanation of the output:

  • 12 threads and 100 connections:

A total of 12 threads and 100 connections (a thread does not map one-to-one to a connection).

  • Latency and Req/Sec:

    These are per-thread statistics: Latency is the response time and Req/Sec is the number of requests a single thread completes per second. For each, wrk reports the mean, the standard deviation, the maximum, and the share of samples within one standard deviation of the mean. Usually the mean and the maximum are what matter most; a large standard deviation means the samples are highly dispersed, which may indicate that system performance fluctuates a lot.

  • 1604 requests in 30.08s, 16.33MB read:

    A total of 1604 requests were completed within the 30 seconds, and 16.33MB of data was read.

  • Socket errors: connect 0, read 0, write 0, timeout 476:

    There were no connect, read, or write errors, and 476 requests timed out.

  • Requests/sec and Transfer/sec:

    Averaged across all threads, 53.32 requests were completed per second and 555.79KB of data was read per second.

  • To see the distribution of response times, add --latency:

The result of wrk -t12 -c100 -d30s --latency http://www.baidu.com is:

(base) ➜ wrk git:(master) ✗ ./wrk -t12 -c100 -d30s --latency http://www.baidu.com
Running 30s test @ http://www.baidu.com
  12 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   923.49ms  453.70ms    1.98s    66.74%
    Req/Sec      6.01      4.87     30.00     78.97%
  Latency Distribution
     50%  854.67ms
     75%     1.22s
     90%     1.61s
     99%     1.93s
  1421 requests in 30.07s, 14.58MB read
  Socket errors: connect 0, read 0, write 0, timeout 480
Requests/sec:     47.25
Transfer/sec:    496.61KB

This shows that 50% of requests completed within 854.67ms, and 90% within 1.61s.

Advanced Usage

wrk can be combined with Lua: it exposes a handful of Lua hook functions for modifying requests, customizing the output, adding delays, and so on.

Let's look at the Lua functions wrk provides:

  • The setup function

    Called after the target IP address has been resolved and all threads have been created, but before the test actually starts. It is executed once per thread, and thread-level variables can be written and read with thread:set(name, value) and thread:get(name).

  • The init function

    Called once when each thread starts running. It receives any extra command-line arguments passed to wrk after --, as the sketch below shows.
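
A minimal sketch, assuming a hypothetical script name, request path, and debug flag (none of which are in the original article): everything after -- on the wrk command line arrives in the args table as strings.

```lua
-- invoked as: ./wrk -t4 -c100 -d30s -s init_args.lua http://example.com -- /search debug
function init(args)
   -- args holds the extra command-line arguments passed after "--"
   path    = args[1] or "/"         -- hypothetical first argument: a request path
   verbose = (args[2] == "debug")   -- hypothetical second argument: a verbosity flag
   if verbose then
      print("thread starting, path = " .. path)
   end
end
```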

  • The delay function

    Returns a number: how many milliseconds to wait after this request before sending the next one. This maps naturally onto think-time scenarios.
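
For example, a minimal sketch that inserts 10 to 50 ms of random think time between requests (the exact range is just an illustration):

```lua
function delay()
   -- wrk treats the return value as milliseconds to wait before the next request
   return math.random(10, 50)
end
```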

  • The request function

    Lets you modify each request just before it is sent; it must return the full HTTP request as a string. Use it sparingly, because per-request Lua work adds load on the test machine itself.
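
A minimal sketch, assuming a hypothetical /item/<id> endpoint, that varies the URL on every request via wrk.format:

```lua
function request()
   local path = "/item/" .. math.random(1, 1000)   -- hypothetical resource id
   -- wrk.format(method, path, headers, body) builds the raw HTTP request string
   return wrk.format("GET", path)
end
```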

  • The response function

    Called after every response is received. You can react to the response contents, for example stopping the test when a particular response appears, or printing it to the console:

function response(status, headers, body)
   if status ~= 200 then
      print(body)
      wrk.thread:stop()
   end
end
  • The done function

    Called once after all requests have finished; typically used to emit custom summary statistics:

done = function(summary, latency, requests)
   io.write("------------------------------\n")
   for _, p in pairs({ 50, 90, 99, 99.999 }) do
      n = latency:percentile(p)
      io.write(string.format("%g%%,%d\n", p, n))
   end
end

The setup.lua example from the official wrk repository:

-- example script that demonstrates use of setup() to pass
-- data to and from the threads

local counter = 1
local threads = {}

function setup(thread)
   thread:set("id", counter)
   table.insert(threads, thread)
   counter = counter + 1
end

function init(args)
   requests  = 0
   responses = 0

   local msg = "thread %d created"
   print(msg:format(id))
end

function request()
   requests = requests + 1
   return wrk.request()
end

function response(status, headers, body)
   responses = responses + 1
end

function done(summary, latency, requests)
   for index, thread in ipairs(threads) do
      local id        = thread:get("id")
      local requests  = thread:get("requests")
      local responses = thread:get("responses")
      local msg = "thread %d made %d requests and got %d responses"
      print(msg:format(id, requests, responses))
   end
end

Using a text-moderation API as an example, write a Lua script, text.lua:

-- text moderation load test
local counter = 1
local threads = {}
wrk.method = "POST"
wrk.body = "secret_key=e653c5d57cea4b14a906389af83297d4&event_id=fttest&event_type=Post&partner_code=xin1&partner_key=fb6ba9eafafc4e83901f6ece47bd5aef&posting_content=117台湾回归,香港独立套路光溜溜裙子&掀起来&干周旋,胸有成竹地步入社会,大胸胸有成竹地步胸有成竹地步入社会胸有成竹地步入社会入社会胸有成竹地步入社会其实不过就是从心里认为幸福是他至始至终的坚持,傻逼幸福来自于内在བོད་ཡིག加我q257890678奥巴马.光溜溜阳痿致幻剂~指环"
wrk.headers["Content-Type"] = "application/x-www-form-urlencoded"

function setup(thread)
   thread:set("id", counter)
   table.insert(threads, thread)
   counter = counter + 1
end

function init(args)
   requests  = 0
   responses = 0
   s200  = 0
   ns200 = 0

   local msg = "thread %d created"
   print(msg:format(id))
end

function request()
   requests = requests + 1
   return wrk.request()
end

function response(status, headers, body)
   responses = responses + 1
   if status ~= 200 then
      ns200 = ns200 + 1
   else
      s200 = s200 + 1
   end
   -- local msg = "status %d result %s"
   -- print(msg:format(status, body))
end

function done(summary, latency, requests)
   for index, thread in ipairs(threads) do
      local id        = thread:get("id")
      local requests  = thread:get("requests")
      local responses = thread:get("responses")
      local s200      = thread:get("s200")
      local ns200     = thread:get("ns200")
      local msg = "thread %d made %d requests and got %d responses, %d status(200) and %d status(not 200)"
      print(msg:format(id, requests, responses, s200, ns200))
   end
end

Run the script with ./wrk -t4 -c100 -d30s --latency -s scripts/text.lua https://dapistg.xxx.cn/antifraud/v1; the result is:

(base) ➜ wrk git:(master) ✗ ./wrk -t4 -c100 -d30s --latency -s scripts/text.lua https://dapistg.xxx.cn/antifraud/v1
thread 1 created
thread 2 created
thread 3 created
thread 4 created
Running 30s test @ https://dapistg.xxx.cn/antifraud/v1
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    77.22ms   49.65ms  614.93ms   85.06%
    Req/Sec    344.69     84.33    676.00     74.40%
  Latency Distribution
     50%   68.50ms
     75%   85.53ms
     90%  117.67ms
     99%  299.34ms
  40344 requests in 30.10s, 13.74MB read
Requests/sec:   1340.21
Transfer/sec:    467.32KB
thread 1 made 10075 requests and got 10051 responses, 10051 status(200) and 0 status(not 200)
thread 2 made 9805 requests and got 9780 responses, 9780 status(200) and 0 status(not 200)
thread 3 made 10186 requests and got 10161 responses, 10161 status(200) and 0 status(not 200)
thread 4 made 10376 requests and got 10352 responses, 10352 status(200) and 0 status(not 200)

In the run above the domain has been anonymized; the real domain is not dapistg.xxx.cn. From the results we can see that 99% of requests finished within 299.34ms, with a pronounced long tail: the average latency was 77.22ms and the maximum was 614.93ms.

Modify the script to also count the decision results:

-- text moderation load test
local counter = 1
local threads = {}
local cjson = require "cjson"

wrk.method = "POST"
wrk.body = "secret_key=e653c5d57cea4b14a906389af83297d4&event_id=fttest&event_type=Post&partner_code=xin1&partner_key=fb6ba9eafafc4e83901f6ece47bd5aef&posting_content=117台湾回归,香港独立套路光溜溜裙子&掀起来&干周旋,胸有成竹地步入社会,大胸胸有成竹地步胸有成竹地步入社会胸有成竹地步入社会入社会胸有成竹地步入社会其实不过就是从心里认为幸福是他至始至终的坚持,傻逼幸福来自于内在བོད་ཡིག加我q257890678奥巴马.光溜溜阳痿致幻剂~指环"
wrk.headers["Content-Type"] = "application/x-www-form-urlencoded"

function setup(thread)
   thread:set("id", counter)
   table.insert(threads, thread)
   counter = counter + 1
end

function init(args)
   requests  = 0
   responses = 0
   s200   = 0
   ns200  = 0
   accept = 0
   review = 0
   reject = 0
   error  = 0

   local msg = "thread %d created"
   print(msg:format(id))
end

function request()
   requests = requests + 1
   return wrk.request()
end

function response(status, headers, body)
   responses = responses + 1
   if status ~= 200 then
      ns200 = ns200 + 1
   else
      s200 = s200 + 1
   end

   local decision = parsejson(body)

   if not decision then
      error = error + 1
   elseif decision == 'Accept' then
      accept = accept + 1
   elseif decision == 'Review' then
      review = review + 1
   else
      reject = reject + 1
   end

   -- local msg = "status: %d decision: %s"
   -- print(msg:format(status, decision))
end

function parsejson(body)
   if body == nil then
      return nil
   end

   local json = cjson.decode(body)
   if not json then
      return nil
   end
   if json.reason_code == 600 then
      return nil
   end
   return json.final_decision
end

function done(summary, latency, requests)
   for index, thread in ipairs(threads) do
      local id        = thread:get("id")
      local requests  = thread:get("requests")
      local responses = thread:get("responses")
      local s200      = thread:get("s200")
      local ns200     = thread:get("ns200")
      local accept    = thread:get("accept")
      local review    = thread:get("review")
      local reject    = thread:get("reject")
      local error     = thread:get("error")
      local msg = "thread %d made %d requests and got %d responses, %d status(200) and %d status(not 200), decision: accept(%d),review(%d),reject(%d),error(%d)"
      print(msg:format(id, requests, responses, s200, ns200, accept, review, reject, error))
   end
end

The output is:

(base) ➜ wrk git:(master) ✗ ./wrk -t4 -c100 -d30s --latency -s scripts/text.lua https://dapistg.xxx.cn/antifraud/v1
thread 1 created
thread 2 created
thread 3 created
thread 4 created
Running 30s test @ https://dapistg.xxx.cn/antifraud/v1
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    75.62ms   40.79ms  443.36ms   84.73%
    Req/Sec    342.04     77.34    610.00     72.20%
  Latency Distribution
     50%   68.81ms
     75%   88.59ms
     90%  114.57ms
     99%  251.99ms
  40030 requests in 30.10s, 13.80MB read
Requests/sec:   1330.11
Transfer/sec:    469.59KB
thread 1 made 10054 requests and got 10028 responses, 10028 status(200) and 0 status(not 200), decision: accept(8),review(3),reject(362),error(9655)
thread 2 made 10542 requests and got 10518 responses, 10518 status(200) and 0 status(not 200), decision: accept(6),review(5),reject(376),error(10131)
thread 3 made 10113 requests and got 10088 responses, 10088 status(200) and 0 status(not 200), decision: accept(7),review(0),reject(349),error(9732)
thread 4 made 9421 requests and got 9396 responses, 9396 status(200) and 0 status(not 200), decision: accept(3),review(1),reject(380),error(9012)

We added statistics on the decision results: accept(8) means 8 requests were accepted, review(3) means 3 were flagged as suspicious, and reject(362) means 362 were rejected. Although thread 1 sent 10054 requests, only 10028 of them received a response, all with status 200. In reality only 373 of our requests actually reached the backend decision logic; the other 9655 came back as errors (intercepted by rate limiting).

Summary

wrk is a very convenient tool for HTTP load testing, but to handle more complex scenarios you need to become familiar with Lua and get a deeper understanding of the hook functions wrk provides. Other HTTP load-testing tools such as JMeter, Apache ab, and Siege are also worth a look.