
Stress Test Results

Li Jiang edited this page Mar 23, 2017 · 9 revisions

Summary of Results

| API | Complete requests | Concurrency Level | Requests per second | Time per request (ms) | Time per request, across all concurrent requests (ms) |
|---|---|---|---|---|---|
| GET /api | 100000 | 50 | 11346 | 4.4 | 0.088 |
| GET /api | 100000 | 10000 | 9692.68 | 1031.706 | 0.103 |
| GET /api/auth | 10000 | 50 | 945.48 | 52.883 | 1.058 |
| POST /api/auth | 10000 | 50 | 264.68 | 188.908 | 3.778 |
| DELETE /api/auth | - | - | - | - | - |
| GET /api/file/<machine>/<path> | 10000 | 50 | 192.76 | 259.392 | 5.188 |
| GET /api/file/<machine>/<token>?download=True | 10000 | 50 | 774.88 | 64.526 | 1.291 |
| PUT /api/file/<machine>/<path> | 10000 | 50 | 841.84 | 59.394 | 1.188 |
| POST /api/command/<machine> | 10000 | 50 | 850.56 | 58.785 | 1.176 |
| POST /api/job/<machine> | 500 | 50 | 574.06 | 87.098 | 1.742 |
| GET /api/job/<machine> | 1000 | 50 | 608.39 | 82.183 | 1.644 |
| GET /api/job/<machine>/<ID> | 1000 | 50 | 59.29 | 843.380 | 16.868 |
| DELETE /api/job/ | - | - | - | - | - |
| GET /api/async/<id> | 10000 | 50 | 893 | 56 | 1.1 |

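  • Complete requests: total number of requests; a parameter set for the test.
  • Concurrency Level: number of concurrent users; a parameter set for the test.
  • Requests per second: throughput.
  • Time per request (mean): average time a user waits for a request.
  • Time per request (mean, across all concurrent requests): average time the server spends per request.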
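
The three derived columns follow from the two test parameters and the total run time. Below is a minimal Python sketch of the arithmetic ab uses for its summary lines; the function name `ab_metrics` is only illustrative. It is fed the figures of the GET /api run with N = 100000, C = 50 reported further down (8.813 seconds in total).

```python
def ab_metrics(complete_requests, concurrency, time_taken_s):
    """Derive ab's three summary figures from the raw run parameters."""
    # Throughput: completed requests divided by wall-clock time.
    requests_per_second = complete_requests / time_taken_s
    # Mean wait per request as seen by one simulated user, in ms.
    time_per_request_ms = concurrency * time_taken_s * 1000 / complete_requests
    # Mean time per request across all concurrent requests, in ms.
    time_per_request_all_ms = time_taken_s * 1000 / complete_requests
    return requests_per_second, time_per_request_ms, time_per_request_all_ms


# GET /api, N = 100000, C = 50, "Time taken for tests: 8.813 seconds".
rps, per_user_ms, per_server_ms = ab_metrics(100000, 50, 8.813)
print(f"{rps:.2f} req/s, {per_user_ms:.3f} ms, {per_server_ms:.3f} ms")
```

The results agree with the reported 11346.73 req/s, 4.407 ms and 0.088 ms up to the rounding of the printed run time.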

Performance with a single-node deployment

1 server, 1 worker; gunicorn/19.6.0

Before each stress test, correctness was first verified with ab -n1 -v4.

/api

ab -n10000 -c 1000 http://cn16356:8000/api

  • N C : 1000 100
Complete requests:      1000
Failed requests:        0
Time per request:       16.202
Time per request:       0.162 [ms] (mean, across all concurrent requests)
50%     11
100%     74 (longest request)
  • N C : 10000 1000
Complete requests:      10000
Failed requests:        0
Write errors:           0
Time per request:       125.145 [ms] (mean)
Time per request:       0.125 [ms] (mean, across all concurrent requests)
  50%     72
 100%    505 (longest request)
  • N C : 100000 10000
Complete requests:      100000
Failed requests:        58
   (Connect: 0, Receive: 0, Length: 58, Exceptions: 0)
Time per request:       1019.286 [ms] (mean)
Time per request:       0.102 [ms] (mean, across all concurrent requests)
  50%    151
 100%   9289 (longest request)
  • N C : 100000 50
Concurrency Level:      50
Time taken for tests:   8.813 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Total transferred:      43900000 bytes
HTML transferred:       10900000 bytes
Requests per second:    11346.73 [#/sec] (mean)
Time per request:       4.407 [ms] (mean)
Time per request:       0.088 [ms] (mean, across all concurrent requests)
Transfer rate:          4864.47 [Kbytes/sec] received
Percentage of the requests served within a certain time (ms)
  50%      3
 100%     99 (longest request)

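Even with 100,000 requests at a concurrency level of 10,000, gunicorn still responds well (58 of the 100,000 requests failed the length check).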
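
Rows of the summary table can be pulled out of reports like the ones above with a few regular expressions. The following is a sketch only: the names `AB_PATTERNS` and `parse_ab_output` are made up for illustration, and the patterns cover just the lines quoted on this page.

```python
import re

# One pattern per summary field; the labels are the ones ab prints.
AB_PATTERNS = {
    "complete_requests": r"Complete requests:\s+(\d+)",
    "failed_requests": r"Failed requests:\s+(\d+)",
    "requests_per_second": r"Requests per second:\s+([\d.]+)",
    "time_per_request_ms": r"Time per request:\s+([\d.]+) \[ms\] \(mean\)",
    "time_per_request_all_ms": r"Time per request:\s+([\d.]+) \[ms\] \(mean, across all",
}


def parse_ab_output(text):
    """Return the summary fields found in an ab report; absent fields are skipped."""
    summary = {}
    for name, pattern in AB_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            value = match.group(1)
            summary[name] = float(value) if "." in value else int(value)
    return summary


SAMPLE = """\
Complete requests:      100000
Failed requests:        0
Requests per second:    11346.73 [#/sec] (mean)
Time per request:       4.407 [ms] (mean)
Time per request:       0.088 [ms] (mean, across all concurrent requests)
"""

print(parse_ab_output(SAMPLE))
```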

Comparison with manager.py runserver

  • N C : 100000 10000
Total of 7112 requests completed
  • N C : 10000 1000
Total of 5742 requests completed
  • N C : 1000 100
Total of 998 requests completed
  • N C : 500 50
Complete requests:      500
Failed requests:        0
Time per request:       105.049 [ms] (mean)
Time per request:       2.101 [ms] (mean, across all concurrent requests)
  50%     21
 100%    623 (longest request)

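Plain manager runserver performs far worse. It completed only 7,112 of 100,000, 5,742 of 10,000 and 998 of 1,000 requests, and at N = 500, C = 50 its mean time per request (105.049 ms) is more than an order of magnitude above gunicorn's 4.407 ms at the same concurrency level. The remaining APIs were therefore not tested with manager runserver.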

/api/auth

This endpoint also holds up at 100,000 requests with a concurrency level of 10,000:

ab -n100000 -c 10000 http://cn16356:8000/api/auth

Results without cookies:

Complete requests:      100000
Failed requests:        72
   (Connect: 0, Receive: 0, Length: 72, Exceptions: 0)
Time per request:       1102.097 [ms] (mean)
Time per request:       0.110 [ms] (mean, across all concurrent requests)
  50%    178
 100%   7629 (longest request)

Results with cookies:

ab -C newt_sessionid=ioqx6g5hzzayp76vi7x26e1gr7cgq9qc -n100000  http://cn16356:8000/api/auth

This path involves database operations on the web server, and since SQLite is only moderately fast, responses are much slower.

Complete requests:      10000
Failed requests:        977
   (Connect: 0, Receive: 0, Length: 977, Exceptions: 0)
Requests per second:    316.20 [#/sec] (mean)
Time per request:       316.253 [ms] (mean)
Time per request:       3.163 [ms] (mean, across all concurrent requests)
Transfer rate:          155.87 [Kbytes/sec] received
  50%     87
 100%  14068 (longest request)

The -c value matters: switching to -c50 improves throughput markedly:

Complete requests:      10000
Failed requests:        0
Requests per second:    945.48 [#/sec] (mean)
Time per request:       52.883 [ms] (mean)
Time per request:       1.058 [ms] (mean, across all concurrent requests)
  50%     46
 100%    378 (longest request)
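
The command shown for the run with cookies carries no -c option, but the concurrency can be recovered from the report itself. ab computes the mean time per request as concurrency × total time / requests and the "across all concurrent requests" figure as total time / requests, so their ratio is the concurrency level. A small sketch (the helper name `implied_concurrency` is hypothetical):

```python
def implied_concurrency(mean_ms, mean_across_all_ms):
    """Estimate ab's -c value from its two "Time per request" figures."""
    return round(mean_ms / mean_across_all_ms)


# Run with cookies: 316.253 ms and 3.163 ms.
print(implied_concurrency(316.253, 3.163))  # prints 100
# Run with -c50: 52.883 ms and 1.058 ms.
print(implied_concurrency(52.883, 1.058))   # prints 50
```

By this estimate the slower run with cookies used a concurrency level of about 100, so halving it to 50 roughly tripled the throughput (316.20 to 945.48 requests per second).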

/api/auth POST

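POST requests are somewhat slower, but the endpoint can still take a load on the order of 10,000 requests. A more suitable peak load is a few thousand simultaneous requests, processed at 250+ requests per second.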

 ab -n 10000 -c 1000 -p 'login.txt' -T  'application/x-www-form-urlencoded'  http://cn16356:8000/api/auth

Results:

  • N C : 1000 100
Complete requests:      1000
Failed requests:        0
Time per request:       387.322 [ms] (mean)
Time per request:       3.873 [ms] (mean, across all concurrent requests)
  50%    324
 100%   1101 (longest request)
  • N C : 10000 1000
Complete requests:      10000
Failed requests:        933
   (Connect: 0, Receive: 0, Length: 933, Exceptions: 0)
Requests per second:    270.39 [#/sec] (mean)
Time per request:       3698.332 [ms] (mean)
Time per request:       3.698 [ms] (mean, across all concurrent requests)
  50%   3079
 100%  11303 (longest request)
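
The -p option expects a file holding the raw request body, which for -T 'application/x-www-form-urlencoded' is a URL-encoded form string. The contents of login.txt are not shown on this page, so the sketch below only illustrates how such a file could be produced; the helper name `write_form_body` and the field names `username` and `password` are assumptions, not the API's documented parameters.

```python
from urllib.parse import urlencode


def write_form_body(path, fields):
    """Write `fields` as an application/x-www-form-urlencoded body for ab -p."""
    body = urlencode(fields)  # percent-encodes reserved and non-ASCII characters
    with open(path, "w", encoding="ascii") as handle:
        handle.write(body)
    return body
```

For example, `write_form_body("login.txt", {"username": "demo_user", "password": "demo_pass"})` writes a single line without a trailing newline, so ab sends exactly the encoded fields as the body.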

/api/auth DELETE

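ab offers no way to test DELETE requests, and even if it did, this endpoint would be hard to test. Its performance should fall between that of GET and POST.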
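
DELETE requests can instead be timed with a short script. The following is a minimal sketch using only the Python standard library; it was not part of the original measurements, the function names are made up, and its thread-based client is much slower than ab, so its figures are not comparable with the table above. Whether repeated DELETE calls against one session are meaningful depends on the API and is not addressed here.

```python
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_delete(url, cookie=None, timeout=10.0):
    """Send one DELETE request and return (status_code, elapsed_seconds)."""
    request = urllib.request.Request(url, method="DELETE")
    if cookie:
        request.add_header("Cookie", cookie)
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            response.read()
            status = response.status
    except urllib.error.HTTPError as error:
        status = error.code  # the server answered with an error status
    except OSError:
        status = 0  # connection-level failure (URLError is an OSError)
    return status, time.perf_counter() - start


def run_delete_load(url, total, concurrency, cookie=None):
    """Issue `total` DELETE requests from `concurrency` worker threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: timed_delete(url, cookie), range(total)))
    time_taken = time.perf_counter() - start
    failed = sum(1 for status, _ in results if status == 0 or status >= 400)
    return {
        "requests": total,
        "failed": failed,
        "requests_per_second": total / time_taken,
        "mean_ms": 1000 * sum(elapsed for _, elapsed in results) / total,
    }
```

A call such as `run_delete_load("http://cn16356:8000/api/auth", total=100, concurrency=10, cookie="newt_sessionid=...")` returns a small summary dictionary; the cookie value is left elided on purpose.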

/api/file/<machine>/<path>

Responds normally under a load on the order of 100,000 requests.

ab -C newt_sessionid=ioqx6g5hzzayp76vi7x26e1gr7cgq9qc -n10000 -c 100 http://cn16356:8000/api/file/ln3/~/log.1

Testing showed that the service crashes when -c is too large.

Complete requests:      10000
Failed requests:        0
Requests per second:    728.03 [#/sec] (mean)
Time per request:       137.357 [ms] (mean)
Time per request:       1.374 [ms] (mean, across all concurrent requests)
  50%    107
 100%   1747 (longest request)

/api/file/<machine>/<path>?download=True

ab -C newt_sessionid=ioqx6g5hzzayp76vi7x26e1gr7cgq9qc -n10000 -c 100  http://cn16356:8000/api/file/ln3/~/log.1?download=True
Complete requests:      10000
Failed requests:        0
Requests per second:    833.91 [#/sec] (mean)
Time per request:       119.917 [ms] (mean)
Time per request:       1.199 [ms] (mean, across all concurrent requests)
Transfer rate:          368.09 [Kbytes/sec] received
  50%    105
 100%    627 (longest request)

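Thanks to the asynchronous mechanism the API responds quickly; how long the actual file download takes depends on the state of the file system.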

/api/file/<machine>/<token>?download=True

Surprisingly, this is faster than the other APIs.

ab -C newt_sessionid=ioqx6g5hzzayp76vi7x26e1gr7cgq9qc -n10000 -c50  http://cn16356:8000/api/file/ln3/6a5d8402-db00-4fd1-867a-e2693c70b77d?download=True
Complete requests:      10000
Failed requests:        0
Requests per second:    774.88 [#/sec] (mean)
Time per request:       64.526 [ms] (mean)
Time per request:       1.291 [ms] (mean, across all concurrent requests)
Transfer rate:          304.96 [Kbytes/sec] received
  50%     56
 100%    650 (longest request)

PUT /api/file/<machine>/<PATH>

 ab -C newt_sessionid=ioqx6g5hzzayp76vi7x26e1gr7cgq9qc -n10000 -c50 -u ~/log.1  http://cn16356:8000/api/file/ln3/~/log.2

Throughput is good. -c50 appears to be the more suitable setting; going back to the /api/auth test confirmed this.

Complete requests:      10000
Failed requests:        0
Requests per second:    841.84 [#/sec] (mean)
Time per request:       59.394 [ms] (mean)
Time per request:       1.188 [ms] (mean, across all concurrent requests)
  50%     54
 100%    327 (longest request)

GET POST /api/command/<machine>

ab -C newt_sessionid=ioqx6g5hzzayp76vi7x26e1gr7cgq9qc -p "command.txt" -T 'application/x-www-form-urlencoded' -c50 -n10000 http://cn16356:8000/api/command/ln3

Complete requests:      10000
Failed requests:        0
Write errors:           0
Requests per second:    850.56 [#/sec] (mean)
Time per request:       58.785 [ms] (mean)
Time per request:       1.176 [ms] (mean, across all concurrent requests)
Transfer rate:          375.44 [Kbytes/sec] received
                        202.67 kb/s sent
                        578.11 kb/s total
  50%     54
 100%    319 (longest request)

POST /api/job/<machine>

ab -C newt_sessionid=ioqx6g5hzzayp76vi7x26e1gr7cgq9qc -p "job.txt" -T 'application/x-www-form-urlencoded' -n100 -c20 http://cn16356:8000/api/job/ln3

Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      45200 bytes
Total body sent:        25300
HTML transferred:       10300 bytes
Requests per second:    460.05 [#/sec] (mean)
Time per request:       43.473 [ms] (mean)
Time per request:       2.174 [ms] (mean, across all concurrent requests)
Transfer rate:          203.07 [Kbytes/sec] received
                        113.67 kb/s sent
                        316.74 kb/s total
  50%     32
 100%     84 (longest request)

ab -C newt_sessionid=ioqx6g5hzzayp76vi7x26e1gr7cgq9qc -p "job.txt" -T 'application/x-www-form-urlencoded' -n500 -c50 http://cn16356:8000/api/job/ln3

Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      226000 bytes
Total body sent:        126500
HTML transferred:       51500 bytes
Requests per second:    574.06 [#/sec] (mean)
Time per request:       87.098 [ms] (mean)
Time per request:       1.742 [ms] (mean, across all concurrent requests)
Transfer rate:          253.40 [Kbytes/sec] received
                        141.83 kb/s sent
                        395.23 kb/s total
  50%     56
 100%    354 (longest request)

Responses are still fast, but under the 500-request test SLURM promptly started reporting errors (with 100 requests it still behaved fairly normally):

yhqueue: error: slurm_receive_msg: Socket timed out on send/recv operation
slurm_load_jobs error: Socket timed out on send/recv operation

Checking the job status through ACCT, however, showed that all jobs had been handled normally.

GET /api/job/<machine>

ab -C newt_sessionid=ioqx6g5hzzayp76vi7x26e1gr7cgq9qc  -n1000 -c50 http://cn16356:8000/api/job/ln3/

Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      452000 bytes
HTML transferred:       103000 bytes
Requests per second:    608.39 [#/sec] (mean)
Time per request:       82.183 [ms] (mean)
Time per request:       1.644 [ms] (mean, across all concurrent requests)
Transfer rate:          268.55 [Kbytes/sec] received
  50%     52
 100%    457 (longest request)

GET /api/job/<machine>/<ID>

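Should this test be skipped for now to avoid affecting SLURM? This endpoint queries job information through yhacct and also performs Django database operations, so it is much slower: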

ab -C newt_sessionid=ioqx6g5hzzayp76vi7x26e1gr7cgq9qc  -n1000 -c50 http://cn16356:8000/api/job/ln3/5028248
Concurrency Level:      50
Time taken for tests:   16.868 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      452000 bytes
HTML transferred:       103000 bytes
Requests per second:    59.29 [#/sec] (mean)
Time per request:       843.380 [ms] (mean)
Time per request:       16.868 [ms] (mean, across all concurrent requests)
Transfer rate:          26.17 [Kbytes/sec] received

Percentage of the requests served within a certain time (ms)
  50%    739
 100%   5267 (longest request)

DELETE /api/job/

ab cannot be used to test DELETE requests.

GET /api/async/<id>

ab -C newt_sessionid=ioqx6g5hzzayp76vi7x26e1gr7cgq9qc  -n10000 -c50 http://cn16356:8000/api/async/ce898f36-ff30-49e4-8a9d-b8c9c4cb6e7e

Concurrency Level:      50
Time taken for tests:   11.198 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      5110000 bytes
HTML transferred:       1670000 bytes
Requests per second:    893.01 [#/sec] (mean)
Time per request:       55.990 [ms] (mean)
Time per request:       1.120 [ms] (mean, across all concurrent requests)
Transfer rate:          445.64 [Kbytes/sec] received
  50%     49
 100%    445 (longest request)

Problems found during testing

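  • If too many files accumulate in the temporary folder, the system becomes unavailable.
  • A sudden burst of yhbatch requests temporarily puts SLURM into an error state: "error: slurm_receive_msg: Socket timed out on send/recv operation" and "slurm_load_jobs error: Socket timed out on send/recv operation".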

Solutions:

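  1. Monitor the number of temporary files, or change the file transfer method once more; direct transfer with resume support may be the better option.
  2. Put job submissions into a dedicated queue and limit them to a suitable rate.
  3. For the education platform, handle small jobs with THHT's streaming job processing mode.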
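
The second solution, a dedicated queue with a rate limit, could look roughly like the sketch below. It is not the project's implementation: the class name `RateLimitedJobQueue` is invented, and `submit` stands for whatever callable actually hands a job to SLURM (for example a wrapper around yhbatch), which is assumed rather than shown.

```python
import queue
import threading
import time


class RateLimitedJobQueue:
    """Queue job submissions and release at most `max_per_second` of them."""

    def __init__(self, submit, max_per_second):
        self._submit = submit  # hypothetical callable that submits one job
        self._interval = 1.0 / max_per_second
        self._jobs = queue.Queue()
        self.errors = []  # (job, exception) pairs for failed submissions
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def put(self, job):
        """Enqueue a job; returns immediately so the web request is not blocked."""
        self._jobs.put(job)

    def join(self):
        """Block until every queued job has been handed to `submit`."""
        self._jobs.join()

    def _run(self):
        while True:
            job = self._jobs.get()
            started = time.monotonic()
            try:
                self._submit(job)
            except Exception as error:  # keep the worker alive on failures
                self.errors.append((job, error))
            finally:
                self._jobs.task_done()
            # Sleep for whatever remains of this job's time slot.
            remaining = self._interval - (time.monotonic() - started)
            if remaining > 0:
                time.sleep(remaining)
```

A single worker thread serialises the submissions, so bursts from the API are smoothed out before they reach the scheduler; a suitable `max_per_second` would have to be found by testing against SLURM itself.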