Several times a day our nginx (1.1.19 on Ubuntu 12.04 LTS) hangs for a few seconds (the longest stall was 53 seconds) and waits before delivering data. The client sees no error; the request just takes longer. This affects every request during that period (CGI or the status module), and all of them are processed as soon as the stall ends.
I have a screen session open on the server and curl the status page once per second:
{"time":"2016-09-02T10:10:21+02:00","host":"app1","data":"nginx","reading":1,"writing":4,"waiting":0}
{"time":"2016-09-02T10:10:22+02:00","host":"app1","data":"nginx","reading":1,"writing":4,"waiting":0}
{"time":"2016-09-02T10:10:50+02:00","host":"app1","data":"nginx","reading":3,"writing":9,"waiting":0}
{"time":"2016-09-02T10:11:43+02:00","host":"app1","data":"nginx","reading":5,"writing":98,"waiting":0}
{"time":"2016-09-02T10:11:44+02:00","host":"app1","data":"nginx","reading":0,"writing":25,"waiting":0}
{"time":"2016-09-02T10:11:45+02:00","host":"app1","data":"nginx","reading":3,"writing":7,"waiting":0}
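To make the stall easier to spot in the samples above, a small script can flag consecutive samples that are further apart than expected. This is only a sketch: the function name find_gaps and the 5-second threshold are my own assumptions, and it only relies on the "time" field that the samples already contain.

```python
import json
from datetime import datetime

# The six status samples quoted above, one JSON object per line.
SAMPLE = [
    '{"time":"2016-09-02T10:10:21+02:00","host":"app1","data":"nginx","reading":1,"writing":4,"waiting":0}',
    '{"time":"2016-09-02T10:10:22+02:00","host":"app1","data":"nginx","reading":1,"writing":4,"waiting":0}',
    '{"time":"2016-09-02T10:10:50+02:00","host":"app1","data":"nginx","reading":3,"writing":9,"waiting":0}',
    '{"time":"2016-09-02T10:11:43+02:00","host":"app1","data":"nginx","reading":5,"writing":98,"waiting":0}',
    '{"time":"2016-09-02T10:11:44+02:00","host":"app1","data":"nginx","reading":0,"writing":25,"waiting":0}',
    '{"time":"2016-09-02T10:11:45+02:00","host":"app1","data":"nginx","reading":3,"writing":7,"waiting":0}',
]


def find_gaps(lines, threshold_s=5.0):
    """Return (previous_time, current_time, gap_seconds) for each pair of
    consecutive samples that are more than threshold_s seconds apart."""
    gaps = []
    prev = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        ts = datetime.fromisoformat(json.loads(line)["time"])
        if prev is not None:
            delta = (ts - prev).total_seconds()
            if delta > threshold_s:
                gaps.append((prev.isoformat(), ts.isoformat(), delta))
        prev = ts
    return gaps


if __name__ == "__main__":
    for start, end, seconds in find_gaps(SAMPLE):
        print(f"{start} -> {end}: {seconds:.0f}s without a sample")
```

Since the poller runs once per second, any gap much larger than one second means the curl call itself was stuck, which lines up with the jump in "writing" right after the stall.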
The gap is not caused by errors; the requests simply take longer than usual as seen from the outside. The access log contains no failed requests. The only thing you can notice there is the gap itself.
127.0.0.1 - - [02/Sep/2016:10:10:17 +0200] "GET /basic_status HTTP/1.1" 200 121 "-" "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3"
127.0.0.1 - - [02/Sep/2016:10:10:18 +0200] "GET /basic_status HTTP/1.1" 200 121 "-" "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3"
127.0.0.1 - - [02/Sep/2016:10:10:20 +0200] "GET /basic_status HTTP/1.1" 200 121 "-" "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3"
127.0.0.1 - - [02/Sep/2016:10:10:21 +0200] "GET /basic_status HTTP/1.1" 200 121 "-" "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3"
127.0.0.1 - - [02/Sep/2016:10:10:22 +0200] "GET /basic_status HTTP/1.1" 200 121 "-" "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3"
I checked the nginx error log, the fpm log and so on, but there are no errors at the moment.
user www-data;
worker_processes 4;
pid /var/run/nginx.pid;
events {
    worker_connections 768;
}
http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    large_client_header_buffers 4 80k;
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;
    gzip on;
    gzip_disable "msie6";
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}
server {
    listen 80;
    server_name app1;
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;
    root /var/www;
    location /basic_status {
        stub_status on;
    }
}
I also log the number of TIME_WAIT quadruples, but the numbers are not large:
{"time":"2016-09-02T10:10:54+02:00","host":"app1","data":"time_wait","httpFromLB":1153,"httpFromLocal":604,"mysqlToDb":1988,"memcacheToLocal":250, "cgiToLocal":1527}
{"time":"2016-09-02T10:10:55+02:00","host":"app1","data":"time_wait","httpFromLB":1153,"httpFromLocal":604,"mysqlToDb":1991,"memcacheToLocal":251, "cgiToLocal":1527}
{"time":"2016-09-02T10:10:56+02:00","host":"app1","data":"time_wait","httpFromLB":1153,"httpFromLocal":604,"mysqlToDb":1992,"memcacheToLocal":252, "cgiToLocal":1527}
{"time":"2016-09-02T10:10:57+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1628,"memcacheToLocal":213, "cgiToLocal":1236}
{"time":"2016-09-02T10:10:58+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1629,"memcacheToLocal":214, "cgiToLocal":1236}
{"time":"2016-09-02T10:10:59+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1631,"memcacheToLocal":215, "cgiToLocal":1236}
{"time":"2016-09-02T10:11:00+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1632,"memcacheToLocal":216, "cgiToLocal":1236}
{"time":"2016-09-02T10:11:01+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1633,"memcacheToLocal":217, "cgiToLocal":1236}
{"time":"2016-09-02T10:11:03+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1636,"memcacheToLocal":218, "cgiToLocal":1236}
{"time":"2016-09-02T10:11:04+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1637,"memcacheToLocal":219, "cgiToLocal":1236}
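For anyone who wants to reproduce counters like these, here is a rough sketch of one way to count TIME_WAIT sockets per service from the Linux /proc/net/tcp table. It is not necessarily how the numbers above were produced: the function name count_time_wait and the port-to-label mapping (80, 3306, 11211) are assumptions for illustration, and the sample rows are made up to show the format.

```python
from collections import Counter

TIME_WAIT = "06"  # state code used for TIME_WAIT in /proc/net/tcp

# Hypothetical mapping from service port to a label; adjust to the real setup.
SERVICE_PORTS = {80: "http", 3306: "mysql", 11211: "memcache"}

# Made-up rows in /proc/net/tcp layout: slot, local addr:port, remote addr:port, state
SAMPLE = [
    "  sl  local_address rem_address   st tx_queue rx_queue",
    "   0: 0100007F:0050 0100007F:C350 06 00000000:00000000",
    "   1: 0100007F:C351 0100007F:0CEA 06 00000000:00000000",
    "   2: 0100007F:0050 0100007F:C352 01 00000000:00000000",
    "   3: 0100007F:C353 0100007F:2BCB 06 00000000:00000000",
]


def count_time_wait(lines, service_ports):
    """Count TIME_WAIT sockets whose local or remote port is a known service port."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip the header, short rows, and sockets in any other state.
        if len(fields) < 4 or fields[3] != TIME_WAIT:
            continue
        # Addresses look like HEXIP:HEXPORT; the port is hexadecimal.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        for port in (local_port, remote_port):
            if port in service_ports:
                counts[service_ports[port]] += 1
                break  # count each socket once
    return dict(counts)


if __name__ == "__main__":
    print(count_time_wait(SAMPLE, SERVICE_PORTS))
```

On a real host the lines would come from reading /proc/net/tcp (and /proc/net/tcp6 for IPv6) instead of SAMPLE.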
I have ruled out the application itself: its first line of code writes a timestamp, and that timestamp falls at the end of the delay, so the time is lost before the application even starts.
I have no idea where to look next. Any ideas?