Nginx stalls from time to time

Several times a day our nginx (1.1.19 on Ubuntu 12.04 LTS) stalls for a few seconds (the longest stall so far was 53 seconds) and delays delivering responses. The client sees no error; the request just takes longer. This affects all requests during that period (CGI or the status module), and all of them are processed as soon as the stall ends.

I keep a screen session open on the server and curl the status page every second:

{"time":"2016-09-02T10:10:21+02:00","host":"app1","data":"nginx","reading":1,"writing":4,"waiting":0}
{"time":"2016-09-02T10:10:22+02:00","host":"app1","data":"nginx","reading":1,"writing":4,"waiting":0}
{"time":"2016-09-02T10:10:50+02:00","host":"app1","data":"nginx","reading":3,"writing":9,"waiting":0}
{"time":"2016-09-02T10:11:43+02:00","host":"app1","data":"nginx","reading":5,"writing":98,"waiting":0}
{"time":"2016-09-02T10:11:44+02:00","host":"app1","data":"nginx","reading":0,"writing":25,"waiting":0}
{"time":"2016-09-02T10:11:45+02:00","host":"app1","data":"nginx","reading":3,"writing":7,"waiting":0}

The gap is not caused by errors; seen from outside, the requests just take longer than usual. The access log contains no failed requests. You can simply see the gap.

127.0.0.1 - - [02/Sep/2016:10:10:17 +0200] "GET /basic_status HTTP/1.1" 200 121 "-" "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3"
127.0.0.1 - - [02/Sep/2016:10:10:18 +0200] "GET /basic_status HTTP/1.1" 200 121 "-" "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3"
127.0.0.1 - - [02/Sep/2016:10:10:20 +0200] "GET /basic_status HTTP/1.1" 200 121 "-" "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3"
127.0.0.1 - - [02/Sep/2016:10:10:21 +0200] "GET /basic_status HTTP/1.1" 200 121 "-" "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3"
127.0.0.1 - - [02/Sep/2016:10:10:22 +0200] "GET /basic_status HTTP/1.1" 200 121 "-" "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3"
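
Spotting such gaps by eye gets tedious on a large log, so a small script can help. The sketch below assumes the default combined log format with the timestamp in square brackets, as in the lines above, and an English or C locale so that `%b` matches `Sep`. The function name `find_gaps` and the one-second threshold are illustrative choices, not part of nginx.

```python
import re
from datetime import datetime

# Timestamp between square brackets, e.g. [02/Sep/2016:10:10:17 +0200]
_TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def find_gaps(lines, threshold_s=1.0):
    """Return (previous, current, seconds) for consecutive log entries
    whose timestamps are more than threshold_s apart."""
    gaps = []
    previous = None
    for line in lines:
        match = _TIMESTAMP.search(line)
        if match is None:
            continue  # skip lines without a bracketed timestamp
        current = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        if previous is not None:
            seconds = (current - previous).total_seconds()
            if seconds > threshold_s:
                gaps.append((previous, current, seconds))
        previous = current
    return gaps
```

For example, `find_gaps(open("/var/log/nginx/access.log"))` would list every pair of neighbouring entries more than one second apart. On a busy server the log should first be filtered to the `/basic_status` requests so that the expected spacing is known.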

I checked the nginx error log, the fpm log, and so on, but there are no errors at the times in question.

user www-data;
worker_processes 4;
pid /var/run/nginx.pid;

events {
        worker_connections 768;
}

http {

        sendfile on;
        tcp_nopush on;
        tcp_nodelay on;
        keepalive_timeout 65;
        types_hash_max_size 2048;
        include /etc/nginx/mime.types;
        default_type application/octet-stream;
        large_client_header_buffers 4 80k;
        access_log /var/log/nginx/access.log;
        error_log /var/log/nginx/error.log;
        gzip on;
        gzip_disable "msie6";
        include /etc/nginx/conf.d/*.conf;
        include /etc/nginx/sites-enabled/*;
}

server {
    listen 80;

    server_name app1;

    access_log /var/log/nginx/access.log;
    error_log  /var/log/nginx/error.log;

    root /var/www;

    location /basic_status {
        stub_status on;
    }
}

I also log the number of TIME_WAIT quadruples, but the counts are not large:

{"time":"2016-09-02T10:10:54+02:00","host":"app1","data":"time_wait","httpFromLB":1153,"httpFromLocal":604,"mysqlToDb":1988,"memcacheToLocal":250, "cgiToLocal":1527}
{"time":"2016-09-02T10:10:55+02:00","host":"app1","data":"time_wait","httpFromLB":1153,"httpFromLocal":604,"mysqlToDb":1991,"memcacheToLocal":251, "cgiToLocal":1527}
{"time":"2016-09-02T10:10:56+02:00","host":"app1","data":"time_wait","httpFromLB":1153,"httpFromLocal":604,"mysqlToDb":1992,"memcacheToLocal":252, "cgiToLocal":1527}
{"time":"2016-09-02T10:10:57+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1628,"memcacheToLocal":213, "cgiToLocal":1236}
{"time":"2016-09-02T10:10:58+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1629,"memcacheToLocal":214, "cgiToLocal":1236}
{"time":"2016-09-02T10:10:59+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1631,"memcacheToLocal":215, "cgiToLocal":1236}
{"time":"2016-09-02T10:11:00+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1632,"memcacheToLocal":216, "cgiToLocal":1236}
{"time":"2016-09-02T10:11:01+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1633,"memcacheToLocal":217, "cgiToLocal":1236}
{"time":"2016-09-02T10:11:03+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1636,"memcacheToLocal":218, "cgiToLocal":1236}
{"time":"2016-09-02T10:11:04+02:00","host":"app1","data":"time_wait","httpFromLB":902,"httpFromLocal":496,"mysqlToDb":1637,"memcacheToLocal":219, "cgiToLocal":1236}

I have ruled out the application itself: its first line of code records a timestamp, and that timestamp falls at the end of the delay.

I have no idea where to investigate next. Any ideas?