
Why does nginx automatically restart (and fail to do so)?

From time to time, for a reason I don't understand (I'm a very amateur sysadmin), nginx automatically restarts but fails to come back up, which causes a service interruption. It happened this morning:

$ journalctl -u nginx
-- Logs begin at Mon 2018-09-03 11:46:24 CEST, end at Tue 2018-09-04 09:30:22 CEST. --
Sep 04 07:55:21 vpsXXXXXX.ovh.net systemd[1]: Stopping A high performance web server and a reverse proxy server...
Sep 04 07:55:21 vpsXXXXXX.ovh.net systemd[1]: Stopped A high performance web server and a reverse proxy server.
Sep 04 07:55:27 vpsXXXXXX.ovh.net systemd[1]: Starting A high performance web server and a reverse proxy server...
Sep 04 07:55:27 vpsXXXXXX.ovh.net nginx[29333]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 07:55:27 vpsXXXXXX.ovh.net nginx[29335]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] still could not bind()
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Control process exited, code=exited status=1
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: Failed to start A high performance web server and a reverse proxy server.
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Unit entered failed state.
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Failed with result 'exit-code'.

An hour and a half later I realized my web server was down, so I simply ran systemctl restart nginx and everything worked fine again.

Sep 04 09:23:48 vpsXXXXXX.ovh.net systemd[1]: Starting A high performance web server and a reverse proxy server...
Sep 04 09:23:48 vpsXXXXXX.ovh.net nginx[30003]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 09:23:48 vpsXXXXXX.ovh.net nginx[30004]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 09:23:48 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Failed to read PID from file /run/nginx.pid: Invalid argument
Sep 04 09:23:48 vpsXXXXXX.ovh.net systemd[1]: Started A high performance web server and a reverse proxy server.

Unfortunately, I didn't note down the last time this happened, and it was apparently too long ago for the logs to still cover it:

$ zgrep "bind()" /var/log/nginx/*
(just this morning's episode)

... but I'm sure I've had this problem at least 2 or 3 times this year (which is acceptable, but annoying).

It doesn't seem to be related to a server reboot:

$ uptime
 09:43:37 up 12 days,  7:00,  1 user,  load average: 0.00, 0.00, 0.00

This is my root user's crontab. I don't think it's related, since the schedule doesn't seem to match, but as I can't think of where else to look, here it is:

# let's encrypt renewal
0 5 1 * * certbot renew --authenticator standalone --installer  nginx --pre-hook "systemctl stop nginx" --post-hook "systemctl start nginx" -n
# automatically update debian
0 5 * * 1 apt -qq update && apt dist-upgrade -qq -y
# couldn't find another way to make sure that datetime.today() actually returns today's date in my flask apps
0 0 * * * systemctl reload uwsgi.service

What is causing this?

EDIT: In /var/log/nginx/error.log, I can find some more information:

2018/09/04 07:55:19 [warn] 29269#29269: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:19 [info] 29269#29269: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/04 07:55:20 [warn] 29271#29271: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:20 [info] 29271#29271: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/04 07:55:20 [warn] 29273#29273: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:20 [info] 29273#29273: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/04 07:55:26 [warn] 29305#29305: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:26 [notice] 29305#29305: signal process started
2018/09/04 07:55:26 [error] 29305#29305: open() "/run/nginx.pid" failed (2: No such file or directory)
2018/09/04 07:55:26 [warn] 29306#29306: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:27 [warn] 29333#29333: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
(then the same lines journalctl displays)

Why would this open() "/run/nginx.pid" failed (2: No such file or directory) error suddenly happen?

EDIT2: It happened again this morning:

$ cat /var/log/nginx/error.log
2018/09/13 10:12:19 [warn] 7230#7230: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:19 [info] 7230#7230: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:31 [warn] 7243#7243: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:31 [notice] 7243#7243: signal process started
2018/09/13 10:12:31 [error] 7243#7243: open() "/run/nginx.pid" failed (2: No such file or directory)
2018/09/13 10:12:31 [warn] 7244#7244: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:32 [warn] 7247#7247: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:32 [info] 7247#7247: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:32 [warn] 7251#7251: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:32 [info] 7251#7251: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:33 [warn] 7255#7255: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:33 [info] 7255#7255: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:33 [warn] 7256#7256: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: still could not bind()
2018/09/13 10:12:36 [alert] 7245#7245: unlink() "/run/nginx.pid" failed (2: No such file or directory)
-------- this is when I realized my web services were down and manually restarted nginx
2018/09/13 11:25:11 [warn] 7578#7578: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 11:25:11 [info] 7578#7578: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 11:25:12 [warn] 7579#7579: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size

I guess I'll just add a cron job that checks every 5 minutes whether nginx is up and starts it if it isn't, but that's so dirty it makes me want to puke.
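For reference, a minimal sketch of what that watchdog might look like. The function name ensure_running, the log message, and the script path in the crontab comment are all hypothetical; the only real commands it relies on are systemctl is-active and systemctl start.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: start a systemd unit only if it is not active.
# The function name and the message text are illustrative assumptions.
ensure_running() {
    unit="$1"
    # "is-active --quiet" exits 0 when the unit is active, non-zero otherwise.
    if systemctl is-active --quiet "$unit"; then
        return 0
    fi
    echo "$unit is not active, starting it"
    systemctl start "$unit"
}

# Act only when a unit name is given on the command line, e.g. from cron.
# Assumed crontab entry (the script path is hypothetical):
# */5 * * * * /usr/local/sbin/service-watchdog.sh nginx
if [ "$#" -gt 0 ]; then
    ensure_running "$1"
fi
```

This only papers over the failed restart; it does not explain why the ports were still in use or why /run/nginx.pid was missing.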