Назад | Перейти на главную страницу

Случайный сбой MariaDB 10.3 InnoDB из-за нехватки памяти на VPS

после перехода на новый VPS на моем сервере иногда возникают сбои базы данных. Это может произойти 1-2 раза за 2-3 дня. Сервер работает под управлением MariaDB 10.3 на CentOS 7 с nginx 1.16. Виртуальный сервер имеет 1 ГБ памяти, подкачки нет. Во время обычного дневного пикового трафика я вижу, что занято около 600–700 МБ памяти, а 300–400 МБ свободно.

Я нашел журналы в /var/log/messages (ниже), из которого я понял, что InnoDB вылетает из-за нехватки памяти (пожалуйста, поправьте меня, если я неправильно прочитал). Но я не обнаружил никаких DDoS-атак или другой подозрительной активности на момент сбоя.

Есть ли способ выяснить, в чем причина исчерпания памяти? Должен ли я настроить какой-нибудь буфер / настройку InnoDB ..? Что мне делать, чтобы избежать сбоев?

На предыдущем VPS сервер работал с этой конфигурацией в течение 4-5 лет без сбоев (Maria 10.1 / nginx / CentOS 6).

Я не специалист по базам данных, поэтому буду благодарен за любой совет. Спасибо.

Вот это /var/log/messages: (вылет начался в 22:20)

May 27 22:20:01 mysite systemd: Created slice User Slice of root.
May 27 22:20:01 mysite systemd: Started Session 28800 of user root.
May 27 22:20:01 mysite systemd: Created slice User Slice of nginx.
May 27 22:20:01 mysite systemd: Started Session 28801 of user nginx.
May 27 22:20:11 mysite kernel: mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
May 27 22:20:11 mysite kernel: mysqld cpuset=/ mems_allowed=0
May 27 22:20:11 mysite kernel: CPU: 0 PID: 26756 Comm: mysqld Not tainted 3.10.0-957.12.1.el7.x86_64 #1
May 27 22:20:11 mysite kernel: Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.10.1-1ubuntu1~cloud0 04/01/2014
May 27 22:20:11 mysite kernel: Call Trace:
May 27 22:20:11 mysite kernel: [<ffffffff96363021>] dump_stack+0x19/0x1b
May 27 22:20:11 mysite kernel: [<ffffffff9635da4a>] dump_header+0x90/0x229
May 27 22:20:11 mysite kernel: [<ffffffff95d01092>] ? ktime_get_ts64+0x52/0xf0
May 27 22:20:11 mysite kernel: [<ffffffff95d582df>] ? delayacct_end+0x8f/0xb0
May 27 22:20:11 mysite kernel: [<ffffffffc033f71a>] ? virtballoon_oom_notify+0x2a/0x70 [virtio_balloon]
May 27 22:20:11 mysite kernel: [<ffffffff95dba634>] oom_kill_process+0x254/0x3d0
May 27 22:20:11 mysite kernel: [<ffffffff95dba0dd>] ? oom_unkillable_task+0xcd/0x120
May 27 22:20:11 mysite kernel: [<ffffffff95dba186>] ? find_lock_task_mm+0x56/0xc0
May 27 22:20:11 mysite kernel: [<ffffffff95dbae76>] out_of_memory+0x4b6/0x4f0
May 27 22:20:11 mysite kernel: [<ffffffff9635e54e>] __alloc_pages_slowpath+0x5d6/0x724
May 27 22:20:11 mysite kernel: [<ffffffff95dc1254>] __alloc_pages_nodemask+0x404/0x420
May 27 22:20:11 mysite kernel: [<ffffffff95e0e148>] alloc_pages_current+0x98/0x110
May 27 22:20:11 mysite kernel: [<ffffffff95db6497>] __page_cache_alloc+0x97/0xb0
May 27 22:20:11 mysite kernel: [<ffffffff95db90f8>] filemap_fault+0x298/0x490
May 27 22:20:11 mysite kernel: [<ffffffffc01fdd0e>] __xfs_filemap_fault+0x7e/0x1d0 [xfs]
May 27 22:20:11 mysite kernel: [<ffffffff95cc2e00>] ? wake_bit_function+0x40/0x40
May 27 22:20:11 mysite kernel: [<ffffffffc01fdf0c>] xfs_filemap_fault+0x2c/0x30 [xfs]
May 27 22:20:11 mysite kernel: [<ffffffff95de462a>] __do_fault.isra.59+0x8a/0x100
May 27 22:20:11 mysite kernel: [<ffffffff95de4bdc>] do_read_fault.isra.61+0x4c/0x1b0
May 27 22:20:11 mysite kernel: [<ffffffff95de9584>] handle_pte_fault+0x2f4/0xd10
May 27 22:20:11 mysite kernel: [<ffffffff95dec0bd>] handle_mm_fault+0x39d/0x9b0
May 27 22:20:11 mysite kernel: [<ffffffff963705e3>] __do_page_fault+0x203/0x4f0
May 27 22:20:11 mysite kernel: [<ffffffff963709b6>] trace_do_page_fault+0x56/0x150
May 27 22:20:11 mysite kernel: [<ffffffff9636ff42>] do_async_page_fault+0x22/0xf0
May 27 22:20:11 mysite kernel: [<ffffffff9636c788>] async_page_fault+0x28/0x30
May 27 22:20:11 mysite kernel: Mem-Info:
May 27 22:20:11 mysite kernel: active_anon:187693 inactive_anon:17396 isolated_anon:0#012 active_file:1781 inactive_file:3687 isolated_file:0#012 unevictable:0 dirty:0 writeback:0 unstable:0#012 slab_reclaimable:3639 slab_unreclaimable:7716#012 mapped:10218 shmem:27944 pagetables:13399 bounce:0#012 free:12232 free_pcp:30 free_cma:0
May 27 22:20:11 mysite kernel: Node 0 DMA free:4592kB min:708kB low:884kB high:1060kB active_anon:8412kB inactive_anon:996kB active_file:0kB inactive_file:216kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:472kB shmem:1420kB slab_reclaimable:60kB slab_unreclaimable:380kB kernel_stack:16kB pagetables:728kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:48 all_unreclaimable? yes
May 27 22:20:11 mysite kernel: lowmem_reserve[]: 0 972 972 972
May 27 22:20:11 mysite kernel: Node 0 DMA32 free:44212kB min:44344kB low:55428kB high:66516kB active_anon:742360kB inactive_anon:68588kB active_file:7124kB inactive_file:14532kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1032048kB managed:998980kB mlocked:0kB dirty:0kB writeback:0kB mapped:40400kB shmem:110356kB slab_reclaimable:14496kB slab_unreclaimable:30484kB kernel_stack:2704kB pagetables:52868kB unstable:0kB bounce:0kB free_pcp:240kB local_pcp:240kB free_cma:0kB writeback_tmp:0kB pages_scanned:15611 all_unreclaimable? yes
May 27 22:20:11 mysite kernel: lowmem_reserve[]: 0 0 0 0
May 27 22:20:11 mysite kernel: Node 0 DMA: 2*4kB (EM) 3*8kB (EM) 5*16kB (UE) 4*32kB (UE) 2*64kB (UM) 1*128kB (E) 2*256kB (UE) 3*512kB (UEM) 2*1024kB (UE) 0*2048kB 0*4096kB = 4592kB
May 27 22:20:11 mysite kernel: Node 0 DMA32: 403*4kB (UEM) 597*8kB (UEM) 288*16kB (UEM) 116*32kB (UEM) 41*64kB (UEM) 30*128kB (UEM) 18*256kB (UEM) 18*512kB (UEM) 7*1024kB (UEM) 1*2048kB (U) 0*4096kB = 44212kB
May 27 22:20:11 mysite kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
May 27 22:20:11 mysite kernel: 33433 total pagecache pages
May 27 22:20:11 mysite kernel: 0 pages in swap cache
May 27 22:20:11 mysite kernel: Swap cache stats: add 0, delete 0, find 0/0
May 27 22:20:11 mysite kernel: Free swap  = 0kB
May 27 22:20:11 mysite kernel: Total swap = 0kB
May 27 22:20:11 mysite kernel: 262010 pages RAM
May 27 22:20:11 mysite kernel: 0 pages HighMem/MovableOnly
May 27 22:20:11 mysite kernel: 8288 pages reserved
May 27 22:20:11 mysite kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
May 27 22:20:11 mysite kernel: [ 1321]     0  1321     9404      357      23        0             0 systemd-journal
May 27 22:20:11 mysite kernel: [ 1349]     0  1349    11119      141      24        0         -1000 systemd-udevd
May 27 22:20:11 mysite kernel: [ 1573]     0  1573    13880      114      28        0         -1000 auditd
May 27 22:20:11 mysite kernel: [ 1933]    81  1933    14524      138      34        0          -900 dbus-daemon
May 27 22:20:11 mysite kernel: [ 1961]     0  1961    48774      118      35        0             0 gssproxy
May 27 22:20:11 mysite kernel: [ 2001]     0  2001     6594       83      17        0             0 systemd-logind
May 27 22:20:11 mysite kernel: [ 2027]   999  2027   153158     1754      59        0             0 polkitd
May 27 22:20:11 mysite kernel: [ 2116]   998  2116    29446      115      30        0             0 chronyd
May 27 22:20:11 mysite kernel: [ 2385]     0  2385    89521     5530      94        0             0 firewalld
May 27 22:20:11 mysite kernel: [ 2961]     0  2961    26866      500      49        0             0 dhclient
May 27 22:20:11 mysite kernel: [ 3022]     0  3022   143481     2781      98        0             0 tuned
May 27 22:20:11 mysite kernel: [ 3189]     0  3189    22577      280      44        0             0 master
May 27 22:20:11 mysite kernel: [ 3191]    89  3191    22647      286      43        0             0 qmgr
May 27 22:20:11 mysite kernel: [ 3228]     0  3228    28216      255      58        0         -1000 sshd
May 27 22:20:11 mysite kernel: [ 3229]     0  3229    66328     1182      59        0             0 rsyslogd
May 27 22:20:11 mysite kernel: [ 3235]     0  3235    31579      170      18        0             0 crond
May 27 22:20:11 mysite kernel: [ 3236]     0  3236    27526       33      10        0             0 agetty
May 27 22:20:11 mysite kernel: [ 3237]     0  3237    27526       33      11        0             0 agetty
May 27 22:20:11 mysite kernel: [15958]     0 15958   130428     6710     147        0             0 php-fpm
May 27 22:20:11 mysite kernel: [15982]     0 15982    14721      288      25        0             0 nginx
May 27 22:20:11 mysite kernel: [15983]   997 15983    15403      975      31        0             0 nginx
May 27 22:20:11 mysite kernel: [21323]   997 21323   245378    11185     347        0             0 php-fpm
May 27 22:20:11 mysite kernel: [21327]   997 21327   246079    12329     348        0             0 php-fpm
May 27 22:20:11 mysite kernel: [21331]   997 21331   244508     9902     345        0             0 php-fpm
May 27 22:20:11 mysite kernel: [21582]   997 21582   225663    12179     340        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 4073]   997  4073   224018     8808     337        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 4076]   997  4076   223517     8433     335        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5510]   997  5510   244374     7954     342        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5517]   997  5517   244782     8575     344        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5519]   997  5519   244994     9463     345        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5520]   997  5520   244763     8907     344        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5521]   997  5521   223316     8418     335        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5522]   997  5522   244493     8212     343        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5523]   997  5523   223509     9244     335        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5524]   997  5524   223996     8921     336        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5527]   997  5527   224017     8661     335        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5536]   997  5536   244945     9254     346        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5546]   997  5546   223528     8214     334        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5551]   997  5551   244849     8735     343        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5557]   997  5557   223438     8154     333        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5558]   997  5558   244862     8565     344        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5559]   997  5559   244877     8601     343        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5561]   997  5561   223924     8706     338        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5562]   997  5562   224018     8866     336        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5563]   997  5563   224008     9187     335        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5564]   997  5564   244965     8904     344        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5566]   997  5566   244403     8290     348        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5567]   997  5567   244920     9447     344        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5568]   997  5568   244514     8736     343        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5569]   997  5569   223477     7876     334        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5570]   997  5570   224004     9054     335        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5574]   997  5574   223319     7945     333        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5575]   997  5575   244729     8709     343        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5576]   997  5576   223435     8296     336        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5577]   997  5577   244711     8679     343        0             0 php-fpm
May 27 22:20:11 mysite kernel: [ 5578]   997  5578   244467     8119     344        0             0 php-fpm
May 27 22:20:11 mysite kernel: [26393]   996 26393   302816    41380     170        0             0 mysqld
May 27 22:20:11 mysite kernel: [10107]    89 10107    22603      271      45        0             0 pickup
May 27 22:20:11 mysite kernel: [11273]     0 11273    45598      242      45        0             0 crond
May 27 22:20:11 mysite kernel: [11274]     0 11274    45598      242      45        0             0 crond
May 27 22:20:11 mysite kernel: [11275]     0 11275    28294       45      11        0             0 sh
May 27 22:20:11 mysite kernel: [11276]   997 11276    28294       43      11        0             0 sh
May 27 22:20:11 mysite kernel: [11277]   997 11277    96384     2413     139        0             0 php
May 27 22:20:11 mysite kernel: [11278]     0 11278    50559     1613      54        0             0 python
May 27 22:20:11 mysite kernel: Out of memory: Kill process 26393 (mysqld) score 163 or sacrifice child
May 27 22:20:11 mysite kernel: Killed process 26393 (mysqld) total-vm:1211264kB, anon-rss:165520kB, file-rss:0kB, shmem-rss:0kB
May 27 22:20:11 mysite systemd: mariadb.service: main process exited, code=killed, status=9/KILL
May 27 22:20:11 mysite systemd: Unit mariadb.service entered failed state.
May 27 22:20:11 mysite systemd: mariadb.service failed.
May 27 22:20:11 mysite systemd: Removed slice User Slice of nginx.
May 27 22:20:16 mysite systemd: mariadb.service holdoff time over, scheduling restart.
May 27 22:20:16 mysite systemd: Stopped MariaDB 10.3.14 database server.
May 27 22:20:16 mysite systemd: Starting MariaDB 10.3.14 database server...
May 27 22:20:16 mysite mysqld: 2019-05-27 22:20:16 0 [Note] /usr/sbin/mysqld (mysqld 10.3.14-MariaDB-log) starting as process 11329 ...
May 27 22:20:17 mysite systemd: Started MariaDB 10.3.14 database server.
May 27 22:20:27 mysite systemd: Removed slice User Slice of root.
May 27 22:21:02 mysite systemd: Created slice User Slice of nginx.
May 27 22:21:02 mysite systemd: Started Session 28802 of user nginx.

А вот соответствующая /var/log/mysql/errors.log от этой аварии:

Version: '10.3.14-MariaDB-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server
2019-05-27 22:20:16 0 [Note] InnoDB: Using Linux native AIO
2019-05-27 22:20:16 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2019-05-27 22:20:16 0 [Note] InnoDB: Uses event mutexes
2019-05-27 22:20:16 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
2019-05-27 22:20:16 0 [Note] InnoDB: Number of pools: 1
2019-05-27 22:20:16 0 [Note] InnoDB: Using SSE2 crc32 instructions
2019-05-27 22:20:16 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2019-05-27 22:20:16 0 [Note] InnoDB: Completed initialization of buffer pool
2019-05-27 22:20:16 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2019-05-27 22:20:16 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=699238373
2019-05-27 22:20:17 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
2019-05-27 22:20:17 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2019-05-27 22:20:17 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2019-05-27 22:20:17 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2019-05-27 22:20:17 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2019-05-27 22:20:17 0 [Note] InnoDB: Waiting for purge to start
2019-05-27 22:20:17 0 [Note] InnoDB: 10.3.14 started; log sequence number 699238382; transaction id 395866
2019-05-27 22:20:17 0 [Note] Plugin 'FEEDBACK' is disabled.
2019-05-27 22:20:17 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2019-05-27 22:20:17 0 [Note] Recovering after a crash using tc.log
2019-05-27 22:20:17 0 [Note] InnoDB: Buffer pool(s) load completed at 190527 22:20:17
2019-05-27 22:20:17 0 [Note] Starting crash recovery...
2019-05-27 22:20:17 0 [Note] Crash recovery finished.
2019-05-27 22:20:17 0 [Note] Server socket created on IP: '::'.
2019-05-27 22:20:17 0 [Note] Reading of all Master_info entries succeded
2019-05-27 22:20:17 0 [Note] Added new Master_info '' to hash table
2019-05-27 22:20:17 0 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.3.14-MariaDB-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server

Я использую настройки по умолчанию, ничего особенного в etc/my.cnf.d/server.cnf:

# this is only for the mysqld standalone daemon
[mysqld]
key_buffer_size=16M
max_allowed_packet=8M
# log also slow queries:
slow_query_log = 1
long_query_time = 3
slow_query_log_file=/var/log/mysql/mysite-slow-query.log
#log_queries_not_using_indexes
log_error=/var/log/mysql/mysite-errors.log

Трудно проанализировать требования к памяти для такой крошечной (1 ГБ) системы. Тем более, что у вас есть весь стек в одной виртуальной машине.

Очевидный виновник - очень много потоков PHP. Они, наверное, сидят без дела. Делает nginx есть способ настроить количество запускаемых потоков? Если это так, установите его намного ниже, скажем 6. Это может быть лучшим быстрым решением.

Предложения mysqltuner были больше нацелены на производительность, чем на память. Итак, не вносите никаких из этих изменений; большинство из них усугубят проблему, выделив больше оперативной памяти.

С другой стороны, 128 МБ для buffer_pool - это опасно много при 1 ГБ ОЗУ. Тем не менее, не меняйте его (вверх или вниз), если вы не получите что-то конкретно указывающее на это в журнале ошибок.