
Linux OOM killer fires despite plenty of available memory

About once a week the OOM killer kills a postgres process on my server, even though `free` reports plenty of available memory.

I have read a few threads here and there, but I don't see any real explanation. Is it really because the server has no swap? Is it a kernel bug (Ubuntu)?

And to get ahead of the obvious suggestion: yes, I will probably add swap. But is there no other solution? Or at least an explanation? :)

Server: Physical Dell
Memory: 64gb RAM and 0 Swap
uname: Linux server-name 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Postgres version: 9.5.10  (8gb shared memory)
vm.overcommit_memory = 0

Output of free -m shortly before the last kill

              total        used        free      shared  buff/cache   available
Mem:          64312        2666         450      8699    61196        52126
Swap:             0           0           0

Kernel log from the last kill

Jun 19 21:29:49 server-name kernel: [17009377.877956] bash invoked oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0
Jun 19 21:29:49 server-name kernel: [17009377.877959] bash cpuset=/ mems_allowed=0-1
Jun 19 21:29:49 server-name kernel: [17009377.877964] CPU: 23 PID: 61771 Comm: bash Not tainted 4.4.0-62-generic #83-Ubuntu
Jun 19 21:29:49 server-name kernel: [17009377.877966] Hardware name: Dell Inc. PowerEdge M630/0R10KJ, BIOS 2.4.2 01/09/2017
Jun 19 21:29:49 server-name kernel: [17009377.877967]  0000000000000286 00000000d566bdbf ffff88001369baf0 ffffffff813f7c63
Jun 19 21:29:49 server-name kernel: [17009377.877969]  ffff88001369bcc8 ffff88010d85b800 ffff88001369bb60 ffffffff8120ad4e
Jun 19 21:29:49 server-name kernel: [17009377.877971]  0000000000000000 0000000000000700 ffffffff81e42a40 ffff8810547850c0
Jun 19 21:29:49 server-name kernel: [17009377.877973] Call Trace:
Jun 19 21:29:49 server-name kernel: [17009377.877979]  [] dump_stack+0x63/0x90
Jun 19 21:29:49 server-name kernel: [17009377.877984]  [] dump_header+0x5a/0x1c5
Jun 19 21:29:49 server-name kernel: [17009377.877988]  [] oom_kill_process+0x202/0x3c0
Jun 19 21:29:49 server-name kernel: [17009377.877990]  [] ? oom_unkillable_task+0x9e/0xd0
Jun 19 21:29:49 server-name kernel: [17009377.877992]  [] out_of_memory+0x219/0x460
Jun 19 21:29:49 server-name kernel: [17009377.877995]  [] __alloc_pages_slowpath.constprop.88+0x8fd/0xa70
Jun 19 21:29:49 server-name kernel: [17009377.877997]  [] __alloc_pages_nodemask+0x286/0x2a0
Jun 19 21:29:49 server-name kernel: [17009377.877999]  [] alloc_kmem_pages_node+0x4b/0xc0
Jun 19 21:29:49 server-name kernel: [17009377.878003]  [] copy_process+0x1be/0x1b70
Jun 19 21:29:49 server-name kernel: [17009377.878007]  [] ? handle_mm_fault+0xce0/0x1820
Jun 19 21:29:49 server-name kernel: [17009377.878010]  [] ? sched_clock+0x9/0x10
Jun 19 21:29:49 server-name kernel: [17009377.878015]  [] ? sched_clock_cpu+0x8f/0xa0
Jun 19 21:29:49 server-name kernel: [17009377.878017]  [] _do_fork+0x80/0x360
Jun 19 21:29:49 server-name kernel: [17009377.878021]  [] ? sigprocmask+0x6f/0xa0
Jun 19 21:29:49 server-name kernel: [17009377.878023]  [] SyS_clone+0x19/0x20
Jun 19 21:29:49 server-name kernel: [17009377.878027]  [] entry_SYSCALL_64_fastpath+0x16/0x71
Jun 19 21:29:49 server-name kernel: [17009377.878028] Mem-Info:
Jun 19 21:29:49 server-name kernel: [17009377.878034] active_anon:2161218 inactive_anon:328736 isolated_anon:0
Jun 19 21:29:49 server-name kernel: [17009377.878034]  active_file:9390648 inactive_file:3525717 isolated_file:0
Jun 19 21:29:49 server-name kernel: [17009377.878034]  unevictable:923 dirty:3206 writeback:0 unstable:0
Jun 19 21:29:49 server-name kernel: [17009377.878034]  slab_reclaimable:427991 slab_unreclaimable:85432
Jun 19 21:29:49 server-name kernel: [17009377.878034]  mapped:2177419 shmem:2227151 pagetables:345413 bounce:0
Jun 19 21:29:49 server-name kernel: [17009377.878034]  free:122878 free_pcp:1 free_cma:0
Jun 19 21:29:49 server-name kernel: [17009377.878037] Node 0 DMA free:14488kB min:20kB low:24kB high:28kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jun 19 21:29:49 server-name kernel: [17009377.878041] lowmem_reserve[]: 0 1820 32015 32015 32015
Jun 19 21:29:49 server-name kernel: [17009377.878044] Node 0 DMA32 free:123340kB min:2552kB low:3188kB high:3828kB active_anon:1066728kB inactive_anon:123988kB active_file:8kB inactive_file:8kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1985352kB managed:1904732kB mlocked:0kB dirty:0kB writeback:0kB mapped:736056kB shmem:744112kB slab_reclaimable:289824kB slab_unreclaimable:200192kB kernel_stack:1552kB pagetables:84288kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jun 19 21:29:49 server-name kernel: [17009377.878048] lowmem_reserve[]: 0 0 30195 30195 30195
Jun 19 21:29:49 server-name kernel: [17009377.878062] Node 0 Normal free:154400kB min:42332kB low:52912kB high:63496kB active_anon:6913860kB inactive_anon:1054596kB active_file:10248116kB inactive_file:10244456kB unevictable:72kB isolated(anon):0kB isolated(file):0kB present:31457280kB managed:30919988kB mlocked:72kB dirty:1364kB writeback:0kB mapped:7294996kB shmem:7449576kB slab_reclaimable:826280kB slab_unreclaimable:67216kB kernel_stack:6016kB pagetables:1250072kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jun 19 21:29:49 server-name kernel: [17009377.878065] lowmem_reserve[]: 0 0 0 0 0
Jun 19 21:29:49 server-name kernel: [17009377.878067] Node 1 Normal free:199284kB min:45200kB low:56500kB high:67800kB active_anon:664284kB inactive_anon:136360kB active_file:27314468kB inactive_file:3858404kB unevictable:3620kB isolated(anon):0kB isolated(file):0kB present:33554432kB managed:33015880kB mlocked:3620kB dirty:11460kB writeback:0kB mapped:678624kB shmem:714916kB slab_reclaimable:595860kB slab_unreclaimable:74320kB kernel_stack:4608kB pagetables:47292kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jun 19 21:29:49 server-name kernel: [17009377.878084] lowmem_reserve[]: 0 0 0 0 0
Jun 19 21:29:49 server-name kernel: [17009377.878086] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 0*64kB 1*128kB (U) 0*256kB 0*512kB 0*1024kB 1*2048kB (M) 3*4096kB (M) = 14488kB
Jun 19 21:29:49 server-name kernel: [17009377.878092] Node 0 DMA32: 12409*4kB (UME) 4093*8kB (UME) 839*16kB (UME) 111*32kB (UME) 68*64kB (UM) 33*128kB (UME) 24*256kB (UME) 14*512kB (UME) 2*1024kB (M) 0*2048kB 0*4096kB = 123292kB
Jun 19 21:29:49 server-name kernel: [17009377.878109] Node 0 Normal: 39107*4kB (UME) 179*8kB (UME) 0*16kB 1*32kB (H) 1*64kB (H) 1*128kB (H) 1*256kB (H) 1*512kB (H) 1*1024kB (H) 0*2048kB 0*4096kB = 159876kB
Jun 19 21:29:49 server-name kernel: [17009377.878115] Node 1 Normal: 50883*4kB (UME) 133*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 204596kB
Jun 19 21:29:49 server-name kernel: [17009377.878120] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Jun 19 21:29:49 server-name kernel: [17009377.878121] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jun 19 21:29:49 server-name kernel: [17009377.878121] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Jun 19 21:29:49 server-name kernel: [17009377.878122] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jun 19 21:29:49 server-name kernel: [17009377.878123] 15144218 total pagecache pages
Jun 19 21:29:49 server-name kernel: [17009377.878124] 0 pages in swap cache
Jun 19 21:29:49 server-name kernel: [17009377.878125] Swap cache stats: add 0, delete 0, find 0/0
Jun 19 21:29:49 server-name kernel: [17009377.878126] Free swap  = 0kB
Jun 19 21:29:49 server-name kernel: [17009377.878126] Total swap = 0kB
Jun 19 21:29:49 server-name kernel: [17009377.878127] 16753261 pages RAM
Jun 19 21:29:49 server-name kernel: [17009377.878127] 0 pages HighMem/MovableOnly
Jun 19 21:29:49 server-name kernel: [17009377.878128] 289137 pages reserved
Jun 19 21:29:49 server-name kernel: [17009377.878129] 0 pages cma reserved
Jun 19 21:29:49 server-name kernel: [17009377.878129] 0 pages hwpoisoned
Jun 19 21:29:49 server-name kernel: [17009377.878130] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Jun 19 21:29:49 server-name kernel: [17009377.878136] [ 1181]     0  1181    21212    12109      47       3        0             0 systemd-journal
Jun 19 21:29:49 server-name kernel: [17009377.878138] [ 1212]     0  1212    23694      319      16       3        0             0 lvmetad
Jun 19 21:29:49 server-name kernel: [17009377.878139] [ 1226]     0  1226    11507     1164      24       3        0         -1000 systemd-udevd
Jun 19 21:29:49 server-name kernel: [17009377.878141] [ 1776]     0  1776     6512      523      18       3        0             0 atd
Jun 19 21:29:49 server-name kernel: [17009377.878142] [ 1778]   107  1778    10758      949      25       3        0          -900 dbus-daemon
Jun 19 21:29:49 server-name kernel: [17009377.878143] [ 1790]   104  1790    64100      955      27       3        0             0 rsyslogd
Jun 19 21:29:49 server-name kernel: [17009377.878144] [ 1794]     0  1794     7708      591      20       3        0             0 cron
Jun 19 21:29:49 server-name kernel: [17009377.878146] [ 1798]     0  1798     7138      656      18       3        0             0 systemd-logind
Jun 19 21:29:49 server-name kernel: [17009377.878147] [ 1800]     0  1800    77434     1220      20       3        0             0 lxcfs
Jun 19 21:29:49 server-name kernel: [17009377.878148] [ 1805]     0  1805    69421     1395      38       4        0             0 accounts-daemon
Jun 19 21:29:49 server-name kernel: [17009377.878150] [ 1807]     0  1807   385362     6819      84       6        0          -900 snapd
Jun 19 21:29:49 server-name kernel: [17009377.878151] [ 1809]     0  1809     1101      173       9       3        0             0 acpid
Jun 19 21:29:49 server-name kernel: [17009377.878152] [ 1835]     0  1835     3345       42      11       3        0             0 mdadm
Jun 19 21:29:49 server-name kernel: [17009377.878154] [ 1852]     0  1852    69296     1880      37       4        0             0 polkitd
Jun 19 21:29:49 server-name kernel: [17009377.878155] [ 1959]     0  1959    16381     1507      36       3        0         -1000 sshd
Jun 19 21:29:49 server-name kernel: [17009377.878156] [ 1972]     0  1972     1307      412       8       3        0             0 iscsid
Jun 19 21:29:49 server-name kernel: [17009377.878158] [ 1973]     0  1973     1432      917       8       3        0           -17 iscsid
Jun 19 21:29:49 server-name kernel: [17009377.878159] [ 2036]     0  2036     4441      383      13       3        0             0 agetty
Jun 19 21:29:49 server-name kernel: [17009377.878160] [ 2095]     0  2095     4934      597      15       3        0             0 irqbalance
Jun 19 21:29:49 server-name kernel: [17009377.878162] [ 2138]   111  2138    27509     1256      25       3        0             0 ntpd
Jun 19 21:29:49 server-name kernel: [17009377.878163] [ 2323]   112  2323    13971      727      30       3        0             0 exim4
Jun 19 21:29:49 server-name kernel: [17009377.878164] [ 2329]     0  2329    73510     4000      43       4        0             0 fail2ban-server
Jun 19 21:29:49 server-name kernel: [17009377.878166] [ 7103]   113  7103  2203146    66729     188       4        0          -900 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878167] [101917]     0 101917    13563      470      27       3        0             0 keepalived
Jun 19 21:29:49 server-name kernel: [17009377.878169] [101918]     0 101918    14093     1204      33       3        0             0 keepalived
Jun 19 21:29:49 server-name kernel: [17009377.878170] [101919]     0 101919    14093      800      32       3        0             0 keepalived
Jun 19 21:29:49 server-name kernel: [17009377.878172] [126772]   115 126772     5994      664      16       4        0             0 nrpe
Jun 19 21:29:49 server-name kernel: [17009377.878174] [70979]   113 70979  2203419  2135243    4232      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878175] [70980]   113 70980  2203211   282046    3748      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878176] [70981]   113 70981  2203146     5331      69       4        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878178] [70982]   113 70982  2203399     1773      71       4        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878179] [70983]   113 70983    37911     1097      55       4        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878180] [70984]   113 70984  2203919   115562    1754      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878182] [70985]   113 70985  2204540    68113    1213      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878183] [70986]   113 70986  2205899   471030    3891      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878184] [70992]   113 70992  2204243   111679    1550      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878185] [70993]   113 70993  2203484     2784      75       4        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878187] [70994]   113 70994  2205941   541014    3966      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878188] [70995]   113 70995  2206035   408079    3095      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878189] [70996]   113 70996  2203934   160075    2604      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878190] [70997]   113 70997  2203936   218125    2911      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878192] [70998]   113 70998  2204811  1327751    4263      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878193] [70999]   113 70999  2206100  2081582    4267      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878194] [71000]   113 71000  2204024  1694269    4251      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878196] [71001]   113 71001  2209678  2127573    4274      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878197] [71002]   113 71002  2204028  1683854    4251      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878198] [71003]   113 71003  2209601  2118203    4273      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878199] [71004]   113 71004  2203982   955099    4247      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878201] [71005]   113 71005  2204924  1348990    4262      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878202] [71006]   113 71006  2203995  1255468    4247      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878203] [71014]   113 71014  2204016  1562410    4251      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878204] [71015]   113 71015  2204199    70592    1039      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878206] [71016]   113 71016  2209670  2063214    4273      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878207] [71023]   113 71023  2206079   537513    3839      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878208] [71024]   113 71024  2203922   125526    1820      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878210] [71025]   113 71025  2203943   230822    3084      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878211] [71027]   113 71027  2206498   625052    4028      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878212] [71029]   113 71029  2204012  1614770    4249      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878214] [71030]   113 71030  2209593  2083374    4272      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878215] [71033]   113 71033  2203940   178025    2673      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878216] [71034]   113 71034  2206090   426476    3624      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878218] [71057]   113 71057  2204867  2144196    4265      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878219] [71058]   113 71058  2204546   224493    2893      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878220] [71113]   113 71113  2209581  2127791    4272      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878222] [71276]   113 71276  2209713  2125684    4274      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878223] [71315]   113 71315  2203984   678258    4234      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878224] [71316]   113 71316  2209663  2137633    4273      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878225] [71425]   113 71425  2203985  1229779    4250      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878226] [71426]   113 71426  2207773  2089808    4271      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878228] [71479]   113 71479  2209624  2137703    4273      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878229] [71566]   113 71566  2205224   109789    1497      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878230] [71634]   113 71634  2204084    42530     640      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878232] [71636]   113 71636  2204166    36964     547      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878233] [71637]   113 71637  2203758     8574     167      10        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878234] [71861]   113 71861  2204034  1659821    4249      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878235] [71878]   113 71878  2204101    27948     404      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878237] [73036]   113 73036  2204208    23556     315      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878238] [73043]   113 73043  2204332    15593     234      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878239] [73167]   113 73167  2209593  2124044    4274      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878240] [73168]   113 73168  2204014  1505503    4251      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878242] [73426]   113 73426  2206039   247425    2686      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878243] [73714]   113 73714  2204379    66347     562       9        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878244] [73735]   113 73735  2207590  2128748    4270      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878246] [74062]   113 74062  2209849  2101538    4274      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878247] [74196]   113 74196  2204283    34071     526      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878248] [74203]   113 74203  2204922   150413    1889      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878249] [74581]   113 74581  2209626  2125857    4273      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878250] [74817]   113 74817  2204033  1637797    4250      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878252] [75281]   113 75281  2204276    56982     690      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878253] [75282]   113 75282  2205141   106605    1299      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878254] [75283]   113 75283  2203982    27531     348      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878256] [75309]   113 75309  2205662   128795    1652      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878257] [76231]   113 76231  2204009    53847     803      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878258] [77386]   113 77386  2203883    40653     602      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878260] [77917]   113 77917  2204815   846041    4248      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878261] [77925]   113 77925  2203999  1394845    4251      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878262] [77957]   113 77957  2204961  1375124    4264      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878263] [77958]   113 77958  2203998  1645874    4250      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878264] [77994]   113 77994  2209598  2029427    4273      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878265] [78004]   113 78004  2204876   990859    4261      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878266] [78009]   113 78009  2209693  2074139    4274      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878268] [78010]   113 78010  2204012  1528597    4248      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878269] [78011]   113 78011  2209708  2130929    4274      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878270] [78012]   113 78012  2209906  2116419    4274      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878271] [78013]   113 78013  2203951   568349    4225      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878272] [78021]   113 78021  2203819    14483     210      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878273] [78028]   113 78028  2209930  2138334    4275      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878274] [78076]   113 78076  2204009  1648542    4248      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878275] [78077]   113 78077  2204008  1622033    4250      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878276] [78231]   113 78231  2209778  2125564    4273      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878278] [78232]   113 78232  2204006  1467730    4251      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878279] [78748]   113 78748  2209554  2091379    4272      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878281] [79120]   113 79120  2207656  2129657    4270      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878282] [79121]   113 79121  2209649  2136786    4274      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878283] [79122]   113 79122  2203949   342314    3972      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878285] [80062]   113 80062  2204008  1257889    4249      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878286] [81048]   113 81048  2205151   130415    1862      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878287] [81050]   113 81050  2203941    43779     627      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878289] [84879]   113 84879  2205000  1285857    4263      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878290] [85403]   113 85403  2204492    74870    1073      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878291] [85404]   113 85404  2204962   112681    1425      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878292] [89649]   113 89649  2204322    56650     734      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878294] [90729]   113 90729  2204495    95699    1334      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878295] [90732]   113 90732  2203804     9328     184      10        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878296] [90755]   113 90755  2204363    81196    1100      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878298] [92006]   113 92006  2204032  1592146    4248      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878299] [95662]   113 95662  2203779    11467     223      11        0             0 postgres
Jun 19 21:29:49 server-name kernel: [17009377.878300] [100918]   113 100918  2204778   843529    4246      11        0             0 postgres
...
Jun 19 21:29:49 server-name kernel: [17009377.878360] [59692]     0 59692     3610      841      12       3        0             0 bash
Jun 19 21:29:49 server-name kernel: [17009377.878361] [61771]     0 61771     3610       83      11       3        0             0 bash
Jun 19 21:29:49 server-name kernel: [17009377.878362] Out of memory: Kill process 71057 (postgres) score 130 or sacrifice child
Jun 19 21:29:49 server-name kernel: [17009377.878616] Killed process 71057 (postgres) total-vm:8819468kB, anon-rss:5948kB, file-rss:8570836kB

The problem is explained here:

Jun 19 21:29:49 server-name kernel: [17009377.877956] bash invoked oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0

And here:

Jun 19 21:29:49 server-name kernel: [17009377.878115] Node 1 Normal: 50883*4kB (UME) 133*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 204596kB

The kernel is asking for an order-2 allocation, that is, a physically contiguous block of 16 kB or more, and the GFP flags indicate that it MUST come from 'thisnode'. Since node 1 has no free blocks of order 2 or higher, my guess is that node 1 is the node it is trying to allocate from.
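To make the fragmentation argument concrete, here is a small sketch that reads the "Node 1 Normal" line quoted above and counts free blocks per order. It assumes a 4 kB base page (as on this x86_64 box); the function names are made up for illustration.

```python
import re

PAGE_KB = 4  # assumed base page size (x86_64), matching the 4kB column in the log


def parse_buddy_line(line):
    """Return {order: free_block_count} from a 'Node N Zone: c*SkB ...' log line."""
    counts = {}
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", line):
        # 4kB -> order 0, 8kB -> order 1, 16kB -> order 2, ...
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        counts[order] = int(count)
    return counts


def free_blocks_at_or_above(counts, order):
    """Number of free blocks that could satisfy a request of the given order."""
    return sum(n for o, n in counts.items() if o >= order)


line = ("Node 1 Normal: 50883*4kB (UME) 133*8kB (UM) 0*16kB 0*32kB 0*64kB "
        "0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 204596kB")

counts = parse_buddy_line(line)
total_kb = sum(n * (PAGE_KB << o) for o, n in counts.items())

print(total_kb)                            # 204596, same as the log's own total
print(free_blocks_at_or_above(counts, 2))  # 0: nothing usable for order=2
```

So the zone has roughly 200 MB free, but all of it sits in 4 kB and 8 kB pieces, which is why an order=2 request can fail while `free` still looks healthy.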

The flags also indicate that it is allowed to swap in order to free memory and make this work. There is no swap, though.

I don't know for certain, but I suspect even a small amount of swap would fix this, since the kernel could then swap pages out to satisfy the request. I also suspect it would allow the kernel to perform memory compaction (rearranging physical memory to make more higher-order blocks available), which might even avoid regular swapping altogether. Note that this is only a guess.

You could move to a newer kernel. I thought there was a change that improves compaction for filesystem allocations, but I don't have the commit handy. This looks like Ubuntu, so you could try Bionic.

For more on zone fragmentation, see Matthew's answer from a few years ago: Linux oom situation. You may be able to tune vm.extfrag_threshold, or manually trigger compaction in an emergency via /proc/sys/vm/compact_memory
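If you want to watch for this condition before the OOM killer fires, something along these lines might work. It is only a sketch: the helper names are hypothetical, the sample text is built from the "Node 0 Normal" and "Node 1 Normal" counts in the log above and laid out the way /proc/buddyinfo normally prints them, and writing to compact_memory requires root and a kernel built with compaction support.

```python
def parse_buddyinfo(text):
    """Map (node, zone) -> list of free block counts, indexed by order."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: Node 0, zone Normal 39107 179 0 ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(x) for x in parts[4:]]
    return zones


def zones_lacking_order(zones, order):
    """Zones with no free block of the given order or higher."""
    return [key for key, counts in zones.items() if sum(counts[order:]) == 0]


def trigger_compaction(path="/proc/sys/vm/compact_memory"):
    """Ask the kernel to compact all zones (needs root)."""
    with open(path, "w") as f:
        f.write("1\n")


# Sample in /proc/buddyinfo layout, using the counts from the log above.
sample = """\
Node 0, zone   Normal  39107    179      0      1      1      1      1      1      1      0      0
Node 1, zone   Normal  50883    133      0      0      0      0      0      0      0      0      0
"""

zones = parse_buddyinfo(sample)
starved = zones_lacking_order(zones, 2)
print(starved)  # [(1, 'Normal')]
```

On the live server you would feed it `open("/proc/buddyinfo").read()` instead of `sample`, and call `trigger_compaction()` only when `starved` is non-empty. Whether that is enough to avoid the kills on this kernel is untested.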