Kernel memory usage reported in the log when a task is killed due to OOM

My task inside a Docker container is getting killed due to OOM. Here is the log from /var/log/messages.

Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.346602] uwsgi invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.351446] uwsgi cpuset=4ad797e0720ad05c90cb8f5afaa9902172c4aac9319d464e669091615b52d134 mems_allowed=0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.356702] CPU: 0 PID: 3969 Comm: uwsgi Tainted: G            E   4.1.13-19.31.amzn1.x86_64 #1
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.361608] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/07/2015
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.364705]  ffff8800b841d000 ffff88003737bc88 ffffffff814dabc0 0000000000002ecc
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.368574]  ffff880037821980 ffff88003737bd38 ffffffff814d8377 0000000000000046
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.372644]  ffff880037a10000 ffff88003737bd08 ffffffff810949f1 00000000000000fb
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.377628] Call Trace:
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.379220]  [<ffffffff814dabc0>] dump_stack+0x45/0x57
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.382510]  [<ffffffff814d8377>] dump_header+0x7f/0x1fe
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.385580]  [<ffffffff810949f1>] ? try_to_wake_up+0x1f1/0x340
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.388683]  [<ffffffff8115cd37>] ? find_lock_task_mm+0x47/0xa0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.391730]  [<ffffffff8115d2ec>] oom_kill_process+0x1cc/0x3b0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.394933]  [<ffffffff81071f0e>] ? has_capability_noaudit+0x1e/0x30
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.398486]  [<ffffffff811c2814>] mem_cgroup_oom_synchronize+0x574/0x5c0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.402132]  [<ffffffff811be940>] ? mem_cgroup_css_online+0x260/0x260
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.406394]  [<ffffffff8115dc44>] pagefault_out_of_memory+0x24/0xe0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.410541]  [<ffffffff814d6c37>] mm_fault_error+0x5e/0x106
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.414017]  [<ffffffff8105dd5c>] __do_page_fault+0x3ec/0x420
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.416967]  [<ffffffff811e6577>] ? __fget_light+0x57/0x70
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.419762]  [<ffffffff8105ddb2>] do_page_fault+0x22/0x30
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.422658]  [<ffffffff814e3618>] page_fault+0x28/0x30
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.425484] Task in /docker/4ad797e0720ad05c90cb8f5afaa9902172c4aac9319d464e669091615b52d134 killed as a result of limit of /docker/4ad797e0720ad05c90cb8f5afaa9902172c4aac9319d464e669091615b52d134
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.435927] memory: usage 2560000kB, limit 2560000kB, failcnt 25331
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.439759] memory+swap: usage 2560000kB, limit 5120000kB, failcnt 0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.443422] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.446548] Memory cgroup stats for /docker/4ad797e0720ad05c90cb8f5afaa9902172c4aac9319d464e669091615b52d134: cache:3036KB rss:2556964KB rss_huge:0KB mapped_file:2304KB writeback:0KB swap:0KB inactive_anon:2128KB active_anon:2556964KB inactive_file:472KB active_file:436KB unevictable:0KB
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.460987] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.466314] [ 3509]  1000  3509     1112       20       8       3        0             0 sh
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.471474] [ 3531]  1000  3531     4496       60      14       3        0             0 entrypoint.sh
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.476400] [ 3915]  1000  3915    15167     2629      34       3        0             0 supervisord
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.480969] [ 3968]  1000  3968    19371      287      40       3        0             0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.485910] [ 3969]  1000  3969    64026    21101     125       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.491238] [ 3970]  1000  3970    19855      778      38       3        0             0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.495781] [ 3971]  1000  3971    19855      778      38       3        0             0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.501098] [ 3972]  1000  3972    19888      784      38       3        0             0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.505670] [ 3973]  1000  3973    19855      780      38       3        0             0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.510069] [ 3974]  1000  3974    19855      780      38       3        0             0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.514864] [ 4126]  1000  4126    66981    22660     129       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.520305] [ 4127]  1000  4127    66597    22268     128       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.525661] [ 4128]  1000  4128    66725    22435     128       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.530141] [ 4129]  1000  4129    66533    22221     128       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.534828] [ 4130]  1000  4130    66533    22222     128       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.539419] [ 4131]  1000  4131    66469    22183     128       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.544847] [ 4132]  1000  4132    66661    22363     128       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.550356] [ 4133]  1000  4133    68150    23216     131       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.554855] [ 4134]  1000  4134    67812    22904     131       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.559747] [ 4135]  1000  4135   373902   327874     731       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.564379] [ 4136]  1000  4136    70710    26423     138       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.569562] [ 4137]  1000  4137    66213    21886     127       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.575567] [ 4138]  1000  4138    80402    35499     160       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.580316] [ 4139]  1000  4139    77374    31183     150       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.584705] [ 4140]  1000  4140    96267    50678     197       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.589036] [ 4141]  1000  4141   119173    55555     211       4        0             0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.593748] [ 4930]  1000  4930     4541       80      15       3        0             0 bash
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.599021] [ 4945]  1000  4945    85517    22532     135       4        0             0 python
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.605828] Memory cgroup out of memory: Kill process 4135 (uwsgi) score 513 or sacrifice child
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.609448] Killed process 4135 (uwsgi) total-vm:1495608kB, anon-rss:1311424kB, file-rss:72kB

But why does it say that total memory usage is 2.6 GB, and then list processes whose memory, if you sum it up, comes to less than 1 GB? And the process that was actually killed is reported at ~360 MB, although in reality, when I watch it with htop during the OOM event, it is well over 1 GB. So the question is: why does the kernel report incorrect memory values?
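
For reference, here is the arithmetic behind the "less than 1 GB" figure: a small sketch (not part of the original setup) that sums the rss column of the task dump above, treating the values as kB, which is the assumption I am making when reading the table.

# Sum of the rss column from the OOM task dump above,
# with the values read as kB (the assumption behind "less than 1 GB").
rss_column = [
    20, 60, 2629, 287, 21101, 778, 778, 784, 780, 780,
    22660, 22268, 22435, 22221, 22222, 22183, 22363, 23216, 22904,
    327874, 26423, 21886, 35499, 31183, 50678, 55555, 80, 22532,
]
total_kb = sum(rss_column)  # 802179
print(f"sum of rss column: {total_kb} kB = {total_kb / 1024 / 1024:.2f} GB")  # ~0.77 GB
print(f"rss of killed PID 4135 read as kB: {327874 / 1024:.0f} MB")           # ~320 MB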