Hi, I'm new to this. I'm seeing the messages below on the storage server and on the compute nodes. The whole cluster slows down or stops responding. How can I deal with this problem? I'm running CentOS 7.5 with kernel 3.10.0-862.el7.x86_64.
Feb 25 12:43:27 compute-01-02 kernel: nfsd: page allocation failure: order:7, mode:0x80d0
Feb 25 12:43:27 compute-01-02 kernel: CPU: 20 PID: 7230 Comm: nfsd Kdump: loaded Tainted: G W OE ------------ 3.10.0-862.el7.x86_64 #1
Feb 25 12:43:27 compute-01-02 kernel: Hardware name: Dell Inc. PowerEdge R7425/08V001, BIOS 1.10.6 08/15/2019
Feb 25 12:43:27 compute-01-02 kernel: Call Trace:
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b90d768>] dump_stack+0x19/0x1b
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b399df0>] warn_alloc_failed+0x110/0x180
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b39e974>] __alloc_pages_nodemask+0x9b4/0xbb0
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b23117f>] dma_generic_alloc_coherent+0x8f/0x140
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b269811>] x86_swiotlb_alloc_coherent+0x21/0x50
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc0611ce4>] mlx5_dma_zalloc_coherent_node+0xb4/0x110 [mlx5_core]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc061214d>] mlx5_buf_alloc_node+0x4d/0xc0 [mlx5_core]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc06121d4>] mlx5_buf_alloc+0x14/0x20 [mlx5_core]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc06f2b39>] create_kernel_qp.isra.62+0x42e/0x72c [mlx5_ib]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b4b313f>] ? debugfs_create_file+0x1f/0x30
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc060a978>] ? add_res_tree+0xe8/0x150 [mlx5_core]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc06e24dd>] create_qp_common+0x67d/0x13c0 [mlx5_ib]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc060b4eb>] ? mlx5_debug_cq_add+0x4b/0x70 [mlx5_core]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b3f7606>] ? kmem_cache_alloc_trace+0x1d6/0x200
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc06e350b>] mlx5_ib_create_qp+0x10b/0x4d0 [mlx5_ib]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc050d1bf>] ib_create_qp+0x7f/0x330 [ib_core]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc05466b4>] rdma_create_qp+0x34/0xb0 [rdma_cm]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc0888eea>] svc_rdma_accept+0x35a/0x800 [rpcrdma]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b9107fc>] ? schedule_timeout+0x17c/0x2c0
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc0887930>] ? svc_rdma_detach+0x40/0x40 [rpcrdma]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc06a1954>] svc_recv+0x454/0xb90 [sunrpc]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc0466edd>] nfsd+0xcd/0x150 [nfsd]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffffc0466e10>] ? nfsd_destroy+0x80/0x80 [nfsd]
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b2bae31>] kthread+0xd1/0xe0
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b2bad60>] ? insert_kthread_work+0x40/0x40
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b91f624>] ret_from_fork_nospec_begin+0xe/0x21
Feb 25 12:43:27 compute-01-02 kernel: [<ffffffff8b2bad60>] ? insert_kthread_work+0x40/0x40
Feb 25 12:43:27 compute-01-02 kernel: Mem-Info:
Feb 25 12:43:27 compute-01-02 kernel: active_anon:1413907 inactive_anon:824581 isolated_anon:0
 active_file:459128 inactive_file:60216179 isolated_file:24
 unevictable:0 dirty:546270 writeback:0 unstable:0
 slab_reclaimable:1598985 slab_unreclaimable:152878
 mapped:4852 shmem:4375 pagetables:13071 bounce:0
 free:212904 free_pcp:2323 free_cma:0
Feb 25 12:43:27 compute-01-02 kernel: Node 0 DMA free:15864kB min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Feb 25 12:43:27 compute-01-02 kernel: lowmem_reserve[]: 0 1480 31677 31677