Suddenly, for no apparent reason, my server stops responding. Here is what I found in /var/log/messages. What could this be?
Apr 29 13:40:47 stephan kernel: ------------[ cut here ]------------
Apr 29 13:40:47 stephan kernel: WARNING: at lib/list_debug.c:56 __list_del_entry+0x82/0xd0()
Apr 29 13:40:47 stephan kernel: Hardware name: S5520SC
Apr 29 13:40:47 stephan kernel: list_del corruption. next->prev should be ffff880c86f92000, but was ffff880c86f92800
Apr 29 13:40:47 stephan kernel: Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack iptable_filter ip_tables bonding ip6t_REJECT nf_conntrack_ipv6 nf_defr$
Apr 29 13:40:47 stephan kernel: Pid: 66, comm: kswapd1 Not tainted 3.0.0+ #1
Apr 29 13:40:47 stephan kernel: Call Trace:
Apr 29 13:40:47 stephan kernel: <IRQ> [<ffffffff81062b2f>] warn_slowpath_common+0x7f/0xc0
Apr 29 13:40:47 stephan kernel: [<ffffffff8101b927>] ? intel_pmu_enable_all+0xa7/0x160
Apr 29 13:40:47 stephan kernel: [<ffffffff81062c26>] warn_slowpath_fmt+0x46/0x50
Apr 29 13:40:47 stephan kernel: [<ffffffff81268c72>] __list_del_entry+0x82/0xd0
Apr 29 13:40:47 stephan kernel: [<ffffffff81268cd1>] list_del+0x11/0x40
Apr 29 13:40:47 stephan kernel: [<ffffffff8114ba5b>] free_block+0xcb/0x180
Apr 29 13:40:47 stephan kernel: [<ffffffff8114b8e0>] kmem_cache_free+0x290/0x2b0
Apr 29 13:40:47 stephan kernel: [<ffffffff811ba941>] proc_i_callback+0x31/0x40
Apr 29 13:40:47 stephan kernel: [<ffffffff810ce6bc>] rcu_do_batch+0xdc/0x250
Apr 29 13:40:47 stephan kernel: [<ffffffff810ce8e4>] __rcu_process_callbacks+0xb4/0x1d0
Apr 29 13:40:47 stephan kernel: [<ffffffff810cea25>] rcu_process_callbacks+0x25/0x50
Apr 29 13:40:47 stephan kernel: [<ffffffff81069847>] __do_softirq+0xb7/0x210
Apr 29 13:40:47 stephan kernel: [<ffffffff810898c1>] ? hrtimer_interrupt+0x151/0x240
Apr 29 13:40:47 stephan kernel: [<ffffffff8150317c>] call_softirq+0x1c/0x30
Apr 29 13:40:47 stephan kernel: [<ffffffff8100d345>] do_softirq+0x65/0xa0
Apr 29 13:40:47 stephan kernel: [<ffffffff8106964d>] irq_exit+0xbd/0xe0
Apr 29 13:40:47 stephan kernel: [<ffffffff81503abe>] smp_apic_timer_interrupt+0x6e/0x99
Apr 29 13:40:47 stephan kernel: [<ffffffff81502933>] apic_timer_interrupt+0x13/0x20
Apr 29 13:40:47 stephan kernel: <EOI> [<ffffffffa03d9b08>] ? xfs_perag_get_tag+0x8/0xd0 [xfs]
Apr 29 13:40:47 stephan kernel: [<ffffffffa03f3968>] xfs_reclaim_inode_shrink+0x58/0xb0 [xfs]
Apr 29 13:40:47 stephan kernel: [<ffffffff81113191>] shrink_slab+0x81/0x1a0
Apr 29 13:40:47 stephan kernel: [<ffffffff811162ee>] balance_pgdat+0x70e/0x8f0
Apr 29 13:40:47 stephan kernel: [<ffffffff81116696>] kswapd+0x1c6/0x210
Apr 29 13:40:47 stephan kernel: [<ffffffff811164d0>] ? balance_pgdat+0x8f0/0x8f0
Apr 29 13:40:47 stephan kernel: [<ffffffff81084d16>] kthread+0x96/0xa0
Apr 29 13:40:47 stephan kernel: [<ffffffff81503084>] kernel_thread_helper+0x4/0x10
Apr 29 13:40:47 stephan kernel: [<ffffffff81084c80>] ? kthread_worker_fn+0x1a0/0x1a0
Apr 29 13:40:47 stephan kernel: [<ffffffff81503080>] ? gs_change+0x13/0x13
Apr 29 13:40:47 stephan kernel: ---[ end trace 40eb9c6ec15a76bf ]---
Apr 29 13:40:47 stephan kernel: ------------[ cut here ]------------
Apr 29 13:40:47 stephan kernel: WARNING: at lib/list_debug.c:53 __list_del_entry+0xa1/0xd0()
Apr 29 13:40:47 stephan kernel: Hardware name: S5520SC
Apr 29 13:40:47 stephan kernel: list_del corruption. prev->next should be ffff880c798a3000, but was 7f07e74200000000
Apr 29 13:40:47 stephan kernel: ------------[ cut here ]------------
Apr 29 13:40:47 stephan kernel: WARNING: at lib/list_debug.c:53 __list_del_entry+0xa1/0xd0()
Apr 29 13:40:47 stephan kernel: Hardware name: S5520SC
Apr 29 13:40:47 stephan kernel: list_del corruption. prev->next should be ffff880da7db9000, but was ffff880caa441000
Apr 29 13:40:47 stephan kernel: Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack iptable_filter ip_tables bonding ip6t_REJECT nf_conntrack_ipv6 nf_defr$
Apr 29 13:40:47 stephan kernel: Pid: 66, comm: kswapd1 Tainted: G W 3.0.0+ #1
Apr 29 13:40:47 stephan kernel: Call Trace:
Apr 29 13:40:47 stephan kernel: <IRQ> [<ffffffff81062b2f>] warn_slowpath_common+0x7f/0xc0
Apr 29 13:40:47 stephan kernel: [<ffffffff8101b927>] ? intel_pmu_enable_all+0xa7/0x160
Apr 29 13:40:47 stephan kernel: [<ffffffff81062c26>] warn_slowpath_fmt+0x46/0x50
Apr 29 13:40:47 stephan kernel: [<ffffffff81268c91>] __list_del_entry+0xa1/0xd0
Apr 29 13:40:47 stephan kernel: [<ffffffff81268cd1>] list_del+0x11/0x40
Apr 29 13:40:47 stephan kernel: [<ffffffff8114ba5b>] free_block+0xcb/0x180
Apr 29 13:40:47 stephan kernel: [<ffffffff8114b8e0>] kmem_cache_free+0x290/0x2b0
Apr 29 13:40:47 stephan kernel: [<ffffffff811ba941>] proc_i_callback+0x31/0x40
Apr 29 13:40:47 stephan kernel: [<ffffffff810ce6bc>] rcu_do_batch+0xdc/0x250
Apr 29 13:40:47 stephan kernel: [<ffffffff810ce8e4>] __rcu_process_callbacks+0xb4/0x1d0
Apr 29 13:40:47 stephan kernel: [<ffffffff810cea25>] rcu_process_callbacks+0x25/0x50
Apr 29 13:40:47 stephan kernel: [<ffffffff81069847>] __do_softirq+0xb7/0x210
I'm running 64-bit CentOS 6. It is not a virtual machine, and the system ran without problems for a full year. Three months ago I upgraded the CPU to an X5680. I hope the CPU is not the cause, because it was quite expensive.
We would need a lot more information to be sure (kernel version, how long the machine had been up before this, hardware details), but I would like to draw your attention to this line:
Apr 29 13:40:47 stephan kernel: list_del corruption. next->prev should be ffff880c86f92000, but was ffff880c86f92800
The two addresses differ in exactly one place: bit 11 changed from 0 to 1. A single flipped bit like this often points to faulty memory rather than a software bug. If you don't have ECC RAM, I suggest testing your memory.
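To check the single-bit claim yourself, you can XOR the expected and actual pointer values from the list_del corruption line and list the bit positions that differ. The following is only a minimal sketch: the helper name `flipped_bits` is made up for illustration, and bits are counted from 0 at the least significant end.

```python
def flipped_bits(expected, actual):
    """Return the positions (0 = least significant) of bits that differ."""
    diff = expected ^ actual  # XOR leaves a 1 wherever the two values disagree
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# Values copied from the "next->prev should be ..., but was ..." log line.
expected = 0xffff880c86f92000
actual = 0xffff880c86f92800

print(flipped_bits(expected, actual))  # [11]
```

The same helper can be run on the other "should be ..., but was ..." pairs in the log to see how many bits differ in each.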