
MySQL keeps crashing. Bad hard drive or other hardware?

I have seen high load and MySQL crashing twice in one week. Could the kernel errors below be the cause? Any ideas?

    Jan  3 09:49:19 HOST kernel: [2272100.568769]          res 51/40:38:78:7f:f1/40:00:35:00:00/e0 Emask 0x9 (media error)
    Jan  3 09:49:19 HOST kernel: [2272100.569023] ata2.00: status: { DRDY ERR }
    Jan  3 09:49:19 HOST kernel: [2272100.569089] ata2.00: error: { UNC }
    Jan  3 09:49:19 HOST kernel: [2272100.577394] ata2.00: configured for UDMA/133
    Jan  3 09:49:19 HOST kernel: [2272100.577418] ata2: EH complete
    Jan  3 09:49:26 HOST kernel: [2272107.699341] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Jan  3 09:49:26 HOST kernel: [2272107.699569] ata2.00: BMDMA stat 0x25
    Jan  3 09:49:26 HOST kernel: [2272107.699643] ata2.00: failed command: READ DMA EXT
    Jan  3 09:49:26 HOST kernel: [2272107.699713] ata2.00: cmd 25/00:38:78:7f:f1/00:00:35:00:00/e0 tag 0 dma 28672 in
    Jan  3 09:49:26 HOST kernel: [2272107.699715]          res 51/40:38:78:7f:f1/40:00:35:00:00/e0 Emask 0x9 (media error)
    Jan  3 09:49:26 HOST kernel: [2272107.699966] ata2.00: status: { DRDY ERR }
    Jan  3 09:49:26 HOST kernel: [2272107.700030] ata2.00: error: { UNC }
    Jan  3 09:49:26 HOST kernel: [2272107.708509] ata2.00: configured for UDMA/133
    Jan  3 09:49:26 HOST kernel: [2272107.708534] ata2: EH complete
    Jan  3 09:49:33 HOST kernel: [2272114.833522] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Jan  3 09:49:33 HOST kernel: [2272114.833603] ata2.00: BMDMA stat 0x25
    Jan  3 09:49:33 HOST kernel: [2272114.833669] ata2.00: failed command: READ DMA EXT
    Jan  3 09:49:33 HOST kernel: [2272114.833737] ata2.00: cmd 25/00:38:78:7f:f1/00:00:35:00:00/e0 tag 0 dma 28672 in
    Jan  3 09:49:33 HOST kernel: [2272114.833739]          res 51/40:38:78:7f:f1/40:00:35:00:00/e0 Emask 0x9 (media error)
    Jan  3 09:49:33 HOST kernel: [2272114.833992] ata2.00: status: { DRDY ERR }
    Jan  3 09:49:33 HOST kernel: [2272114.834056] ata2.00: error: { UNC }
    Jan  3 09:49:33 HOST kernel: [2272114.842578] ata2.00: configured for UDMA/133
    Jan  3 09:49:33 HOST kernel: [2272114.842604] ata2: EH complete
    Jan  3 09:49:40 HOST kernel: [2272121.959563] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Jan  3 09:49:40 HOST kernel: [2272121.959644] ata2.00: BMDMA stat 0x25
    Jan  3 09:49:40 HOST kernel: [2272121.959708] ata2.00: failed command: READ DMA EXT
    Jan  3 09:49:40 HOST kernel: [2272121.959778] ata2.00: cmd 25/00:38:78:7f:f1/00:00:35:00:00/e0 tag 0 dma 28672 in
    Jan  3 09:49:40 HOST kernel: [2272121.959780]          res 51/40:38:78:7f:f1/40:00:35:00:00/e0 Emask 0x9 (media error)
    Jan  3 09:49:40 HOST kernel: [2272121.961337] ata2.00: status: { DRDY ERR }
    Jan  3 09:49:40 HOST kernel: [2272121.961400] ata2.00: error: { UNC }
    Jan  3 09:49:40 HOST kernel: [2272121.968673] ata2.00: configured for UDMA/133
    Jan  3 09:49:40 HOST kernel: [2272121.968701] sd 1:0:0:0: [sda] Unhandled sense code
    Jan  3 09:49:40 HOST kernel: [2272121.968706] sd 1:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    Jan  3 09:49:40 HOST kernel: [2272121.968714] sd 1:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
    Jan  3 09:49:40 HOST kernel: [2272121.968723] Descriptor sense data with sense descriptors (in hex):
    Jan  3 09:49:40 HOST kernel: [2272121.968729]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
    Jan  3 09:49:40 HOST kernel: [2272121.968743]         35 f1 7f 78
    Jan  3 09:49:40 HOST kernel: [2272121.968749] sd 1:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
    Jan  3 09:49:40 HOST kernel: [2272121.968759] sd 1:0:0:0: [sda] CDB: Read(10): 28 00 35 f1 7f 78 00 00 38 00
    Jan  3 09:49:40 HOST kernel: [2272121.968778] ata2: EH complete
    Jan  3 09:47:45 HOST kernel: [2272007.394223]  [<ffffffffa00c9638>] __ext4_journal_get_write_access+0x38/0x80 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394232]  [<ffffffffa00a0563>] ext4_reserve_inode_write+0x73/0xa0 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394241]  [<ffffffffa00a05dc>] ext4_mark_inode_dirty+0x4c/0x1d0 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394253]  [<ffffffffa00d593c>] ? ext4_xattr_get+0x10c/0x2c0 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394262]  [<ffffffffa00a08d0>] ext4_dirty_inode+0x40/0x60 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394266]  [<ffffffff811c7b2b>] __mark_inode_dirty+0x3b/0x160
    Jan  3 09:47:45 HOST kernel: [2272007.394270]  [<ffffffff811b792a>] file_update_time+0x10a/0x1a0
    Jan  3 09:47:45 HOST kernel: [2272007.394274]  [<ffffffff8112ac6c>] __generic_file_write_iter+0x1fc/0x420
    Jan  3 09:47:45 HOST kernel: [2272007.394278]  [<ffffffff81127571>] ? file_read_iter_actor+0x61/0x80
    Jan  3 09:47:45 HOST kernel: [2272007.394282]  [<ffffffff8112af15>] __generic_file_aio_write+0x85/0xa0
    Jan  3 09:47:45 HOST kernel: [2272007.394287]  [<ffffffff8112af9f>] generic_file_aio_write+0x6f/0xe0
    Jan  3 09:47:45 HOST kernel: [2272007.394295]  [<ffffffffa009a331>] ext4_file_write+0x61/0x1e0 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394299]  [<ffffffff8119c78a>] do_sync_write+0xfa/0x140
    Jan  3 09:47:45 HOST kernel: [2272007.394303]  [<ffffffff81097d70>] ? autoremove_wake_function+0x0/0x40
    Jan  3 09:47:45 HOST kernel: [2272007.394307]  [<ffffffff8119ca68>] vfs_write+0xb8/0x1a0
    Jan  3 09:47:45 HOST kernel: [2272007.394311]  [<ffffffff8119d542>] sys_pwrite64+0x82/0xa0
    Jan  3 09:47:45 HOST kernel: [2272007.394315]  [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
    Jan  3 09:47:45 HOST kernel: [2272007.394319] INFO: task mysqld:1241 blocked for more than 120 seconds.
    Jan  3 09:47:45 HOST kernel: [2272007.394389] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    Jan  3 09:47:45 HOST kernel: [2272007.394581] mysqld        D ffff88004dda2f40     0  1241   3454    0 0x00000000
    Jan  3 09:47:45 HOST kernel: [2272007.394585]  ffff88007df63958 0000000000000082 0000000000000000 00000000ffffffff
    Jan  3 09:47:45 HOST kernel: [2272007.394590]  ffff8800ffffffff 0000000000055c14 ffff88007df638e8 ffffffff8112806e
    Jan  3 09:47:45 HOST kernel: [2272007.394594]  000000000001b900 ffff88004dda3508 ffff88007df63fd8 000000000001e9c0
    Jan  3 09:47:45 HOST kernel: [2272007.394598] Call Trace:
    Jan  3 09:47:45 HOST kernel: [2272007.394601]  [<ffffffff8112806e>] ? find_get_page+0x1e/0xa0
    Jan  3 09:47:45 HOST kernel: [2272007.394608]  [<ffffffffa006d0bd>] do_get_write_access+0x29d/0x510 [jbd2]
    Jan  3 09:47:45 HOST kernel: [2272007.394612]  [<ffffffff81097db0>] ? wake_bit_function+0x0/0x50
    Jan  3 09:47:45 HOST kernel: [2272007.394618]  [<ffffffffa006d481>] jbd2_journal_get_write_access+0x31/0x50 [jbd2]
    Jan  3 09:47:45 HOST kernel: [2272007.394629]  [<ffffffffa00c9638>] __ext4_journal_get_write_access+0x38/0x80 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394643]  [<ffffffffa00a0563>] ext4_reserve_inode_write+0x73/0xa0 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394653]  [<ffffffffa00a05dc>] ext4_mark_inode_dirty+0x4c/0x1d0 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394664]  [<ffffffffa00d593c>] ? ext4_xattr_get+0x10c/0x2c0 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394677]  [<ffffffffa00a08d0>] ext4_dirty_inode+0x40/0x60 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394683]  [<ffffffff811c7b2b>] __mark_inode_dirty+0x3b/0x160
    Jan  3 09:47:45 HOST kernel: [2272007.394690]  [<ffffffff811b792a>] file_update_time+0x10a/0x1a0
    Jan  3 09:47:45 HOST kernel: [2272007.394697]  [<ffffffff8112ac6c>] __generic_file_write_iter+0x1fc/0x420
    Jan  3 09:47:45 HOST kernel: [2272007.394704]  [<ffffffff81127571>] ? file_read_iter_actor+0x61/0x80
    Jan  3 09:47:45 HOST kernel: [2272007.394712]  [<ffffffff8112af15>] __generic_file_aio_write+0x85/0xa0
    Jan  3 09:47:45 HOST kernel: [2272007.394719]  [<ffffffff8112af9f>] generic_file_aio_write+0x6f/0xe0
    Jan  3 09:47:45 HOST kernel: [2272007.394730]  [<ffffffffa009a331>] ext4_file_write+0x61/0x1e0 [ext4]
    Jan  3 09:47:45 HOST kernel: [2272007.394738]  [<ffffffff8119c78a>] do_sync_write+0xfa/0x140
    Jan  3 09:47:45 HOST kernel: [2272007.394744]  [<ffffffff81097d70>] ? autoremove_wake_function+0x0/0x40
    Jan  3 09:47:45 HOST kernel: [2272007.394751]  [<ffffffff8119ca68>] vfs_write+0xb8/0x1a0
    Jan  3 09:47:45 HOST kernel: [2272007.394757]  [<ffffffff8119d542>] sys_pwrite64+0x82/0xa0
    Jan  3 09:47:45 HOST kernel: [2272007.394764]  [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
    Jan  3 09:47:52 HOST kernel: [2272013.885915] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Jan  3 09:47:52 HOST kernel: [2272013.885998] ata2.00: BMDMA stat 0x25

Congratulations, you have a classic URE (unrecoverable read error). Your own log even says so outright:

    Jan  3 09:49:40 HOST kernel: [2272121.968749] sd 1:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
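
To see which sector the drive keeps failing on, the `CDB: Read(10)` line from the same log can be decoded by hand. Below is a minimal Python sketch; the function name `decode_read10` is made up for illustration, and the byte count assumes 512-byte logical sectors, which the log does not state explicitly.

```python
def decode_read10(cdb_hex):
    """Return (start LBA, block count) from a SCSI Read(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is ignored
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")    # bytes 2-5: logical block address
    count = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, count


# CDB copied from the kernel log in the question.
lba, count = decode_read10("28 00 35 f1 7f 78 00 00 38 00")
print(lba, count, count * 512)  # the last value assumes 512-byte sectors
```

For this CDB the result is LBA 905019256 and 56 blocks, which is 28672 bytes at 512 bytes per sector. That matches the `dma 28672` in the `ata2.00: cmd` lines, and every retry in the log hits the same address (`35 f1 7f 78`).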

Ask your data center to replace the faulty drive.

Back up your data first. That is the immediate priority.

The hard drive is definitely bad. You would not see DRDY errors, exception Emask entries, and SCSI sense key errors all at the same time otherwise. They all point to one thing: the drive is failing.
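
If you want a quick tally of how often these symptoms show up in a saved kernel log, something like the following Python sketch would do. The search strings are copied from the log in the question; the label names and the function `count_disk_errors` are illustrative, not part of any tool.

```python
from collections import Counter

# Substrings taken from the kernel log above; the labels are made up.
SIGNATURES = {
    "media_error": "(media error)",
    "uncorrectable": "error: { UNC }",
    "medium_error_sense": "Sense Key : Medium Error",
    "realloc_failed": "auto reallocate failed",
}


def count_disk_errors(lines):
    """Count how many log lines contain each failure signature."""
    counts = Counter()
    for line in lines:
        for label, needle in SIGNATURES.items():
            if needle in line:
                counts[label] += 1
    return counts


sample = [
    "kernel: [2272121.959780]  res 51/40:38:78:7f:f1/40:00:35:00:00/e0 Emask 0x9 (media error)",
    "kernel: [2272121.961400] ata2.00: error: { UNC }",
    "kernel: [2272121.968749] sd 1:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed",
]
print(dict(count_disk_errors(sample)))
```

A count that keeps growing between two runs a day apart is a reasonable hint that the drive is getting worse, not just hitting one bad sector.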

Now look at the call trace. It shows ext4 handling a write from mysqld (sys_pwrite64): it marks the inode dirty, then blocks in the jbd2 journal waiting for write access, which is why mysqld is reported as blocked for more than 120 seconds. Wait any longer and you risk the filesystem being remounted read-only. Do not run fsck until you have a backup.
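
The hung-task warning itself is easy to pick out of a log mechanically. Here is a small Python sketch that extracts the task name, PID, and timeout from lines like the one in the question; the helper name `find_hung_tasks` is hypothetical, and the pattern is based only on the message format seen in this log.

```python
import re

# Pattern modeled on the "INFO: task ... blocked" line in the log above.
HUNG_TASK = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")


def find_hung_tasks(lines):
    """Return a list of (task name, pid, seconds) for each hung-task warning."""
    hits = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            hits.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return hits


log = [
    "Jan  3 09:47:45 HOST kernel: [2272007.394319] INFO: task mysqld:1241 blocked for more than 120 seconds.",
    "Jan  3 09:47:45 HOST kernel: [2272007.394598] Call Trace:",
]
print(find_hung_tasks(log))
```

If the hung tasks and the `ata2.00` errors fall within the same few minutes, as they do here, the stalls are most likely caused by the disk and not by MySQL itself.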

When you do unmount the drive and run fsck, run it in verbose mode:

    fsck -fyv <partition-name>

If you can record the errors, they may come in handy if the problem shows up again.
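
One way to record the output is to run the command through a small wrapper that saves everything it prints. This is only a sketch in Python: `run_and_log` is a made-up helper, and the device and log paths in the comment are placeholders you would have to replace. It makes sense to write the log to a different disk than the failing one.

```python
import subprocess


def run_and_log(cmd, log_path):
    """Run cmd, save its combined stdout and stderr to log_path, return the exit code."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error messages in the same log
        text=True,
    )
    with open(log_path, "w", encoding="utf-8") as handle:
        handle.write(result.stdout)
    return result.returncode


# Hypothetical call; needs root and an unmounted partition:
# rc = run_and_log(["fsck", "-fyv", "/dev/sdXN"], "/mnt/other-disk/fsck.log")
```

Note that a non-zero exit code from fsck does not necessarily mean it failed; the code is a bit mask, and 1 means errors were found and corrected.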

I see several "DRDY ERR" messages, which simply come from the hard drive failing. Did you run fsck -cc to find the bad sectors and mark them?

Note: make sure you boot into another OS, since you should not run fsck on a mounted partition. And back up your backup!
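
A rough way to double-check that a partition is not mounted before running fsck is to look for it in the mount table. The Python sketch below takes the table as text so it can be tried on a sample; `is_mounted` is an illustrative helper, and it only compares the device name literally, so a partition mounted under another name (a device-mapper or by-UUID path, for example) would be missed.

```python
def is_mounted(device, mounts_text):
    """Return True if device is listed as a mount source in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == device:  # first field is the mounted device
            return True
    return False


sample_mounts = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "proc /proc proc rw,nosuid,nodev,noexec 0 0\n"
)
print(is_mounted("/dev/sda1", sample_mounts))
print(is_mounted("/dev/sdb1", sample_mounts))

# On a live Linux system the table could be read with:
# with open("/proc/mounts") as handle:
#     mounts_text = handle.read()
```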