How do I determine which disk is failing behind an HP ProLiant RAID controller?

I'm having trouble determining which disk is failing in my HP ProLiant DL360p Gen8. It has the following RAID controller: Smart Array P420i. I see a lot of errors in dmesg:

[40425140.998750] sd 0:1:0:1: [sdb] Unaligned partial completion (resid=16312, sector_sz=512)
[40425140.998763] sd 0:1:0:1: [sdb] tag#597 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[40425140.998767] sd 0:1:0:1: [sdb] tag#597 Sense Key : 0x3 [current] 
[40425140.998770] sd 0:1:0:1: [sdb] tag#597 ASC=0x11 ASCQ=0x0 
[40425140.998775] sd 0:1:0:1: [sdb] tag#597 CDB: opcode=0x88 88 00 00 00 00 00 3c 17 fa f8 00 00 00 08 00 00
[40425140.998778] print_req_error: critical medium error, dev sdb, sector 1008204536
[40425141.001176] sd 0:1:0:1: [sdb] Unaligned partial completion (resid=16312, sector_sz=512)
[40425141.001186] sd 0:1:0:1: [sdb] tag#597 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[40425141.001189] sd 0:1:0:1: [sdb] tag#597 Sense Key : 0x3 [current] 
[40425141.001193] sd 0:1:0:1: [sdb] tag#597 ASC=0x11 ASCQ=0x0 
[40425141.001197] sd 0:1:0:1: [sdb] tag#597 CDB: opcode=0x88 88 00 00 00 00 00 3c 17 fa f8 00 00 00 08 00 00
[40425141.001199] print_req_error: critical medium error, dev sdb, sector 1008204536
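
For reference, the fields in these messages can be decoded by hand. Below is a minimal Python sketch (the names `decode_read16_cdb`, `SENSE_KEYS` and `ASC_ASCQ` are illustrative, not part of any tool) that unpacks the READ(16) CDB from the log and checks that its LBA matches the sector reported by `print_req_error`. In the SCSI standard, sense key 0x3 is MEDIUM ERROR and ASC/ASCQ 0x11/0x00 is an unrecovered read error. Note that the LBA is relative to the logical drive sdb, so on its own it does not identify a physical disk.

```python
# Only the codes that appear in the log above are listed here.
SENSE_KEYS = {0x3: "MEDIUM ERROR"}
ASC_ASCQ = {(0x11, 0x00): "UNRECOVERED READ ERROR"}


def decode_read16_cdb(cdb_hex):
    """Return (lba, block_count) for a READ(16) CDB given as a hex string."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: logical block address
    count = int.from_bytes(cdb[10:14], "big")   # bytes 10..13: transfer length
    return lba, count


lba, count = decode_read16_cdb("88 00 00 00 00 00 3c 17 fa f8 00 00 00 08 00 00")
print(lba, count, SENSE_KEYS[0x3], ASC_ASCQ[(0x11, 0x00)])
```

With the CDB from the log this gives LBA 1008204536 and a transfer length of 8 blocks (4 KiB at 512-byte sectors), which agrees with the sector number in the kernel message.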

I believe these mean that one of the disks in the RAID array behind sdb is failing. Unfortunately, running a self-test is not possible for some reason, and I cannot see the SMART attributes of the individual drives. Here is the smartctl output:

sudo smartctl -a /dev/sdb -d cciss,2
[sudo] password for hypervisor: 
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.14.12-2-bfq-mq] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HITACHI
Product:              HUC10606 CLAR600
Revision:             C3B0
Compliance:           SPC-4
User Capacity:        600,000,000,000 bytes [600 GB]
Logical block size:   512 bytes
Rotation Rate:        10020 rpm
Form Factor:          2.5 inches
Logical Unit id:      0x5000cca03ca74bd0
Serial number:        PZJZ06RD
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Mon Dec 16 13:19:33 2019 CET
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     28 C
Drive Trip Temperature:        85 C

Manufactured in week 20 of year 2013
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  122
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  2022
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 7447546408787247104

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   2241169767  2284343         0  2243454110   16825755     105146.414           0
write:         0   112305         0    112305     112316      60991.348           0
verify: 722111517   138380         0  722249897      90047      25945.352           0

Non-medium error count:        0

No self-tests have been logged
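
Since `-d cciss,N` selects one physical drive behind the controller, the same query can be repeated for each index. Below is a rough Python sketch, assuming the indices start at 0 and that the controller has eight drives (as in the ssacli listing in this post). `parse_sas_smart` and `smartctl_commands` are hypothetical helper names, and the parser only reads fields that appear in the output above.

```python
import re


def parse_sas_smart(text):
    """Pull a few health indicators out of `smartctl -a` output for a SAS drive."""
    info = {"serial": None, "grown_defects": None, "uncorrected": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Serial number:"):
            info["serial"] = line.split(":", 1)[1].strip()
        elif line.startswith("Elements in grown defect list:"):
            info["grown_defects"] = int(line.split(":", 1)[1])
        else:
            m = re.match(r"(read|write|verify):\s+(.*)", line)
            if m:
                # The last column of each row is "Total uncorrected errors".
                info["uncorrected"][m.group(1)] = int(m.group(2).split()[-1])
    return info


def smartctl_commands(device="/dev/sdb", drive_count=8):
    """Build one smartctl command line per physical drive (drive_count is an assumption)."""
    return [["smartctl", "-a", "-d", f"cciss,{n}", device] for n in range(drive_count)]
```

Each command from `smartctl_commands()` could be run with `subprocess.run(cmd, capture_output=True, text=True)` as root and its `stdout` passed to `parse_sas_smart`. A drive with a non-zero grown defect list or a non-zero uncorrected error count would be the first suspect, although this is only a heuristic.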

Here is my array configuration:

sudo ssacli ctrl slot=0 pd all show detail

Smart Array P420i in Slot 0 (Embedded)

   Array A

      physicaldrive 1I:1:1
         Port: 1I
         Box: 1
         Bay: 1
         Status: OK
         Drive Type: Data Drive
         Interface Type: Solid State SATA
         Size: 500 GB
         Drive exposed to OS: False
         Logical/Physical Block Size: 512/512
         Firmware Revision: X61130WD
         Serial Number: 173566421696
         WWID: 3001438031683380
         Model: ATA     WDC WDS500G2B0A-
         SATA NCQ Capable: True
         SATA NCQ Enabled: True
         SSD Smart Trip Wearout: Not Supported
         PHY Count: 1
         PHY Transfer Rate: 6.0Gbps
         Drive Authentication Status: OK
         Carrier Application Version: 11
         Carrier Bootloader Version: 6
         Sanitize Erase Supported: True
         Unrestricted Sanitize Supported: True
         Shingled Magnetic Recording Support: None

      physicaldrive 1I:1:2
         Port: 1I
         Box: 1
         Bay: 2
         Status: OK
         Drive Type: Data Drive
         Interface Type: Solid State SATA
         Size: 500 GB
         Drive exposed to OS: False
         Logical/Physical Block Size: 512/512
         Firmware Revision: X61130WD
         Serial Number: 173566420063
         WWID: 3001438031683381
         Model: ATA     WDC WDS500G2B0A-
         SATA NCQ Capable: True
         SATA NCQ Enabled: True
         SSD Smart Trip Wearout: Not Supported
         PHY Count: 1
         PHY Transfer Rate: 6.0Gbps
         Drive Authentication Status: OK
         Carrier Application Version: 11
         Carrier Bootloader Version: 6
         Sanitize Erase Supported: True
         Unrestricted Sanitize Supported: True
         Shingled Magnetic Recording Support: None


   Array B

      physicaldrive 1I:1:3
         Port: 1I
         Box: 1
         Bay: 3
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 600 GB
         Drive exposed to OS: False
         Logical/Physical Block Size: 512/512
         Rotational Speed: 10000
         Firmware Revision: C3B0
         Serial Number: PZJZ06RD
         WWID: 5000CCA03CA74BD1
         Model: HITACHI HUC10606 CLAR600
         Current Temperature (C): 28
         PHY Count: 2
         PHY Transfer Rate: 6.0Gbps, Unknown
         Drive Authentication Status: OK
         Carrier Application Version: 11
         Carrier Bootloader Version: 6
         Sanitize Erase Supported: False
         Shingled Magnetic Recording Support: None

      physicaldrive 1I:1:4
         Port: 1I
         Box: 1
         Bay: 4
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 600 GB
         Drive exposed to OS: False
         Logical/Physical Block Size: 512/512
         Rotational Speed: 10000
         Firmware Revision: C3B0
         Serial Number: PZJE23KD
         WWID: 5000CCA03C887F15
         Model: HITACHI HUC10606 CLAR600
         Current Temperature (C): 30
         PHY Count: 2
         PHY Transfer Rate: 6.0Gbps, Unknown
         Drive Authentication Status: OK
         Carrier Application Version: 11
         Carrier Bootloader Version: 6
         Sanitize Erase Supported: False
         Shingled Magnetic Recording Support: None

      physicaldrive 2I:1:5
         Port: 2I
         Box: 1
         Bay: 5
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 600 GB
         Drive exposed to OS: False
         Logical/Physical Block Size: 512/512
         Rotational Speed: 10000
         Firmware Revision: C3B0
         Serial Number: PZJAL0JD
         WWID: 5000CCA03C83F969
         Model: HITACHI HUC10606 CLAR600
         Current Temperature (C): 27
         PHY Count: 2
         PHY Transfer Rate: 6.0Gbps, Unknown
         Drive Authentication Status: OK
         Carrier Application Version: 11
         Carrier Bootloader Version: 6
         Sanitize Erase Supported: False
         Shingled Magnetic Recording Support: None

      physicaldrive 2I:1:6
         Port: 2I
         Box: 1
         Bay: 6
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 600 GB
         Drive exposed to OS: False
         Logical/Physical Block Size: 512/512
         Rotational Speed: 10000
         Firmware Revision: C3B0
         Serial Number: PZK1EU5D
         WWID: 5000CCA03CABBAED
         Model: HITACHI HUC10606 CLAR600
         Current Temperature (C): 29
         PHY Count: 2
         PHY Transfer Rate: 6.0Gbps, Unknown
         Drive Authentication Status: OK
         Carrier Application Version: 11
         Carrier Bootloader Version: 6
         Sanitize Erase Supported: False
         Shingled Magnetic Recording Support: None

      physicaldrive 2I:1:7
         Port: 2I
         Box: 1
         Bay: 7
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 600 GB
         Drive exposed to OS: False
         Logical/Physical Block Size: 512/512
         Rotational Speed: 10000
         Firmware Revision: C3B0
         Serial Number: PZJJ37VD
         WWID: 5000CCA03C8E04A1
         Model: HITACHI HUC10606 CLAR600
         Current Temperature (C): 29
         PHY Count: 2
         PHY Transfer Rate: 6.0Gbps, Unknown
         Drive Authentication Status: OK
         Carrier Application Version: 11
         Carrier Bootloader Version: 6
         Sanitize Erase Supported: False
         Shingled Magnetic Recording Support: None

      physicaldrive 2I:1:8
         Port: 2I
         Box: 1
         Bay: 8
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 600 GB
         Drive exposed to OS: False
         Logical/Physical Block Size: 512/512
         Rotational Speed: 10000
         Firmware Revision: C3B0
         Serial Number: PZJAH46D
         WWID: 5000CCA03C83CE25
         Model: HITACHI HUC10606 CLAR600
         Current Temperature (C): 28
         PHY Count: 2
         PHY Transfer Rate: 6.0Gbps, Unknown
         Drive Authentication Status: OK
         Carrier Application Version: 11
         Carrier Bootloader Version: 6
         Sanitize Erase Supported: False
         Shingled Magnetic Recording Support: None
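
To match a serial number reported by smartctl to a physical bay, the ssacli listing above can be parsed into records. Below is a minimal Python sketch, assuming the `key: value` layout shown above; `parse_ssacli_pd` and `find_by_serial` are illustrative names, not part of ssacli.

```python
def parse_ssacli_pd(text):
    """Parse `ssacli ctrl slot=N pd all show detail` output into a list of dicts."""
    drives = []
    array = None
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Array "):
            array = line.split()[1]              # e.g. "A" or "B"
        elif line.startswith("physicaldrive "):
            current = {"id": line.split()[1], "array": array}
            drives.append(current)
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)      # split on the first colon only
            current[key.strip()] = value.strip()
    return drives


def find_by_serial(drives, serial):
    """Return the drive record with the given serial number, or None."""
    for drive in drives:
        if drive.get("Serial Number") == serial:
            return drive
    return None
```

For the listing above, `find_by_serial(drives, "PZJZ06RD")` should return the record for physicaldrive 1I:1:3 in Array B, which is the drive that `-d cciss,2` reported.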

Is there a way to determine which disk is generating the critical medium errors in the second array? I don't want to replace every disk if only one of them is failing...