Network access stalls intermittently on a CentOS 7 + ZFS + Samba setup

I have a storage server running CentOS 7 with 4 hard drives in raidz (the ZFS counterpart of RAID 5). I access the files on it through Samba.

When I copy files from it or watch movies, reads over Samba occasionally block for about 10 seconds and then resume. I don't know exactly how often it stalls, but it is roughly once every 5-10 minutes. It might also be once per X MB of data read ...
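To put numbers on how often the stalls occur, the reads can be timed chunk by chunk. The following is only a minimal sketch: the function name `find_stalls`, the 1 MiB chunk size and the 1 second threshold are illustrative assumptions, not something taken from the setup above.

```python
import time


def find_stalls(path, chunk_size=1024 * 1024, threshold=1.0):
    """Read `path` sequentially and return a list of (offset, seconds)
    for every chunk whose read took longer than `threshold` seconds."""
    stalls = []
    offset = 0
    # buffering=0 disables Python's own buffering, so each read() call
    # corresponds to a single OS-level read request.
    with open(path, "rb", buffering=0) as f:
        while True:
            start = time.monotonic()
            chunk = f.read(chunk_size)
            elapsed = time.monotonic() - start
            if not chunk:  # empty bytes object means end of file
                break
            if elapsed > threshold:
                stalls.append((offset, elapsed))
            offset += len(chunk)
    return stalls
```

Running it once against a large file on the mounted Samba share and once against the same file directly on the server could help show whether the pauses come from the pool itself or only appear over the network. Files that are already cached will not reproduce the problem, so a file that has not been read recently is a better test subject.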

The same machine worked fine on CentOS 6 with software RAID.

There is nothing in /var/log/messages from the times when I was accessing the files.

ZFS status (I ran a scrub manually earlier and it found no errors):

# zpool status -v
  pool: backup
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 2h45m with 0 errors on Sun Nov 12 08:15:09 2017
config:

        NAME        STATE     READ WRITE CKSUM
        backup      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda6    ONLINE       0     0     0
            sdb6    ONLINE       0     0     0
        cache
          sde3      ONLINE       0     0     0

errors: No known data errors

  pool: storage
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda5    ONLINE       0     0     0
            sdb5    ONLINE       0     0     0
            sdc5    ONLINE       0     0     0
            sdd5    ONLINE       0     0     0
        cache
          sde2      ONLINE       0     0     0

errors: No known data errors

In smbd.log I see only this (I have since disabled wide links):

[2018/02/13 10:08:59.532259,  0] ../source3/param/loadparm.c:4485(widelinks_warning)
  Share 'storage' has wide links and unix extensions enabled. These parameters are incompatible. Wide links will be disabled for this share.

I noticed these messages in dmesg (they may have nothing to do with the stalls):

[ 1247.680530] perf: interrupt took too long (2513 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[ 1551.176577] perf: interrupt took too long (3145 > 3141), lowering kernel.perf_event_max_sample_rate to 63000
[ 5137.178646] perf: interrupt took too long (3970 > 3931), lowering kernel.perf_event_max_sample_rate to 50000
[ 5231.736533] perf: interrupt took too long (4969 > 4962), lowering kernel.perf_event_max_sample_rate to 40000
[ 5824.261569] perf: interrupt took too long (6215 > 6211), lowering kernel.perf_event_max_sample_rate to 32000
[ 7051.322619] perf: interrupt took too long (7783 > 7768), lowering kernel.perf_event_max_sample_rate to 25000

Update

Here is some information from smartctl:

/dev/sda

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%       528         -

/dev/sdb

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     39047         -

/dev/sdc

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     19547         -

/dev/sdd

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     22117         -
# 2  Short offline       Completed without error       00%      4022         -

The overall health status for all drives reads "PASSED".

There are no entries in the error logs.

Here are the SMART attributes:

/dev/sda

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always       -       8975
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       74
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       528
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       74
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       8
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       65
194 Temperature_Celsius     0x0022   121   090   000    Old_age   Always       -       31
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

/dev/sdb 

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       3
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always       -       8441
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       149
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   047   047   000    Old_age   Always       -       39048
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       147
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       63
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       86
194 Temperature_Celsius     0x0022   120   100   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

/dev/sdc

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   171   171   021    Pre-fail  Always       -       4416
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       130
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   074   074   000    Old_age   Always       -       19549
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       130
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       47
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       97
194 Temperature_Celsius     0x0022   117   101   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       3

/dev/sdd

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   172   172   021    Pre-fail  Always       -       4366
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       149
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   070   070   000    Old_age   Always       -       22120
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       149
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       61
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       147
194 Temperature_Celsius     0x0022   114   099   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
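
Comparing four attribute tables by eye is error-prone, so a short script can pick out the error-related attributes with a nonzero raw value. This is a rough sketch that assumes the 10-column layout shown above; the function name `nonzero_watched` and the watchlist are illustrative choices, not an official list, and the meaning of raw values is vendor-specific.

```python
# Attributes whose raw value is often checked for signs of disk trouble.
# This watchlist is an assumption made for illustration only.
WATCHED = {
    "Raw_Read_Error_Rate",
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
    "Multi_Zone_Error_Rate",
}


def nonzero_watched(smart_text):
    """Return {attribute_name: raw_value} for every watched attribute
    whose RAW_VALUE column is a nonzero integer."""
    found = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID;
        # header and title lines are skipped by this check.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) != 0:
            found[name] = int(raw)
    return found
```

Applied to the tables above, it reports Raw_Read_Error_Rate = 3 for /dev/sdb and Multi_Zone_Error_Rate = 3 for /dev/sdc, and nothing for /dev/sda and /dev/sdd. Values this small may well be harmless, so this only narrows down which drives deserve a closer look.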