I have a SUSE 12 system with an Intel 82599ES NIC (two 10-Gigabit SFI/SFP+ ports); the two ports are bonded with LACP. Recently the system's network became unreachable for 3 minutes.
2019-03-03T09:23:10.491731+08:00 oradb12 kernel: [9519285.192448] ixgbe 0000:02:00.1 eth5: initiating reset due to tx timeout
2019-03-03T09:23:10.491754+08:00 oradb12 kernel: [9519285.192464] ixgbe 0000:02:00.1 eth5: Reset adapter
2019-03-03T09:23:16.995739+08:00 oradb12 kernel: [9519291.696952] ixgbe 0000:02:00.1 eth5: speed changed to 0 for port eth5
2019-03-03T09:23:16.995763+08:00 oradb12 kernel: [9519291.697438] bond1: link status definitely down for interface eth5, disabling it
Linux oradb12 4.4.74-92.35-default #1 SMP Mon Aug 7 18:24:48 UTC 2017 (c0fdc47) x86_64 x86_64 x86_64 GNU/Linux
oradb12:/etc/sysconfig/network # cat /etc/SuSE-release
SUSE Linux Enterprise Server 12 (x86_64)
VERSION = 12
PATCHLEVEL = 2
oradb12:/etc/sysconfig/network # cat ifcfg-bond1
BOOTPROTO='static'
STARTMODE='onboot'
BONDING_MASTER='yes'
BONDING_SLAVE0='eth3'
BONDING_SLAVE1='eth5'
IPADDR=10.252.128.2
GATEWAY=10.252.128.1
NETMASK=255.255.255.0
USERCONTROL='no'
BONDING_MODULE_OPTS='mode=4 miimon=100 use_carrier=1'
oradb12:/etc/sysconfig/network # cat ifcfg-eth3
NAME='bond1-slave-eth3'
TYPE='Ethernet'
BOOTPROTO='none'
STARTMODE='onboot'
MASTER='bond1'
SLAVE='yes'
USERCONTROL='no'
oradb12:/etc/sysconfig/network # cat ifcfg-eth5
NAME='bond1-slave-eth5'
TYPE='Ethernet'
BOOTPROTO='none'
STARTMODE='onboot'
MASTER='bond1'
SLAVE='yes'
USERCONTROL='no'
oradb12:/etc/sysconfig/network # cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 48:fd:8e:c9:21:64
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 2
Actor Key: 13
Partner Key: 10273
Partner Mac Address: 74:4a:a4:08:ea:14
Slave Interface: eth3
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 48:fd:8e:c9:21:64
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
system priority: 65535
system mac address: 48:fd:8e:c9:21:64
port key: 13
port priority: 255
port number: 1
port state: 61
details partner lacp pdu:
system priority: 32768
system mac address: 74:4a:a4:08:ea:14
oper key: 10273
port priority: 32768
port number: 33
port state: 61
Slave Interface: eth5
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 24
Permanent HW addr: 48:fd:8e:c9:21:65
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
system priority: 65535
system mac address: 48:fd:8e:c9:21:64
port key: 13
port priority: 255
port number: 2
port state: 61
details partner lacp pdu:
system priority: 32768
system mac address: 74:4a:a4:08:ea:14
oper key: 10273
port priority: 32768
port number: 87
port state: 61
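The output above shows eth5 with a Link Failure Count of 24 against 1 for eth3, and a port state of 61 on both slaves. Below is a minimal Python sketch for extracting the per-slave failure counts from this kind of output and decoding the LACP port state value. The function names `link_failures` and `decode_port_state` are hypothetical helpers, the parser assumes the field labels exactly as printed above, and the bit order follows the actor/partner state field of IEEE 802.3ad.

```python
import re

# LACP state bits per IEEE 802.3ad, listed from bit 0 (least significant) up.
LACP_STATE_BITS = [
    "activity",         # 1 = active LACP, 0 = passive
    "timeout",          # 1 = short timeout, 0 = long timeout
    "aggregation",
    "synchronization",
    "collecting",
    "distributing",
    "defaulted",
    "expired",
]


def decode_port_state(value):
    """Return the names of the bits set in an LACP port state byte."""
    return [name for bit, name in enumerate(LACP_STATE_BITS) if value & (1 << bit)]


def link_failures(bonding_text):
    """Map each slave interface name to its 'Link Failure Count' value."""
    counts = {}
    slave = None
    for raw in bonding_text.splitlines():
        line = raw.strip()
        m = re.match(r"Slave Interface:\s*(\S+)", line)
        if m:
            slave = m.group(1)  # following fields belong to this slave
            continue
        m = re.match(r"Link Failure Count:\s*(\d+)", line)
        if m and slave is not None:
            counts[slave] = int(m.group(1))
    return counts


# Small excerpt of the output shown above, used as sample input.
sample = """Slave Interface: eth3
MII Status: up
Link Failure Count: 1
Slave Interface: eth5
MII Status: up
Link Failure Count: 24
"""

print(link_failures(sample))
print(decode_port_state(61))
```

On a live system the same function could be fed the contents of /proc/net/bonding/bond1 at intervals, so that an increase in the counter for one slave can be lined up against the ixgbe reset messages in the kernel log. A port state of 61 decodes to activity, aggregation, synchronization, collecting and distributing, with the timeout bit clear, which agrees with the "LACP rate: slow" line.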
oradb12:/etc/sysconfig/network # ethtool -i eth3
driver: ixgbe
version: 4.2.1-k
firmware-version: 0x800003df
expansion-rom-version:
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
oradb12:/etc/sysconfig/network # ethtool -i eth5
driver: ixgbe
version: 4.2.1-k
firmware-version: 0x800003df
expansion-rom-version:
bus-info: 0000:02:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
When the network interface goes down, restarting the network service on the server (service network restart) seems to fix the problem.
I was wondering whether anyone has run into similar problems before, and whether there are any suggestions for debugging the cause of something like this?