
Unable to start etcd service (put with unexpected smaller revision)

Introduction

After a recent power outage, the etcd service on a RHEL7 server no longer starts.

Details

Here is the journalctl output related to this error:

-- Subject: Unit etcd.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit etcd.service has begun starting up.
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=https://10.0.0.51:2379
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_CA_FILE=/etc/etcd/ca.crt
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_CERT_FILE=/etc/etcd/server.crt
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_DEBUG=False
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_ELECTION_TIMEOUT=2500
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_HEARTBEAT_INTERVAL=500
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=https://10.0.0.51:2380
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_INITIAL_CLUSTER=docker1.example.example=https://10.0.0.51:2380
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_INITIAL_CLUSTER_STATE=new
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_KEY_FILE=/etc/etcd/server.key
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_LISTEN_PEER_URLS=https://10.0.0.51:2380
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_PEER_CA_FILE=/etc/etcd/ca.crt
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_PEER_CERT_FILE=/etc/etcd/peer.crt
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized and used environment variable ETCD_PEER_KEY_FILE=/etc/etcd/peer.key
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized environment variable ETCD_NAME, but unused: shadowed by corresponding flag 
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized environment variable ETCD_DATA_DIR, but unused: shadowed by corresponding flag 
Aug 08 14:44:42 docker1.example.example etcd[10997]: recognized environment variable ETCD_LISTEN_CLIENT_URLS, but unused: shadowed by corresponding flag 
Aug 08 14:44:42 docker1.example.example etcd[10997]: etcd Version: 3.2.5
Aug 08 14:44:42 docker1.example.example etcd[10997]: Git SHA: d0d1a87
Aug 08 14:44:42 docker1.example.example etcd[10997]: Go Version: go1.8.3
Aug 08 14:44:42 docker1.example.example etcd[10997]: Go OS/Arch: linux/amd64
Aug 08 14:44:42 docker1.example.example etcd[10997]: setting maximum number of CPUs to 12, total number of available CPUs is 12
Aug 08 14:44:42 docker1.example.example etcd[10997]: found invalid file/dir default.etcd under data dir /var/lib/etcd/ (Ignore this if you are upgrading etcd)
Aug 08 14:44:42 docker1.example.example etcd[10997]: found invalid file/dir etcd.etcd under data dir /var/lib/etcd/ (Ignore this if you are upgrading etcd)
Aug 08 14:44:42 docker1.example.example etcd[10997]: the server is already initialized as member before, starting as etcd member...
Aug 08 14:44:42 docker1.example.example etcd[10997]: peerTLS: cert = /etc/etcd/peer.crt, key = /etc/etcd/peer.key, ca = /etc/etcd/ca.crt, trusted-ca = , client-cert-auth = false
Aug 08 14:44:42 docker1.example.example etcd[10997]: listening for peers on https://10.0.0.51:2380
Aug 08 14:44:42 docker1.example.example etcd[10997]: listening for client requests on 10.0.0.51:2379
Aug 08 14:44:42 docker1.example.example etcd[10997]: recovered store from snapshot at index 31700317
Aug 08 14:44:42 docker1.example.example etcd[10997]: restore compact to 25716352
Aug 08 14:44:42 docker1.example.example etcd[10997]: store.keyindex: put with unexpected smaller revision [{25716268 0} / {25716732 0}]
Aug 08 14:44:42 docker1.example.example bash[10997]: panic: store.keyindex: put with unexpected smaller revision [{25716268 0} / {25716732 0}]
Aug 08 14:44:42 docker1.example.example bash[10997]: goroutine 94 [running]:
Aug 08 14:44:42 docker1.example.example bash[10997]: github.com/coreos/pkg/capnslog.(*PackageLogger).Panicf(0xc420156100, 0xf47183, 0x3e, 0xc420041cb0, 0x2, 0x2)
Aug 08 14:44:42 docker1.example.example bash[10997]: /builddir/build/BUILD/etcd-d0d1a87aa96ae14914751d42264262cb69eda170/Godeps/_workspace/src/github.com/coreos/pkg/capnslog/pkg_logger.go:75 +0x15c
Aug 08 14:44:42 docker1.example.example bash[10997]: github.com/coreos/etcd/mvcc.(*keyIndex).put(0xc42065a340, 0x188662c, 0x0)
Aug 08 14:44:42 docker1.example.example bash[10997]: /builddir/build/BUILD/etcd-d0d1a87aa96ae14914751d42264262cb69eda170/src/github.com/coreos/etcd/mvcc/key_index.go:80 +0x3ec
Aug 08 14:44:42 docker1.example.example bash[10997]: github.com/coreos/etcd/mvcc.restoreIntoIndex.func1(0xc42028cd90, 0xc4202e6060, 0x15cb8e0, 0xc4202cccc0)
Aug 08 14:44:42 docker1.example.example bash[10997]: /builddir/build/BUILD/etcd-d0d1a87aa96ae14914751d42264262cb69eda170/src/github.com/coreos/etcd/mvcc/kvstore.go:366 +0x3e3
Aug 08 14:44:42 docker1.example.example bash[10997]: created by github.com/coreos/etcd/mvcc.restoreIntoIndex
Aug 08 14:44:42 docker1.example.example bash[10997]: /builddir/build/BUILD/etcd-d0d1a87aa96ae14914751d42264262cb69eda170/src/github.com/coreos/etcd/mvcc/kvstore.go:373 +0xa5
Aug 08 14:44:42 docker1.example.example systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Aug 08 14:44:43 docker1.example.example systemd[1]: Failed to start Etcd Server.
-- Subject: Unit etcd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit etcd.service has failed.
-- 
-- The result is failed.

I believe the most important line of this output is:

store.keyindex: put with unexpected smaller revision [{25716268 0} / {25716732 0}]

So I looked through the etcd source code (the 3.2 branch) and found the function that raises this panic:

// put puts a revision to the keyIndex.
func (ki *keyIndex) put(main int64, sub int64) {
    rev := revision{main: main, sub: sub}

    if !rev.GreaterThan(ki.modified) {
        plog.Panicf("store.keyindex: put with unexpected smaller revision [%v / %v]", rev, ki.modified)
    }
    if len(ki.generations) == 0 {
        ki.generations = append(ki.generations, generation{})
    }
    g := &ki.generations[len(ki.generations)-1]
    if len(g.revs) == 0 { // create a new key
        keysGauge.Inc()
        g.created = rev
    }
    g.revs = append(g.revs, rev)
    g.ver++
    ki.modified = rev
}
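To check my reading of the panic message, here is a minimal standalone sketch of that guard, fed with the two revisions from the log. It assumes that revisions are ordered by `main` first and then by `sub`, which is how I understand `GreaterThan` to work. The `putAllowed` helper is hypothetical and exists only for this illustration; it is not part of etcd.

```go
package main

import "fmt"

// revision mirrors the (main, sub) pair printed in the panic message.
type revision struct {
	main int64
	sub  int64
}

// GreaterThan orders revisions by main first, then by sub.
// Assumption: this matches the ordering used by the real check.
func (a revision) GreaterThan(b revision) bool {
	if a.main != b.main {
		return a.main > b.main
	}
	return a.sub > b.sub
}

// putAllowed is a hypothetical helper: it reports whether the guard in
// keyIndex.put would accept rev for a key last modified at modified.
func putAllowed(rev, modified revision) bool {
	return rev.GreaterThan(modified)
}

func main() {
	rev := revision{main: 25716268, sub: 0}      // first value in the log message
	modified := revision{main: 25716732, sub: 0} // second value: the key's last modification

	fmt.Println(putAllowed(rev, modified)) // false, so put panics
	fmt.Println(modified.main - rev.main)  // how far behind the restored revision is
}
```

So, if I read this correctly, during restore the index is asked to record revision 25716268 for a key whose last recorded modification is already 25716732, which is 464 revisions later. The guard rejects anything that is not strictly newer.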

I'm not sure how to trace this any further.

Question

How can I get etcd to start normally as a service again?