Я использую ансамбль zookeeper из 3 узлов на ubuntu 14.04. Заметив, что node1 не запущен, мне пришлось удалить снимки этого узла, и теперь сервер действительно запустился, но отказывается присоединиться к кластеру.
Это журнал от zookeeper:
2016-01-15 17:07:28,187 - INFO [main:QuorumPeerConfig@101] - Reading configuration from: /etc/zookeeper/conf/zoo.cfg
2016-01-15 17:07:28,189 - INFO [main:QuorumPeerConfig@334] - Defaulting to majority quorums
2016-01-15 17:07:28,191 - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2016-01-15 17:07:28,191 - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2016-01-15 17:07:28,192 - INFO [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2016-01-15 17:07:28,198 - INFO [main:QuorumPeerMain@127] - Starting quorum peer
2016-01-15 17:07:28,204 - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181
2016-01-15 17:07:28,213 - INFO [main:QuorumPeer@913] - tickTime set to 2000
2016-01-15 17:07:28,213 - INFO [main:QuorumPeer@933] - minSessionTimeout set to -1
2016-01-15 17:07:28,213 - INFO [main:QuorumPeer@944] - maxSessionTimeout set to -1
2016-01-15 17:07:28,213 - INFO [main:QuorumPeer@959] - initLimit set to 10
2016-01-15 17:07:28,220 - INFO [main:QuorumPeer@429] - currentEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2016-01-15 17:07:28,223 - INFO [main:QuorumPeer@444] - acceptedEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2016-01-15 17:07:28,228 - INFO [Thread-1:QuorumCnxManager$Listener@486] - My election bind port: 0.0.0.0/0.0.0.0:3888
2016-01-15 17:07:28,233 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING
2016-01-15 17:07:28,234 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@740] - New election. My id = 1, proposed zxid=0x0
2016-01-15 17:07:28,235 - INFO [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 1 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)
2016-01-15 17:07:28,237 - INFO [WorkerSender[myid=1]:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (2, 1)
2016-01-15 17:07:28,237 - INFO [WorkerSender[myid=1]:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (3, 1)
2016-01-15 17:07:28,238 - INFO [zk1/10.0.0.101:3888:QuorumCnxManager$Listener@493] - Received connection request /10.0.0.103:48431
2016-01-15 17:07:28,239 - INFO [zk1/10.0.0.101:3888:QuorumCnxManager$Listener@493] - Received connection request /10.0.0.102:46486
2016-01-15 17:07:28,239 - INFO [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 3 (n.leader), 0x1500000072 (n.zxid), 0x20 (n.round), LEADING (n.state), 3 (n.sid), 0x15 (n.peerEPoch), LOOKING (my state)
2016-01-15 17:07:28,240 - INFO [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 3 (n.leader), 0x1500000072 (n.zxid), 0x20 (n.round), FOLLOWING (n.state), 2 (n.sid), 0x15 (n.peerEPoch), LOOKING (my state)
2016-01-15 17:07:28,240 - INFO [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 3 (n.leader), 0x1500000072 (n.zxid), 0x20 (n.round), LEADING (n.state), 3 (n.sid), 0x15 (n.peerEPoch), LOOKING (my state)
2016-01-15 17:07:28,240 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@738] - FOLLOWING
2016-01-15 17:07:28,241 - INFO [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 3 (n.leader), 0x1500000072 (n.zxid), 0x20 (n.round), FOLLOWING (n.state), 2 (n.sid), 0x15 (n.peerEPoch), FOLLOWING (my state)
2016-01-15 17:07:28,242 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@85] - TCP NoDelay set to: true
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:zookeeper.version=3.4.5--1, built on 06/10/2013 17:26 GMT
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:host.name=fx1.findix.com
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.version=1.8.0_66
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.vendor=Oracle Corporation
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.home=/usr/lib/jvm/java-8-oracle/jre
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.class.path=/etc/zookeeper/conf:/usr/share/java/jline.jar:/usr/share/java/log4j-1.2.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/xmlParserAPIs.jar:/usr/share/java/netty.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-log4j12.jar:/usr/share/java/zookeeper.jar
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.io.tmpdir=/tmp
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.compiler=<NA>
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.name=Linux
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.arch=amd64
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.version=3.19.0-43-generic
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.name=zookeeper
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.home=/var/lib/zookeeper
2016-01-15 17:07:28,246 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.dir=/
2016-01-15 17:07:28,247 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@162] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /var/lib/zookeeper/version-2 snapdir /var/lib/zookeeper/version-2
2016-01-15 17:07:28,247 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 13
2016-01-15 17:07:28,253 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@325] - Getting a snapshot from leader
2016-01-15 17:07:28,262 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@240] - Snapshotting: 0x16000009b6 to /var/lib/zookeeper/version-2/snapshot.16000009b6
2016-01-15 17:07:48,724 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.0.0.101:58156
2016-01-15 17:07:48,761 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@821] - Processing stat command from /10.0.0.101:58156
2016-01-15 17:07:48,763 - INFO [Thread-2:NIOServerCnxn$StatCommand@655] - Stat command output
2016-01-15 17:07:48,763 - INFO [Thread-2:NIOServerCnxn@1001] - Closed socket connection for client /10.0.0.101:58156 (no session established for client)
2016-01-15 17:08:02,003 - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@118] - Got zxid 0x16000009b7 expected 0x1
2016-01-15 17:08:02,003 - INFO [SyncThread:1:FileTxnLog@199] - Creating new log file: log.16000009b7
Соответствующая запись выглядит так: Got zxid 0x16000009b7 expected 0x1
Похоже, что в версии 3.4.5 есть ошибки, и это было исправлено в 3.4.6, поэтому вопрос теперь в том, как применить патч или как обновить в Ubuntu 14.04.3?
Кажется, это ошибка в 3.4.5: http://zookeeper.apache.org/releases.html : В выпуске исправлена критическая ошибка, которая могла помешать серверу присоединиться к установленному ансамблю.