Edge HA peer communication
When you configure two Edges as active-standby peers for HA, they communicate with each other at regular intervals. The communication is required to ensure that the peers have their blockstores synchronized and that they are operating correctly based on their status (active or standby).
Blockstore synchronization happens through two network interfaces that you configure for this purpose on the Edge. Ideally, these are dedicated interfaces, preferably connected through crossover cables. Although it is not the preferred method, you can send blockstore synchronization traffic through interfaces that are already used for another purpose. If interfaces must be shared, choose lightly loaded ones: for example, interfaces that carry management traffic.
The interfaces used for blockstore synchronization traffic are also used by each peer to check the status of the other through heartbeat messages. The heartbeat messages provide each peer with the status of the other peer and can include peer configuration details.
By default, a heartbeat message is sent every 3 seconds through TCP port 7972. If a peer fails to receive three successive heartbeat messages, a failover event can be triggered. Because heartbeat messages are sent in both directions between Edge peers, there is a worst-case scenario in which failover can take up to 18 seconds (3-second interval x 3 missed messages x 2 directions).
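The timing described above can be illustrated with a minimal Python sketch. It is not the Edge's actual implementation: the class and function names are hypothetical, and only the default values (3-second interval, three missed messages, two directions) come from the text.

```python
from dataclasses import dataclass

HEARTBEAT_INTERVAL_S = 3  # default heartbeat interval, per the text
MISSED_LIMIT = 3          # successive missed heartbeats before failover can trigger
DIRECTIONS = 2            # heartbeats are sent in both directions between peers


def worst_case_failover_seconds(interval=HEARTBEAT_INTERVAL_S,
                                missed=MISSED_LIMIT,
                                directions=DIRECTIONS):
    """Return the worst-case failover time as interval x missed x directions."""
    return interval * missed * directions


@dataclass
class HeartbeatMonitor:
    """Hypothetical counter of successive missed heartbeats (illustration only)."""
    missed_limit: int = MISSED_LIMIT
    missed: int = 0

    def on_interval(self, received: bool) -> bool:
        """Record one heartbeat interval; return True if failover can be triggered."""
        # A received heartbeat resets the count; only successive misses accumulate.
        self.missed = 0 if received else self.missed + 1
        return self.missed >= self.missed_limit
```

With the defaults, `worst_case_failover_seconds()` evaluates to 18. A received heartbeat resets the counter in `HeartbeatMonitor`, so two misses followed by a success do not count toward the limit.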
Failovers can also occur due to administrative intervention: for example, rebooting or powering off an Edge.
The blockstore synchronization traffic is sent between the peers using TCP port 7973. By default, the traffic uses the first of the two interfaces you configure. If the interface is not responding for some reason, the second interface is automatically used.
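The interface preference described above can be sketched as follows. This is an illustrative model only: the function name and the interface names in the usage note are hypothetical, and how the Edge actually detects an unresponsive interface is not covered here.

```python
from typing import Optional, Sequence, Set

SYNC_PORT = 7973  # TCP port for blockstore synchronization traffic, per the text


def pick_sync_interface(configured: Sequence[str],
                        responding: Set[str]) -> Optional[str]:
    """Return the first configured interface that is responding.

    `configured` lists the two synchronization interfaces in the order they
    were configured. Returns None when neither interface is operational.
    """
    for name in configured:
        if name in responding:
            return name
    return None
```

For example, with hypothetical interfaces `["if_a", "if_b"]`, the function returns `"if_a"` while it responds, `"if_b"` if only the second responds, and `None` if neither does.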
If neither interface is operational, the Edge peers enter a predetermined failover state based on the failure conditions.
The failover state on an Edge peer can be one of the following:
• Discover—Attempting to establish contact with the other peer.
• Active Sync—Actively serving client requests; the standby peer is in sync with the current state of the system.
• Standby Sync—Passively accepting updates from the active peer; in sync with the current state of the system.
• Active Degraded—Actively serving client requests; cannot contact the standby peer.
• Active Rebuild—Actively serving client requests; sending the standby peer updates that were missed during an outage.
• Standby Rebuild—Passively accepting updates from the active peer; not yet in sync with the state of the system.
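For scripts that track peer status, the states listed above could be modeled as an enumeration. The following Python sketch is an assumption-laden illustration: the state names come from the list, but the enum and the two helper functions are hypothetical and are not part of any Edge interface.

```python
from enum import Enum


class FailoverState(Enum):
    """Failover states from the list above (names as documented)."""
    DISCOVER = "Discover"
    ACTIVE_SYNC = "Active Sync"
    STANDBY_SYNC = "Standby Sync"
    ACTIVE_DEGRADED = "Active Degraded"
    ACTIVE_REBUILD = "Active Rebuild"
    STANDBY_REBUILD = "Standby Rebuild"


def serves_clients(state: FailoverState) -> bool:
    """True for the states described as actively serving client requests."""
    return state.value.startswith("Active")


def is_in_sync(state: FailoverState) -> bool:
    """True for the states described as in sync with the current system state."""
    return state.value.endswith("Sync")
```

Under this model, `serves_clients(FailoverState.ACTIVE_DEGRADED)` is true, while `is_in_sync(FailoverState.STANDBY_REBUILD)` is false.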
For detailed information about how to configure two Edges as active-standby failover peers, the various failover states that each peer can assume while in an HA deployment, and the procedure required to remove an active-standby pair from that state, see the Edge user guide.