Configuring High Availability on SteelHead SD
This topic describes how to configure high availability (HA) on SteelHead SD 2.0. It includes these sections:
Overview of HA on SteelHead SD
Prerequisites
Configuring a SteelHead SD HA pair
Monitoring a high-availability pair
Troubleshooting
Previous versions of SteelHead SD supported an active-passive HA scheme. You can’t seamlessly upgrade a SteelHead SD 1.0 (SCM 2.10) HA pair to SteelHead SD 2.0 (SteelConnect 2.12) HA. You must first manually unpair your master and backup appliances in SCM, then upgrade from SteelHead SD 1.0 (SCM 2.10) to SteelConnect 2.12, and finally reconfigure HA in SCM.
Overview of HA on SteelHead SD
SteelHead SD provides active-active HA for 570-SD, 770-SD, and 3070-SD appliances.
SteelConnect 2.12 provides active-active HA for SteelConnect SDI-2030 appliances located at the data center.
With active-active HA support, when a fault is detected, traffic is immediately routed to the peer appliance so that both appliances function in tandem. Traffic can be sent over any uplink regardless of the role assigned to the SteelHead SD appliance (that is, master or backup appliance). Active-active HA simplifies the configuration of uplinks for the HA pair of appliances.
Active-active HA deployment at the branch shows an example of a symmetric deployment where both appliances in the SteelHead SD HA pair are connected to WAN 1 and WAN 2 through four uplinks.
Active-active HA deployment at the branch
SteelHead SD also supports asymmetric HA deployments.
Asymmetric HA deployment
SteelHead SD includes these HA features:
Symmetric and asymmetric connectivity.
Layer 2 (L2) and Layer 3 (L3) LAN topologies.
OSPF and BGP where SteelHead SD can peer with a router.
Backup standby HA link over a LAN port if the AUX port is unreachable.
Master role under Appliances > Appliances Overview: HA tab.
Mixed-mode HA, where one SteelHead SD appliance is licensed for SD-WAN-only mode and the peer SteelHead SD appliance is licensed for SD-WAN and WAN optimization modes.
Dedicated HA link for the SteelHead SD HA pair so that the peer appliances operate as a single logical unit.
Autoconfiguration of the HA partner for bootstrapping when SCM connectivity with a peer is not accessible.
Integration with SCM health check for visibility and troubleshooting.
Zscaler support for HA deployments.
Symmetric and asymmetric uplink connectivity
SteelHead SD provides symmetric and asymmetric uplink connectivity:
Symmetric - In symmetric mode, each peer appliance is connected to all uplinks so that they essentially act as a single appliance. For example, with two WANs connected to both peer appliances, the pair has four uplinks. Each uplink operates as a separate tunnel with its own IP address. If there is an uplink failure, the tunnel on that uplink goes down and the traffic is moved to the backup appliance. The 3070-SD supports up to 6 uplinks; for example, 1 internet WAN and 2 MPLS WANs connected to both appliances give a total of 6 uplinks.
Asymmetric - In asymmetric mode, different WANs are connected to the peer appliances. If there is an appliance failure or a LAN-side failover, the peer appliance becomes the master appliance.
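The difference between the two modes can be sketched in a few lines of Python. This is only an illustrative model, not SCM behavior: the appliance names, WAN names, and both helper functions are hypothetical. It treats each appliance-to-WAN connection as one uplink, as described above.

```python
# Hypothetical model: each appliance maps to the set of WANs it is cabled to.
def classify_deployment(wans_by_appliance):
    """Return 'symmetric' if every appliance connects to every WAN, else 'asymmetric'."""
    all_wans = set().union(*wans_by_appliance.values())
    if all(wans == all_wans for wans in wans_by_appliance.values()):
        return "symmetric"
    return "asymmetric"


def total_uplinks(wans_by_appliance):
    """Count one uplink (its own tunnel and IP address) per appliance-to-WAN connection."""
    return sum(len(wans) for wans in wans_by_appliance.values())


# Assumed example: 1 internet WAN and 2 MPLS WANs on both appliances.
symmetric_pair = {
    "sd-a": {"internet", "mpls-1", "mpls-2"},
    "sd-b": {"internet", "mpls-1", "mpls-2"},
}
# Assumed example: each appliance is connected to a different WAN.
asymmetric_pair = {"sd-a": {"internet"}, "sd-b": {"mpls-1"}}

print(classify_deployment(symmetric_pair), total_uplinks(symmetric_pair))    # symmetric 6
print(classify_deployment(asymmetric_pair), total_uplinks(asymmetric_pair))  # asymmetric 2
```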
Symmetric and asymmetric HA deployment examples at the branch
Layer 2 and Layer 3 support at the branch
With SteelHead SD 2.0, you can configure BGP and OSPF on the LAN branch.
You can configure iBGP between SteelHead SD HA peers if you want your overlay network to be advertised between the two appliances so that their routing tables are kept in synchronization. Also, you can have a combination of L2 and L3 zones so that if you have more than one LAN port configured, they can be a mix of L2 and L3. SteelHead SD uses iBGP between the peers to redistribute the overlay and connected routes.
LAN connectivity can be through either an L2 switched domain or L3. In the case of an L3 LAN, connectivity is established through dynamic routing. SteelHead SD supports:
L3 LAN - You can redistribute static, connected, overlay, and WAN routes on both appliances in the HA pair. Your client traffic can go to either appliance in the HA pair. Using route convergence, the master processes the traffic and sends it on the overlay network.
L2 LAN - With L2, you can have a switch on the LAN side connected to SteelHead SD appliances that share the same LAN zone, with a different IP address for each appliance. The system assigns a single virtual IP address (VIP) on the zone that is owned by the master appliance. All traffic goes to the master appliance, which sends it on the overlay network. If there is a failure, the VIP moves to the backup appliance, which becomes the new master.
Multigroup VIP and Virtual Router Redundancy Protocol (VRRP) with a third-party router are not supported at this time.
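The L2 behavior described above, where the master owns the VIP and the backup takes it over on failure, can be pictured with a toy model. The class names, fields, and addresses below are assumptions made for illustration; the real role negotiation happens on the appliances and is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Appliance:
    name: str
    lan_ip: str           # per-appliance address in the shared LAN zone
    healthy: bool = True


class L2HaPair:
    """Toy model: the master owns the zone VIP; on failure the backup takes over."""

    def __init__(self, master, backup, vip):
        self.master = master
        self.backup = backup
        self.vip = vip

    def vip_owner(self):
        # The VIP always follows the master role.
        return self.master.name

    def handle_failure(self):
        # Swap roles only when the master is down and the backup can take over.
        if not self.master.healthy and self.backup.healthy:
            self.master, self.backup = self.backup, self.master


pair = L2HaPair(Appliance("sd-a", "10.0.0.2"), Appliance("sd-b", "10.0.0.3"), vip="10.0.0.1")
print(pair.vip_owner())      # sd-a
pair.master.healthy = False  # simulate a master failure
pair.handle_failure()
print(pair.vip_owner())      # sd-b
```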
Failure conditions
SteelHead SD supports appliance, uplink, LAN, and dedicated port failure conditions. This list describes some typical use cases:
Appliance failure - For power, hardware, or VM failures, the master role is moved to the peer appliance. The VIP is moved to the new master appliance and L3 advertisements are stopped from the previous master appliance.
LAN failure - For an L2 LAN failure, the VIP moves to the backup appliance and MPLS connectivity is withdrawn. Traffic is sent through the backup appliance. For an L3 LAN failure, routing converges to send traffic to the backup appliance. Traffic is moved between appliances through the AUX port depending on which uplink the traffic needs to exit the HA pair.
AUX port failure - For an AUX port failure, you can configure a LAN-side HA standby link to avoid a split-brain scenario if the AUX link goes down. The standby LAN HA link also provides connectivity via the LAN to SCM when a SteelHead SD appliance does not have an internet uplink. With a backup LAN HA link configured, when the AUX link fails, the HA traffic is switched to the LAN-side link. The AUX port is still the primary HA link, so that when the AUX link comes back up, traffic is switched back to the AUX link.
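The AUX port failure case amounts to a simple preference order for the HA link. The sketch below shows that order under stated assumptions: the function name, its parameters, and the string labels are hypothetical and do not correspond to any appliance API.

```python
def select_ha_link(aux_up, standby_lan_configured, lan_up=True):
    """Pick the link that carries HA traffic.

    AUX is always preferred; the standby LAN link is used only while AUX is down.
    """
    if aux_up:
        return "AUX"
    if standby_lan_configured and lan_up:
        return "LAN"
    return None  # no HA link available: the pair risks a split-brain condition


# The AUX link fails and then recovers: HA traffic moves to the LAN link and back.
for aux_state in (True, False, True):
    print(select_ha_link(aux_state, standby_lan_configured=True))
```

Without a standby link configured, the same AUX failure leaves the function returning None, which corresponds to the split-brain risk the standby link is meant to avoid.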
Prerequisites
Before configuring high availability, check these requirements and recommendations:
Both appliances must be running the same software version.
Both appliances must be cabled directly on the LAN branch using the AUX port.
In an L2 deployment, peer appliances must be located in the same zone of the branch network.
In an L3 deployment, two zones must be created, one for each appliance.
If you enable a standby LAN link for an AUX link failure, the standby HA LAN link must be part of a switched domain and a loopback IP address must be configured that is unique across all organizations.
If you have two high-availability (HA) appliances that have the same public IP address, tunnels with the two HA appliances can’t be established, because the appliances would appear identical. You must override the AutoVPN port to ensure tunnels between the two HA sites are established. For details, see Overview of AutoVPN on SteelHead SD.
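Two of the requirements above lend themselves to a quick pre-check before pairing: matching software versions and the shared public IP case. The following is a minimal sketch, not an SCM feature. The ApplianceInfo record and its field names are hypothetical, and the rule that two appliances behind the same public IP need different AutoVPN port overrides is an assumption drawn from the statement that they would otherwise appear identical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApplianceInfo:                 # hypothetical record, not an SCM object
    name: str
    software_version: str
    public_ip: str
    autovpn_port_override: Optional[int] = None  # None means the default port


def check_prerequisites(a, b):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if a.software_version != b.software_version:
        problems.append("software versions differ")
    # Assumption: same public IP and same AutoVPN port makes the appliances look identical.
    if a.public_ip == b.public_ip and a.autovpn_port_override == b.autovpn_port_override:
        problems.append("same public IP without distinct AutoVPN port overrides")
    return problems


# Example values are made up for illustration.
first = ApplianceInfo("sd-a", "2.12", "203.0.113.10")
second = ApplianceInfo("sd-b", "2.12", "203.0.113.10")
print(check_prerequisites(first, second))

second.autovpn_port_override = 4501  # arbitrary example port
print(check_prerequisites(first, second))
```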
Configuring a SteelHead SD HA pair
These tasks assume that you have installed, registered, and performed the initial configuration of the SteelHead SD HA pair. You should create your branch site where the HA pair will be located, along with the associated zone and uplinks. For details, see the SteelConnect Manager User Guide. This section contains these topics:
Configuring the AUX port on the HA pair
Configuring the LAN zone for the SteelHead SD HA pair
Assigning the LAN zone to the SteelHead SD HA pair
Configuring the appliances in an HA pair
Configuring a standby LAN HA link
Configuring the AUX port on the HA pair
The first task is to configure the AUX port on the SteelHead SD HA pair. You will select the HA or Cluster mode for the port.
If you have two SteelHead SD appliances in HA mode, then the AUX port must be used for the interconnection so it will not be available as an additional WAN uplink.
To configure the AUX port on the master and backup SteelHead SD appliances
1. On the first appliance in the pair, choose Appliances > Ports and select the site from the Site drop-down list.
2. Under Appliances, select the appliance. The ports for the appliance are displayed.
3. Select the AUX port to expand the page.
4. Under Mode, select HA from the Port mode drop-down menu.
Configuring the AUX port on the HA pair
5. Click Submit.
6. Repeat Step 1 through Step 5 for the peer appliance in the HA pair.
After you have specified HA for the port mode, SCM displays this alert: HA Port active: This port has been configured to serve as a dedicated port for HA.
Configuring the LAN zone for the SteelHead SD HA pair
The next task is to configure the LAN zone for the SteelHead SD HA pair.
To configure the LAN zone
1. Choose Network Design > Zones.
2. Select the Zone for the appliance to expand the page.
3. Under IPv4 Network and IPv4 Gateway, specify the gateway IP address.
Configuring the LAN zone gateway
4. Click Submit.
5. For L3 LAN topologies, repeat Step 1 through Step 4 to create an additional zone.
Assigning the LAN zone to the SteelHead SD HA pair
After you configure the LAN zones, you must assign the LAN ports to the zones:
If the LAN-side network is L2, the same zone must be attached to the LAN port on both appliances.
If the LAN-side network is L3, a different zone must be attached to the LAN port for each of the appliances.
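The two zone rules above can be expressed as a small check. This is a sketch only; the function name and the zone names in the example are hypothetical.

```python
def zone_assignment_is_valid(topology, zone_on_first, zone_on_second):
    """Check the LAN zone rule for an HA pair.

    L2: both appliances attach the same zone to their LAN port.
    L3: each appliance attaches a different zone to its LAN port.
    """
    if topology == "L2":
        return zone_on_first == zone_on_second
    if topology == "L3":
        return zone_on_first != zone_on_second
    raise ValueError("topology must be 'L2' or 'L3'")


print(zone_assignment_is_valid("L2", "branch-lan", "branch-lan"))      # True
print(zone_assignment_is_valid("L3", "branch-lan-a", "branch-lan-b"))  # True
print(zone_assignment_is_valid("L3", "branch-lan", "branch-lan"))      # False
```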
To assign the LAN port to the zone
1. To assign the appliance port to the zone, choose Appliances > Ports.
2. Select the site from the Site list.
3. Select the LAN port to expand the pane.
Configuring the LAN port
4. Under Port Mode, select Singlezone or Multizone. If you select Singlezone, select the zone from the drop-down list.
5. Click Submit.
6. Repeat Step 1 through Step 5 for each appliance port that needs to be assigned to a zone.
Configuring the appliances in an HA pair
To configure the appliances into an HA pair
1. Choose Appliances and select the appliance.
2. Select the HA tab.
HA tab
3. Under High availability partner appliance, select the appliance that is in the branch.
4. Under Preferred HA Master, click On if you want this appliance to be the HA master.
5. Click Submit.
Once the two appliances are paired, you can see them negotiate their roles in the Appliances Overview page. The master and backup roles are assigned and appear for the paired appliances.
6. If you have an L2 zone in your network, click Configure Zone to configure the LAN interface IP addresses.
Configuring the LAN interfaces for L2 zones
7. Select the zone for the HA pair.
8. Enter the HA IP address for the current appliance.
9. Enter the HA IP address for the partner appliance.
10. Click Submit.
Configuring a standby LAN HA link
SteelConnect 2.12 supports one additional HA standby link. You can configure the LAN link as a backup HA link in case the AUX port is disconnected. If the AUX link goes down, you can use LAN-side connectivity to run the HA heartbeat, configure replication, and perform additional synchronization functions to avoid a split-brain HA condition.
With a standby LAN HA link configured, when the AUX link fails, the HA traffic is switched to the LAN-side. The AUX port is still the primary HA link, so that when the AUX link comes back up, traffic is switched back to the AUX link. The standby LAN HA link also provides connectivity via the LAN to SCM when a SteelHead SD appliance does not have an internet uplink.
The standby HA LAN link:
Must be part of a switched domain (that is, the master and backup HA appliances must be connected via a switch over the LAN).
Requires a loopback IP address configured on the master and backup HA appliances. The loopback IP address must be unique across the organization.
The loopback IP address must contain the zones belonging to the current site. It must be a /32 address and should not have a physical port attached to it.
Standby LAN HA link
When the AUX link is offline, all the HA traffic is switched to the LAN link. When the master wants to use the backup appliance uplinks, Generic Routing Encapsulation (GRE) tunnels the packet via the LAN link. When the backup appliance has a packet to send to the master appliance, it uses the LAN link for GRE encapsulated packets. When the AUX link comes back up, any further HA traffic uses the AUX link.
If a standby HA link is configured, a firmware download may fail if the AUX port or primary HA link goes down.
Configuring the loopback IP address
Configuring a standby LAN HA link requires that you configure a loopback IP address.
To configure a loopback IP address
1. Create a /32 zone belonging to the site. Do not attach any physical port to it. (The /32 zone will appear under the Routing Loopback zone drop-down list.) For details on creating zones for sites, see the SteelConnect Manager User Guide.
2. Choose Appliances and select the appliance.
3. Select the Routing tab.
Configuring loopback IP
4. Select the loopback zone from the drop-down list. All the zones associated with the appliance are listed.
5. Specify the loopback IP address for the specified zone. The loopback IP address should not be the same as the zone IP address.
6. Click Submit.
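Before entering the values in SCM, the loopback constraints from this section can be sanity-checked offline. The sketch below uses Python's standard ipaddress module; the function name, its parameters, and the sample addresses are assumptions for illustration, and SCM may apply further validation that is not modeled here.

```python
import ipaddress


def validate_loopback(zone_cidr, zone_ip, loopback_ip, loopbacks_in_use):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    zone = ipaddress.ip_network(zone_cidr, strict=False)
    loopback = ipaddress.ip_address(loopback_ip)

    # The loopback zone must be a /32.
    if zone.prefixlen != 32:
        problems.append("loopback zone must be a /32")
    # The loopback IP should not be the same as the zone IP address.
    if loopback == ipaddress.ip_address(zone_ip):
        problems.append("loopback IP must differ from the zone IP address")
    # The loopback IP must be unique among the loopbacks already assigned.
    if loopback in {ipaddress.ip_address(ip) for ip in loopbacks_in_use}:
        problems.append("loopback IP is already in use")
    return problems


# Sample values are made up for illustration.
print(validate_loopback("10.255.0.1/32", "10.255.0.1", "10.255.0.2", {"10.255.0.9"}))
print(validate_loopback("10.255.0.0/24", "10.255.0.1", "10.255.0.1", {"10.255.0.1"}))
```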
Configuring the standby LAN HA link
After you have configured the loopback IP address, you must specify a LAN link.
To configure the standby LAN HA link
1. Choose Appliances and select the appliance.
2. Select the HA tab.
Configuring the standby LAN HA link
3. Under Standby HA link configuration, select the standby LAN link from the drop-down list.
4. Click Submit.
After you submit your request, it is cascaded to the other HA appliance.
Monitoring a high-availability pair
SCM displays all appliances belonging to a high-availability pair with a blue HA icon in all views. After the appliance reports its HA state to SCM, the icon indicates whether it is the master or the backup.
When an HA appliance pair loses connectivity, Appliances and Health Check display both the master and backup appliance as HA Master. For SteelHead SD appliances, SCM will not display Offline for an appliance unless the appliance actually goes offline.
Uplink tracking and LAN port tracking are not available on SteelHead SD.
SCM manages both appliances in a pair as one. For example, under Appliances > Ports, if you view the ports for an HA pair, they appear together.
HA pair ports
To view appliance health of an HA pair
1. Choose Health Check > Appliance Health.
Appliance health in an HA pair
2. Select the appliance to expand the page.
Viewing HA pair health details
3. Under Hardware, click the plus sign to the left of High Availability to view:
current status
partner status
HA IP address
partner connectivity status
time since last failover
alive uplink count
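If you record these values for both appliances, a lost HA connection shows up as both appliances reporting the master role, as noted earlier in this section. The following sketch models that check. The HaHealth structure, its field names, and the status strings are hypothetical and are not an SCM export format.

```python
from dataclasses import dataclass


@dataclass
class HaHealth:                  # fields mirror the list above; the structure is hypothetical
    current_status: str          # assumed values: "master" or "backup"
    partner_status: str
    ha_ip_address: str
    partner_connected: bool
    seconds_since_failover: int
    alive_uplink_count: int


def pair_lost_connectivity(first, second):
    """Both appliances reporting master suggests the pair has lost HA connectivity."""
    return first.current_status == "master" and second.current_status == "master"


# Example values are made up for illustration.
healthy_a = HaHealth("master", "backup", "169.254.0.1", True, 3600, 2)
healthy_b = HaHealth("backup", "master", "169.254.0.2", True, 3600, 2)
print(pair_lost_connectivity(healthy_a, healthy_b))  # False

isolated_b = HaHealth("master", "unknown", "169.254.0.2", False, 5, 2)
print(pair_lost_connectivity(healthy_a, isolated_b))  # True
```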
Troubleshooting
Make sure the roles are displayed correctly on the appliances in the Appliances > Overview page.
All the tunnels must be up and should be using the uplinks for both the HA appliances.
If the appliance HA role is Unknown or if the appliance pair is listed as Master/Master, make sure the AUX port (that is, the dedicated HA port) is enabled and configured in HA mode. If the AUX port is configured and enabled, then collect a system dump from the appliances and contact Riverbed Support at https://support.riverbed.com.
The HA role is established with a daemon named keepalived. Search the logs for “keepalived” to debug HA issues.
Some useful CLI commands for analyzing HA issues are:
show keepalived_resources
show ha_info
For details on the SteelHead SD CLI, see Using the CLI on SteelHead SD appliances in the SteelConnect Manager User Guide.
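When working with logs that have been copied off the appliance, the search for “keepalived” can be done with a short script. This is a generic text filter: the function name is hypothetical, and the sample lines are placeholders rather than real appliance log output.

```python
import re


def keepalived_lines(log_text):
    """Return the lines that mention keepalived, ignoring case."""
    pattern = re.compile(r"keepalived", re.IGNORECASE)
    return [line for line in log_text.splitlines() if pattern.search(line)]


# Placeholder text only; real log lines will look different.
sample = "\n".join([
    "entry 1: placeholder line mentioning keepalived",
    "entry 2: unrelated placeholder line",
    "entry 3: placeholder line mentioning Keepalived again",
])

for line in keepalived_lines(sample):
    print(line)
```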