How SteelHeads Optimize Data
The SteelHead optimizes data in the following ways:
  • Data Streamlining
  • Transport Streamlining
  • Application Streamlining
  • Management Streamlining
    The causes for slow throughput in WANs are well known: high delay (round-trip time or latency), limited bandwidth, and chatty application protocols. Large enterprises spend a significant portion of their information technology budgets on storage and networks, much of it to compensate for slow throughput by deploying redundant servers, storage, and the required backup equipment. SteelHeads enable you to consolidate and centralize key IT resources to save money, simplify key business processes, and improve productivity.
    RiOS is the software that powers the SteelHead and SteelCentral Controller for SteelHead Mobile. With RiOS, you can solve a range of problems affecting WANs and application performance, including:
  • insufficient WAN bandwidth.
  • inefficient transport protocols in high-latency environments.
  • inefficient application protocols in high-latency environments.
    RiOS intercepts client/server connections without interfering with normal client/server interactions, file semantics, or protocols. All client requests are passed through to the server normally, although relevant traffic is optimized to improve performance.
    Data Streamlining
    With data streamlining, SteelHeads and SteelCentral Controller for SteelHead Mobile can reduce WAN bandwidth utilization by 65 to 98 percent for TCP-based applications. This section includes the following topics:
  • Scalable Data Referencing
  • Bidirectional Synchronized RiOS Data Store
  • Unified RiOS Data Store
    Scalable Data Referencing
    In addition to traditional techniques like data compression, RiOS also uses a Riverbed proprietary algorithm called Scalable Data Referencing (SDR). RiOS SDR breaks up TCP data streams into unique data chunks that are stored on the hard disks (RiOS data store) of the device running RiOS (a SteelHead or SteelCentral Controller for SteelHead Mobile host system). Each data chunk is assigned a unique integer label (reference) before it is sent to a peer RiOS device across the WAN. When the same byte sequence occurs in future transmissions from clients or servers, the reference is sent across the WAN instead of the raw data chunk. The peer RiOS device (a SteelHead or SteelCentral Controller for SteelHead Mobile host system) uses this reference to find the original data chunk on its RiOS data store and reconstruct the original TCP data stream.
    Files and other data structures can be accelerated by data streamlining even when they are transferred using different applications. For example, a file that is initially transferred through CIFS is accelerated when it is transferred again through FTP.
    Applications that encode data in a different format when they transmit over the WAN can also be accelerated by data streamlining. For example, Microsoft Exchange uses the MAPI protocol to encode file attachments prior to sending them to Microsoft Outlook clients. As a part of its MAPI-specific optimized connections, RiOS decodes the data before applying SDR. This decoding enables the SteelHead to recognize byte sequences in file attachments in their native form when the file is subsequently transferred through FTP or copied to a CIFS file share.
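    The reference-substitution idea behind SDR can be sketched in a few lines of Python. This is an illustrative toy, not Riverbed's proprietary algorithm: the fixed 8-byte chunking and sequential integer labels are assumptions made for the example (real SDR chunking is more sophisticated).

```python
# Toy sketch of reference-based deduplication: repeated chunks are sent
# as small integer references instead of raw data.
CHUNK_SIZE = 8  # arbitrary example size; not the real SDR chunk size

def encode(stream: bytes, store: dict) -> list:
    """Sender side: replace chunks already in the data store with references."""
    out = []
    for i in range(0, len(stream), CHUNK_SIZE):
        chunk = stream[i:i + CHUNK_SIZE]
        if chunk in store:
            out.append(store[chunk])      # send the small reference
        else:
            store[chunk] = len(store)     # assign a new label
            out.append(chunk)             # send raw data once
    return out

def decode(tokens: list, store: dict) -> bytes:
    """Peer side: learn new chunks, resolve references from its own store."""
    by_ref = {ref: chunk for chunk, ref in store.items()}
    parts = []
    for t in tokens:
        if isinstance(t, int):
            chunk = by_ref[t]             # reconstruct from local data store
        else:
            chunk = t
            ref = len(store)              # labels stay in sync with the sender
            store[chunk] = ref
            by_ref[ref] = chunk
        parts.append(chunk)
    return b"".join(parts)

# A stream with repeated chunks crosses the "WAN" once as data, then as refs
sender_store, receiver_store = {}, {}
tokens = encode(b"ABCDEFGH" * 4, sender_store)
assert decode(tokens, receiver_store) == b"ABCDEFGH" * 4
```

    Because both sides assign labels in arrival order, the stores stay synchronized without transmitting the mapping itself, which mirrors why peer data stores must not diverge.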
    Bidirectional Synchronized RiOS Data Store
    Data and references are maintained in persistent storage in the data store within each RiOS device and are stable across reboots and upgrades. To provide further longevity and safety, local SteelHead pairs optionally keep their data stores fully synchronized bidirectionally at all times. Bidirectional synchronization ensures that the failure of a single SteelHead does not force remote SteelHeads to send previously transmitted data chunks. This feature is especially useful when the local SteelHeads are deployed in a network cluster, such as a primary and backup deployment, a serial cluster, or a WCCP cluster.
    For information about primary and backup deployments, see Redundancy and Clustering. For information about serial cluster deployments, see Serial Cluster Deployments. For information about WCCP deployments, see WCCP Virtual In-Path Deployments.
    Unified RiOS Data Store
    A key Riverbed innovation is the unified data store that data streamlining uses to reduce bandwidth usage. After a data pattern is stored on the disk of a SteelHead or Mobile Controller peer, it can be leveraged for transfers to any other SteelHead or Mobile Controller peer, across all accelerated applications. Data is not duplicated within the RiOS data store, even if it is used in different applications, in different data transfer directions, or with new peers. The unified data store ensures that RiOS uses its disk space as efficiently as possible, even with thousands of remote SteelHeads or Mobile Controller peers.
    Transport Streamlining
    SteelHeads use a generic latency optimization technique called transport streamlining. This section includes the following topics:
  • Overview of Transport Streamlining
  • Connection Pooling
  • TCP Automatic Detection
  • SteelCentral Controller for SteelHead Mobile TCP Transport Modes
  • Tuning SteelHeads for High-Latency Links
  • TCP Algorithm Selection
  • WAN Buffers
    You can find additional information about the transport streamlining modes in QoS Configuration and Integration and Satellite Optimization.
    Overview of Transport Streamlining
    TCP connections suffer performance degradation due to delay, loss, and other factors. Much has been written about how to choose the client and server TCP settings and TCP algorithms most appropriate for various environments. For example, without proper tuning, a TCP connection might never be able to fill the available bandwidth between two locations. You must consider the TCP window sizes used during the lifespan of a connection: if the TCP window size is not large enough, the sender cannot consume the available bandwidth. You must also consider packet loss due to congestion or link quality.
     
    In many cases, packet loss is an indication to a TCP congestion avoidance algorithm that there is congestion, and congestion is a signal to the sender to slow down. The sender then can choose at which rate to slow down. The sender can:
  • undergo a multiplicative decrease: for example, send at one half the previous rate.
  • use other calculations to determine a slightly lower rate, just before the point at which congestion occurred.
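    As a toy illustration (not SteelHead code), the two backoff choices above differ only in the multiplier applied to the sending rate when a loss event occurs. The 0.75 factor is an arbitrary example value, not taken from any RFC:

```python
def backoff(rate_bps: float, gentle: bool = False) -> float:
    """Reduce the sending rate in response to a loss event.

    Classic multiplicative decrease halves the rate; a gentler algorithm
    drops only slightly, to just below the rate where congestion occurred.
    The 0.75 multiplier is an illustrative assumption.
    """
    if gentle:
        return rate_bps * 0.75   # back off only slightly
    return rate_bps / 2          # multiplicative decrease: halve the rate

# Example: a 10 Mbps sender reacting to one loss event
assert backoff(10_000_000) == 5_000_000
assert backoff(10_000_000, gentle=True) == 7_500_000
```

    The choice of multiplier is exactly the fairness-versus-aggressiveness trade-off discussed later in this section.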
    A SteelHead deployed on the network can automate much of the manual analysis, research, and tuning necessary to achieve optimal performance, while providing you with options to fine-tune. Collectively, these settings in the SteelHead are referred to as transport streamlining. The objective of transport streamlining is to mitigate the effects of WANs between client and server. Transport streamlining uses a set of standards-based and proprietary techniques to optimize TCP traffic between SteelHeads. These techniques:
  • ensure that efficient retransmission methods are used (such as TCP selective acknowledgments).
  • negotiate optimal TCP window sizes to minimize the impact of latency on throughput.
  • maximize throughput across a wide range of WAN links.
    Additionally, a goal for selecting any TCP setting and congestion avoidance algorithm, and for using WAN optimization appliances, is to find a balance between two extremes: acting fairly and cooperatively by sharing available bandwidth with coexisting flows at one end of the spectrum, or acting aggressively by trying to achieve maximum throughput at the expense of other flows at the opposite end. At the former extreme, throughput suffers; at the latter, your network is susceptible to congestion collapse.
    By default, SteelHeads use standard TCP (as defined in RFC 793) to communicate between peers. Standard TCP is a loss-based algorithm that uses packet loss to calculate the effective throughput for any given connection. Alternatively, you can configure SteelHeads to use a delay-based algorithm called bandwidth estimation, which calculates the rate based on the delay of the link so that connections recover more gracefully in the presence of packet loss.
    In higher-throughput environments you can enable high-speed TCP (HS-TCP), which is a high-speed loss-based algorithm (as defined in RFC 3649) on the SteelHeads to achieve high throughput for links with high bandwidth and high latency. This TCP algorithm shifts toward the more aggressive side of the spectrum. Furthermore, you can shift even further toward the aggressive side of the spectrum, sacrificing fairness, by selecting the maximum TCP (MX-TCP) feature for traffic that you want to transmit over the WAN at a rate defined by the QoS class.
    Configuring MX-TCP through the QoS settings leverages QoS features to help protect other traffic and gives you parameters, such as minimum and maximum percentages of the available bandwidth, that TCP connections matching the class can consume. Although not appropriate for all environments, MX-TCP can maintain data transfer throughput when adverse network conditions, such as abnormally high packet loss, would otherwise impair performance. Data transfer is maintained without inserting error correction packets over the WAN through forward error correction (FEC). MX-TCP effectively handles packet loss without the decrease in throughput typically experienced with TCP.
    TCP algorithms that rely on loss or delay calculations to determine throughput need an appropriately sized buffer. You can configure the buffer size and choose the TCP algorithm in the Transport Settings page. The default buffer is 262,140 bytes, which should cover any connection of 20 Mbps or less with a round-trip delay up to 100 ms. This combination of connection speed and round-trip delay covers most branch office environments connecting to a data center or hub site.
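    A quick bandwidth-delay calculation (an informal check, not from the product documentation) shows why the 262,140-byte default covers that case:

```python
# BDP = bandwidth x RTT, converted from bits to bytes. A buffer at least
# this large lets a single connection keep the link full.
bandwidth_bps = 20_000_000                         # 20 Mbps link
rtt_ms = 100                                       # 100 ms round trip
bdp_bytes = bandwidth_bps * rtt_ms // 1000 // 8    # bits in flight -> bytes
print(bdp_bytes)                                   # 250000
assert bdp_bytes <= 262_140                        # fits the default buffer
```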
     
    The following list is a high-level summary of each SteelHead TCP congestion avoidance algorithm:
  • Standard TCP - Standard TCP is a standards-based implementation of TCP and is the default setting in the SteelHead. Standard TCP is a WAN-friendly TCP stack and is not aggressive towards other traffic. Additionally, standard TCP benefits from the higher TCP WAN buffers, which are used by default for each connection between SteelHeads.
  • Bandwidth estimation - Bandwidth estimation is the delay-based algorithm that incorporates many of the features of standard TCP and includes calculation of RTT and bytes acknowledged. This additional calculation avoids the multiplicative decrease in rate seen in other TCP algorithms in the presence of packet loss. Bandwidth estimation is also appropriate for environments in which there is variable bandwidth and delay.
  • HighSpeed TCP (HS-TCP) - HS-TCP is efficient in long fat networks (LFNs) in which you have large WAN circuits (50 Mbps and above) over long distances. Typically, you use HS-TCP when you have a few long-lived replicated or backup flows. HS-TCP is designed for high-bandwidth and high-delay networks that have a low rate of packet loss due to corruption (bit errors). HS-TCP has a few advantages over standard TCP for LFNs. Standard TCP backs off (slows the transmission rate) in the presence of packet loss, causing connections to underuse the bandwidth. Also, standard TCP is not aggressive enough during the TCP slow-start period to rapidly grow to the available bandwidth. HS-TCP uses a combination of calculations to rapidly fill the link and minimize backoff in the presence of loss. These techniques are documented in RFC 3649. HS-TCP is not beneficial for satellite links because the TCP congestion window recovery requires too many round trips or is too slow. HS-TCP requires that you adjust WAN buffers on the SteelHeads to be equal to 2 x BDP, where the bandwidth-delay product (BDP) is the product of the WAN bandwidth and round-trip latency between locations. For more specific settings, see Storage Area Network Replication.
  • SkipWare Space Communications Protocol Standards (SCPS) per connection - SCPS per connection is for satellite links with little or no packet drops due to corruption. SCPS per connection requires a separate license. For more details, see SCPS Per Connection.
  • SCPS error tolerance - SCPS error tolerance is for satellite links that have packet drops due to corruption. You must have a separate license to activate SCPS error tolerance. For more details, see SCPS Error Tolerance.
    This is a high-level summary of additional modes that alter the SteelHead TCP congestion avoidance algorithm:
  • MX-TCP - MX-TCP is ideal for dedicated links, or to compensate for poor link quality (propagation issues, noise, and so on) or packet drops due to network congestion. The objective of MX-TCP is to achieve maximum TCP throughput. MX-TCP alters TCP by disabling the congestion control algorithm and sending traffic up to a rate you configure, regardless of link conditions. Additionally, MX-TCP can share any excess bandwidth with other QoS classes through adaptive MX-TCP. MX-TCP requires knowledge of the amount of bandwidth available for a given QoS class because, provided that enough traffic matches the QoS class, connections using MX-TCP attempt to consume the bandwidth allotted without regard to any other traffic. For more details, see MX-TCP and MX-TCP Settings.
  • Rate pacing - Rate pacing is a combination of MX-TCP and a TCP congestion avoidance algorithm. You use rate pacing commonly in satellite environments, but you can use it in terrestrial connections as well. The combination of MX-TCP and a TCP congestion avoidance algorithm allows rate pacing to take the best from both features. Rate pacing leverages the rate configured for an MX-TCP QoS class to minimize buffer delays, but can adjust to the presence of loss due to network congestion. For more details, see Configuring Rate Pacing.
    For additional information about transport streamlining mode options, see the following:
  • Connection Pooling
  • TCP Automatic Detection
  • Connection Pooling
    Connection pooling adds a benefit to transport streamlining by minimizing the time required to set up an optimized connection.
    Some application protocols, such as HTTP, use many rapidly created, short-lived TCP connections. To optimize these protocols, SteelHeads create pools of idle TCP connections. When a client tries to create a new connection to a previously visited server, the SteelHead uses a TCP connection from its pool of connections. Thus the client and the SteelHead do not have to wait for a three-way TCP handshake to finish across the WAN. This feature is called connection pooling. Connection pooling is available only for connections using the correct addressing WAN visibility mode.
    Transport streamlining ensures that there is always a one-to-one ratio for active TCP connections between SteelHeads and the TCP connections to clients and servers. Regardless of the WAN visibility mode in use, SteelHeads do not tunnel or perform multiplexing and demultiplexing of data across connections.
    For information about correct addressing modes, see WAN Visibility Modes. For information about HTTP optimization, see the SteelHead Deployment Guide - Protocols.
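    The reuse pattern described above can be sketched as follows. The class name, pool policy, and handshake counter are illustrative assumptions for the example; RiOS internals differ:

```python
import collections

class ConnectionPool:
    """Keep idle, pre-established connections per server so a new client
    request can skip the WAN three-way handshake (illustrative sketch)."""

    def __init__(self):
        self._idle = collections.defaultdict(list)
        self.handshakes = 0  # full WAN handshakes performed so far

    def _handshake(self, server):
        # Stand-in for a real TCP three-way handshake across the WAN.
        self.handshakes += 1
        return ("conn", server)

    def acquire(self, server):
        """Reuse an idle connection if one exists; otherwise open a new one."""
        if self._idle[server]:
            return self._idle[server].pop()
        return self._handshake(server)

    def release(self, server, conn):
        """Return a finished connection to the idle pool for reuse."""
        self._idle[server].append(conn)

pool = ConnectionPool()
c = pool.acquire("server-a")    # first request pays the handshake
pool.release("server-a", c)
c2 = pool.acquire("server-a")   # reused: no second WAN handshake
assert pool.handshakes == 1
```

    For short-lived protocols such as HTTP, avoiding that one round trip per connection is where the latency savings comes from.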
    TCP Automatic Detection
    One best practice you can consider for nearly every deployment is the TCP automatic detection feature on the data center SteelHeads. This feature allows the data center SteelHead to reflect the TCP algorithm in use by the peer. The benefit is that you can select the appropriate TCP algorithm for the remote branch office, and the data center SteelHead uses that TCP algorithm for connections. If SteelHeads on both sides of an optimized connection use the automatic detection feature, then standard TCP is used.
    SteelCentral Controller for SteelHead Mobile TCP Transport Modes
    This section briefly describes specific transport streamlining modes that operate with SteelCentral Controller for SteelHead Mobile.
    HS-TCP is not the best choice for interoperating in a SteelCentral Controller for SteelHead Mobile environment because it is designed for LFNs (high bandwidth and high delay). Essentially, the throughput is about equal to standard TCP.
    MX-TCP is a sender-side modification (configured on the server side) and is used to send data at a specified rate. When SteelCentral Controller for SteelHead Mobile is functioning on the receiving side, it can be difficult to deploy MX-TCP on the server side: defining a sending rate is impractical because the bandwidth a client can receive on a mobile device is unknown and variable.
    Tuning SteelHeads for High-Latency Links
    Riverbed recommends that you gather WAN delay (commonly expressed as RTT), packet-loss rates, and link bandwidth to better understand the WAN characteristics so that you can make adjustments to the default transport streamlining settings. Also, understanding the types of workloads (long-lived, high-throughput, client-to-server traffic, mobile, and so on) is valuable information for you to appropriately select the best transport streamlining settings.
    Specific settings for high-speed data replication are covered in Storage Area Network Replication. The settings described in this chapter indicate when you can adjust the transport streamlining settings to improve throughput.
    TCP Algorithm Selection
    The default SteelHead settings are appropriate in most deployment environments. Based on RTT, bandwidth, and loss, you can optionally choose different transport streamlining settings. A solid approach to selecting the TCP algorithm found on the Transport Settings page is to use the automatic detection feature (auto-detect) on the data center SteelHead. The benefit to automatic detection is that the data center SteelHead reflects the choice of TCP algorithm in use at the remote site. You select the TCP algorithm at the remote site based on WAN bandwidth, RTT, and loss.
    A general guideline is that any connection over 50 Mbps can benefit from using HS-TCP, unless the connection is over satellite (delay greater than 500 ms). You can use MX-TCP for high data rates if the end-to-end bandwidth is known and dedicated.
    When you are factoring in loss at lower-speed circuits, consider using bandwidth estimation when packet loss is greater than 0.1 percent. Typically, MPLS networks are below 0.1 percent packet loss, while other communication networks can be higher. For any satellite connection, the appropriate choices are SCPS (if licensed) or bandwidth estimation.
    For specific implementation details, see Satellite Optimization.
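    One way to read the guidelines above is as a simple decision procedure. The helper below is a hypothetical illustration of that logic, not a Riverbed API; the thresholds (50 Mbps, 500 ms, 0.1 percent loss) come from the text, and the precedence between rules is an assumption:

```python
def pick_tcp_algorithm(bw_mbps: float, rtt_ms: float, loss_pct: float,
                       scps_licensed: bool = False,
                       dedicated_bw: bool = False) -> str:
    """Hypothetical helper encoding the selection guidelines above."""
    if rtt_ms > 500:                  # satellite-class delay
        return "SCPS" if scps_licensed else "bandwidth-estimation"
    if dedicated_bw:                  # known, dedicated end-to-end bandwidth
        return "MX-TCP"
    if bw_mbps > 50:                  # long fat network
        return "HS-TCP"
    if loss_pct > 0.1:                # lossier than a typical MPLS circuit
        return "bandwidth-estimation"
    return "standard"

assert pick_tcp_algorithm(100, 80, 0.01) == "HS-TCP"
assert pick_tcp_algorithm(10, 600, 0.5) == "bandwidth-estimation"
assert pick_tcp_algorithm(10, 600, 0.5, scps_licensed=True) == "SCPS"
assert pick_tcp_algorithm(10, 40, 0.5) == "bandwidth-estimation"
```

    In practice you would configure the chosen algorithm at the remote site and enable auto-detect at the data center, as described under TCP Automatic Detection.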
    WAN Buffers
    After you select the TCP algorithm, another setting to consider is the WAN-send and WAN-receive buffers. You can use bandwidth and RTT to determine the BDP. BDP is the product of bandwidth and RTT, commonly divided by 8 and expressed in bytes. To get better performance, the SteelHead, acting as a TCP proxy, typically uses two times the BDP as its WAN-send and WAN-receive buffer. For asymmetric links, you can have the WAN-send buffer reflect the bandwidth and delay in the transmit direction, while the WAN-receive buffer reflects the bandwidth and delay in the receive direction. Note that you do not have to adjust the buffer settings unless there is a relatively small number of connections and you want to consume most or all of the available WAN bandwidth.
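    Under that sizing rule (2 x BDP, computed per direction for asymmetric links), the buffer arithmetic is straightforward. The 45 Mbps / 10 Mbps / 120 ms figures are example inputs, not recommended values:

```python
def wan_buffer_bytes(bw_mbps: int, rtt_ms: int) -> int:
    """2 x BDP in bytes: bandwidth x RTT gives bits in flight; divide by 8."""
    bdp_bytes = bw_mbps * 1_000_000 * rtt_ms // 1000 // 8
    return 2 * bdp_bytes

# Example asymmetric link: 45 Mbps receive / 10 Mbps transmit, 120 ms RTT
recv_buf = wan_buffer_bytes(45, 120)   # WAN-receive buffer: 1,350,000 bytes
send_buf = wan_buffer_bytes(10, 120)   # WAN-send buffer: 300,000 bytes
assert (recv_buf, send_buf) == (1_350_000, 300_000)
```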
    Application Streamlining
    You can apply application-specific optimization for specific application protocols. For SteelHeads using RiOS v6.0 or later, application streamlining includes the following (not supported on SteelHead DX):
  • CIFS for Windows and Mac clients (Windows file sharing, backup and replication, and other Windows-based applications).
  • MAPI, including encrypted email (Microsoft Exchange v5.5, 2000, 2003, and 2007).
  • NFS v3 for UNIX file sharing.
  • TDS for Microsoft SQL Server.
  • HTTP.
  • HTTPS and SSL.
  • IMAP-over-SSL.
  • Oracle 9i, which comes with Oracle Applications 11i.
  • Oracle 10gR2, which comes with Oracle E-Business Suite R12.
  • Lotus Notes v6.0 or later.
  • Encrypted Lotus Notes.
  • ICA Client Drive Mapping.
  • Multi-stream ICA and multi-port ICA.
    Protocol-specific optimization reduces the number of round trips over the WAN for common actions, and helps work through data obfuscation and encryption, when:
  • opening and editing documents on remote file servers (CIFS).
  • sending and receiving attachments (MAPI and Lotus Notes).
  • viewing remote intranet sites (HTTP).
  • securely performing RiOS SDR for SSL-encrypted transmissions (HTTPS).
    For more information about application streamlining, see the SteelHead Deployment Guide - Protocols.
    SteelHead DX is designed for data center-to-data center workflows and only supports storage blades and FTP. SteelHead DX does not support the CIFS, MAPI, HTTP, and SSL blades.
    Management Streamlining
    Developed by Riverbed, management streamlining simplifies the deployment and management of RiOS devices. This includes both hardware and software:
  • Auto-Discovery Protocol - Auto-discovery enables SteelHeads and SteelCentral Controller for SteelHead Mobile to automatically find remote SteelHeads and begin to optimize traffic. Auto-discovery avoids the requirement of having to define lengthy and complex network configurations on SteelHeads. The auto-discovery process enables administrators to:
  • control and secure connections.
  • specify which traffic is to be optimized.
  • specify peers for optimization.
    For more information, see Auto-Discovery Protocol.
    Because SteelHead DX uses a different fingerprinting algorithm, auto-discovery is enforced between SteelHead DX peers, but not between a SteelHead DX and other SteelHead models.
  • SCC - The SCC enables new, remote SteelHeads to be automatically configured and monitored. It also provides a single view of the overall optimization benefit and health of the SteelHead network.
  • Mobile Controller - The Mobile Controller is the management appliance that you use to track the individual health and performance of each deployed software client and to manage enterprise client licensing. The Mobile Controller enables you to see who is connected, view their data reduction statistics, and perform support operations such as resetting connections, pulling logs, and automatically generating traces for troubleshooting. You can perform all of these management tasks without end-user input.
    For more information, see SteelCentral Controller for SteelHead Mobile Deployments.