Edge best practices
This section describes best practices for deploying the Edge.
Segregate traffic
At the remote branch office, separate storage traffic and WAN/Rdisk traffic from LAN traffic. This practice helps to increase overall security, minimize congestion, minimize latency, and simplify the overall configuration of your storage infrastructure.
Pin the LUN and prepopulate the blockstore
In specific circumstances, you should pin the LUN and prepopulate the blockstore. Additionally, you can have the write-reserve space resized accordingly; by default, the Edge has a write-reserve space that is 10 percent of the blockstore.
To resize the write-reserve space, contact your Riverbed representative.
We recommend that you pin the LUN in the following circumstances:
• Unoptimized file systems—Core supports intelligent prefetch optimization on NTFS and VMFS file systems. For unoptimized file systems such as FAT, FAT32, and ext3, Core cannot perform optimization techniques such as prediction and prefetch in the same way as it does for NTFS and VMFS. For best results, pin the LUN and prepopulate the blockstore.
• Database applications—If the LUN contains database applications that use raw disk file formats or proprietary file systems, pin the LUN and prepopulate the blockstore.
• WAN outages are likely or common—Ordinary operation of the product depends on WAN connectivity between the branch office and the data center. If WAN outages are likely or common, pin the LUN and prepopulate the blockstore. This recommendation applies both to LUNs that contain data and to those that contain an operating system.
Segregate data onto multiple LUNs
We recommend that you separate storage into three LUNs, as follows:
• Operating system—In case of recovery, the operating system LUN can be quickly restored from the Windows installation disk or ESX datastore, depending on the type of server used in the deployment.
• Production data—The production data LUN is hosted on the Edge and therefore safely backed up at the data center.
• Swap space—Data on the swap space LUN is transient and therefore not required in disaster recovery. We recommend that you use this LUN as an Edge local LUN.
Ports and type of traffic
Allow iSCSI traffic only on the primary and auxiliary interfaces. We do not recommend that you configure your external iSCSI initiators to use the IP address configured on the in-path interface. Some appliance models can optionally support an additional NIC to provide extra network interfaces. You can also configure these interfaces to provide iSCSI connectivity.
iSCSI port bindings
If iSCSI port bindings are enabled on the on-board hypervisor of the Edge appliance, an Edge HA failover operation can take too long and time out. iSCSI port bindings are disabled by default. If they have been enabled, we recommend that you remove them, because the internal interconnect interfaces are all on different network segments.
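If you need to check for and remove bindings manually, you can do so from the ESXi shell; this is a minimal sketch, and the adapter (vmhba33) and VMkernel NIC (vmk1) names are examples that vary by system:
# List the current iSCSI port bindings on the software iSCSI adapter
esxcli iscsi networkportal list
# Remove a binding (repeat for each bound VMkernel NIC)
esxcli iscsi networkportal remove --adapter=vmhba33 --nic=vmk1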
Changing IP addresses on the Edge, ESXi host, and servers
When you have an Edge and ESXi running on the same converged platform, you must change IP addresses in a specific order to keep the task simple and fast. You can use this procedure when staging the Edges in the data center or moving them from one site to another.
This procedure assumes that the Edges are configured with IP addresses in a staged or production environment. You must test and verify all ESXi, servers, and interfaces before making these changes.
1. Starting with the Windows server, use your vSphere client to connect to the console, log in, change the IP address to DHCP or to the new destination IP address, and then shut down the Windows server from the console.
2. To change the IP address of the ESXi host, the procedure is different depending on whether the Edge is a SteelHead EX or an Edge appliance.
On a SteelHead EX, use a virtual network computing (VNC) client to connect to the ESXi console, change the IP address to the new destination IP address, and shut down ESXi from the console. If you did not configure VNC during the ESXi installation wizard, you can instead use the vSphere Client and change the address from Configuration > Networking > rvbd_vswitch_pri > Properties.
On an Edge appliance, connect to the ESXi console serial port, or run the hypervisor console command at the RiOS command line to display the ESXi console on the screen and change the IP address.
3. On the Edge Management Console, choose Networking > Networking: In-Path Interfaces, and then change the IP address for inpath0_0 to the new destination IP address.
4. Use the included console cable to connect to the console port on the back of the Edge appliance and log in as the administrator.
5. Enter the following commands to change the IP address to your new destination IP address.
enable
config terminal
interface primary ip address 1.7.7.7 /24
ip default-gateway 1.7.7.1
write memory
6. Enter the reload halt command to shut down the appliance.
7. Move the Edge appliance to the new location.
8. Start your Windows server at the new location and open the iSCSI Initiator.
– Select the Discovery tab and remove the old portal.
– Click OK.
– Open the tab again and select Discover Portal.
– Add the new Edge appliance primary IP address.
This process brings the original data disk back into service.
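On Windows Server 2012 and later, step 8 can also be scripted with the built-in iSCSI cmdlets in PowerShell instead of using the iSCSI Initiator control panel; a sketch, with example portal addresses:
# Remove the old discovery portal and register the new Edge primary IP
Remove-IscsiTargetPortal -TargetPortalAddress 1.7.6.7 -Confirm:$false
New-IscsiTargetPortal -TargetPortalAddress 1.7.7.7
# Reconnect to the targets discovered through the new portal
Get-IscsiTarget | Connect-IscsiTarget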
Disk management
You can specify the size of the local LUN during the hypervisor installation on the Edge. The installation wizard allows flexible disk partitioning: you can specify either a percentage of the disk or the exact amount in gigabytes that you want to use for the local LUN. The rest of the disk space is allocated to the Edge blockstore. To streamline the ESXi configuration, run the hypervisor installer before connecting the Edge appliance to the Core to set up local storage. If local storage is configured during the hypervisor installation, all LUNs provisioned by the Core to the Edge are automatically made available to the ESXi of the Edge.
Rdisk traffic routing options
You can route Rdisk traffic out of the primary interface or the in-path interface.
We recommend that you select the primary interface when you deploy the Edge appliance. When you configure Edge to use the primary interface, the Rdisk traffic is sent unoptimized out of the primary interface to a switch or a router that in turn redirects the traffic back into the LAN interface of the Edge RiOS node to get optimized. The traffic is then sent out of the WAN interface toward the Core deployed at the data center. This configuration offers more redundancy because you can have both in-path interfaces connected to different switches.
Select the in-path interface only in specific cases. When you configure the Edge to use the in-path interface, the Rdisk traffic is intercepted, optimized, and sent directly out of the WAN interface toward the Core deployed at the data center. Use this option during proof of concept (POC) installations or if the primary interface is dedicated to management. The drawback of this mode is the lack of redundancy in the event of a WAN interface failure. In this configuration, only the WAN interface needs to be connected. Disable link state propagation.
Deploying the product with third-party traffic optimization
The Edges and Cores communicate with each other and transfer data-blocks over the WAN using six different TCP port numbers: 7950, 7951, 7952, 7953, 7954, and 7970.
Consider a deployment in which the remote branch and data center third-party optimization appliances are configured through WCCP. You can optionally configure WCCP redirect lists on the router to redirect traffic belonging to the six different TCP ports of the product to the SteelHeads. Configure a fixed-target rule for the six different TCP ports of the product to the in-path interface of the data center SteelHead.
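For example, on a Cisco IOS router the redirect list might look like the following sketch; the access-list number and WCCP service groups are illustrative:
! Match the six product TCP ports
access-list 123 permit tcp any any range 7950 7954
access-list 123 permit tcp any any eq 7970
! Redirect only matching traffic to the SteelHeads
ip wccp 61 redirect-list 123
ip wccp 62 redirect-list 123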
Windows and ESX server storage layout—product-protected LUNs vs. local LUNs
This section describes different LUNs and storage layouts.
Product-protected LUNs are also known as iSCSI LUNs. This section refers to iSCSI LUNs as product-protected LUNs.
Transient and temporary server data is not required in the case of disaster recovery and therefore does not need to be replicated back to the data center. For this reason, we recommend that you separate transient and temporary data from the production data by implementing a layout that separates the two into multiple LUNs.
In general, plan to configure one LUN for the operating system, one LUN for the production data, and one LUN for the temporary swap or paging space. Configuring LUNs in this manner greatly enhances data protection and operations recovery in case of a disaster. This extra configuration also facilitates migration to server virtualization if you are using physical servers.
For more information about disaster recovery, see About Data Resilience and Security.
To achieve these goals, the product implements two types of LUNs: product-protected (iSCSI) LUNs and local LUNs. You can add LUNs by choosing Configure > Manage: LUNs.
Use product-protected LUNs to store production data. They share the space of the blockstore cache. The data is continuously replicated and kept in sync with the associated LUN back at the data center. The Edge cache keeps only the working set of data blocks for these LUNs. The remaining data is kept at the data center and predictively retrieved at the edge when needed. During WAN outages, edge servers are not guaranteed to operate and function at 100 percent because some of the data they need can be at the data center and not locally present in the Edge blockstore cache.
One particular type of product-protected LUN is the pinned LUN. Pinned LUNs are used to store production data, but they use dedicated space in the Edge. The space required and dedicated in the blockstore cache is equal to the size of the LUN provisioned at the data center. A pinned LUN enables the edge servers to continue to operate and function during WAN outages because 100 percent of the data is kept in the blockstore cache. Like regular product-protected LUNs, the data is replicated and kept in sync with the associated LUN at the data center.
Use local LUNs to store transient and temporary data. Local LUNs also use dedicated space in the blockstore cache. The data is never replicated back to the data center because it is not required in the case of disaster recovery.
Physical Windows server storage layout
When deploying a physical Windows server, separate its storage into three different LUNs: the operating system and swap space (or page file) can reside in two partitions on the server internal hard drive (or two separate drives), while production data should reside on the product-protected LUN.
This layout facilitates future server virtualization and service recovery in the case of hardware failure at the remote branch. The production data is hosted on a product-protected LUN, which is safely stored and backed up at the data center. In case of a disaster, you can stream this data with little notice to a newly deployed Windows server without having to restore the entire dataset from backup.
Virtualized Windows server on ESX infrastructure with production data LUN on ESX datastore storage layout
When you deploy a virtual Windows server into an ESX infrastructure, you can also store the production data on an ESX datastore mapped to a product-protected LUN. This deployment facilitates service recovery in the event of hardware failure at the remote branch, because product appliances optimize not only LUNs formatted directly with the NTFS file system but also LUNs that are first virtualized with VMFS and later formatted with NTFS.
VMFS datastores deployment on product LUNs
When you deploy VMFS datastores on product-protected LUNs, for best performance, choose the Thick Provision Lazy Zeroed disk format (VMware default). Because of the way we use blockstore in the Edge, this disk format is the most efficient option.
Thin provisioning is when you assign a LUN to a device (in this case, a VMFS datastore for an ESXi host) and tell the host how big the LUN is (for example, 10 GiB), while actually allocating only 2 GiB of physical storage behind it. This pretense is useful if you know that the host needs only 2 GiB to begin with. As time goes by (days or months) and the host writes more data and needs more space, the storage array automatically grows the LUN until eventually it really is 10 GiB in size.
Thick provisioning means there is no pretending: you allocate all 10 GiB from the beginning, whether the host needs it from day one or not.
Whether you choose thick or thin provisioning, you need to initialize (format) the LUN in the same way as any other new disk. Formatting is essentially a process of writing a pattern (in this case, zeros) to the disk sectors. You cannot write to a disk before you format it. Normally, you have to wait for the entire disk to be formatted before you can use it; for large disks, this process can take hours. Lazy Zeroed means the zeroing is deferred: blocks are zeroed as they are first written, so the host can start using the disk as soon as the first few sectors are ready rather than waiting for the entire disk (LUN) to be formatted.
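As an illustration, you can create a Thick Provision Lazy Zeroed disk from the ESXi shell with vmkfstools (zeroedthick is the vmkfstools name for this format); the path and size are examples:
# Create a 10 GiB lazy-zeroed thick virtual disk on a VMFS datastore
vmkfstools -c 10g -d zeroedthick /vmfs/volumes/datastore1/vm1/vm1_data.vmdk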
VMware ESXi 5.5 and later support the vStorage APIs for Array Integration (VAAI) feature, which uses the SCSI WRITE SAME command when creating or using a vmdk. When using thin-provisioned vmdk files, ESXi creates new extents in the vmdk by first writing binary 0s and then the block device (filesystem) data. When using thick-provisioned vmdk files, ESXi creates all extents by writing binary 0s.
Versions of the Core and Edge software prior to release 4.2 supported only the 10-byte and 16-byte versions of the command. With release 4.2 and later, both the Core and the Edge software support the SCSI WRITE SAME (32-byte) command. This support enables much faster provisioning and formatting of LUNs used for VMFS datastores.
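To verify that the ESXi host is offloading these operations, you can query the VAAI status of the backing device; the device identifier below is an example:
# "Zero Status: supported" indicates that WRITE SAME offload is active for the device
esxcli storage core device vaai status get -d naa.60060160a0b01234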
Enable Windows persistent bindings for mounted iSCSI LUNs
Make iSCSI LUNs persistent across Windows server reboots; otherwise, you must manually reconnect them. To configure Windows servers to automatically connect to the iSCSI LUNs after system reboots, select the Add this connection to the list of Favorite Targets check box when you connect to the Edge iSCSI target.
To ensure that Windows does not consider the iSCSI service fully started until connections are restored to all the product volumes on the binding list, also add the Edge iSCSI target to the binding list of the iSCSI service. This addition is particularly important if other services depend on data on an iSCSI LUN: for example, a Windows file server that uses the iSCSI LUN as a share.
The best way to do this is to select the Volumes and Devices tab in the iSCSI Initiator control panel and click Auto Configure. This action binds all available iSCSI targets to the iSCSI startup process. If you want to bind individual targets, click Add. To add individual targets, you must know the target drive letter or mount point.
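On Windows Server 2012 and later, the same result can be scripted; a minimal sketch, where the target IQN is a placeholder for your Edge target:
# Connect the Edge target and persist it across reboots (the Favorite Targets check box)
Connect-IscsiTarget -NodeAddress "iqn.2003-10.com.example:edge-target" -IsPersistent $true
# Bind all currently mounted volumes into the iSCSI service startup sequence (Auto Configure)
iscsicli BindPersistentVolumes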
Set up memory reservation for ESXi virtual machines
By default, VMware ESXi dynamically tries to reclaim unused memory from guest virtual machines, while the Windows operating system uses free memory to perform caching and avoid swapping to disk.
To significantly improve performance of Windows virtual machines, configure memory reservation to the highest possible value of the ESXi memory available to the VM. This configuration applies whether the VMs are hosted within the hypervisor node of the Edge or on an external ESXi server in the branch that is using LUNs from the product.
Setting the memory reservation to the configured size of the virtual machine results in a per virtual machine vmkernel swap file of zero bytes, which consumes less storage and helps to increase performance by eliminating ESXi host-level swapping. The guest operating system within the virtual machine maintains its own separate swap and page file.
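If you manage the host with VMware PowerCLI, for example, the reservation can be set as in this minimal sketch; the VM name and size are examples, and the reservation should equal the VM's configured memory:
# Reserve the VM's full configured memory so its vmkernel swap file is zero bytes
Get-VM "branch-winsrv" | Get-VMResourceConfiguration | Set-VMResourceConfiguration -MemReservationGB 8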
Boot from an unpinned iSCSI LUN
If you are booting a Windows server or client from an unpinned iSCSI LUN, we recommend that you install the Riverbed Turbo Boot software on the Windows machine. The Riverbed Turbo Boot software greatly improves boot performance over the WAN because it allows the Core to send the Edge only the files needed for the boot process.
Running antivirus software
There are two antivirus scanning modes:
• On-demand—Scans the entire LUN data files for viruses at scheduled intervals.
• On-access—Scans the data files dynamically as they are read or written to disk.
There are two common locations to perform the scanning:
• On-host—Antivirus software is installed on the application server.
• Off-host—Antivirus software is installed on dedicated servers that can directly access the application server data.
In typical product deployments, in which the LUNs at the data center contain the full amount of data and the remote branch cache contains the working set, run on-demand scans at the data center and on-access scans at the remote branch. Running an on-demand full file system scan at the remote branch causes the blockstore to wrap and evict the working set of data, leading to poor performance.
However, if the LUNs are pinned, you can also run on-demand full file system scans at the remote branch.
The product does not dictate whether you scan on-host or off-host, but to minimize the load on the application server, we recommend off-host virus scans.
Running disk defragmentation software
Disk defragmentation software is another category of software that can cause the product blockstore cache to wrap and evict the working set of data. Do not run disk defragmentation software. Disk defragmentation is enabled by default on Windows 7 and later; disable it.
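For example, on Windows the built-in scheduled defragmentation task can be disabled from an elevated prompt; this assumes the default task path:
# Disable the scheduled defragmentation job
schtasks /Change /TN "\Microsoft\Windows\Defrag\ScheduledDefrag" /Disable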
Running data deduplication features
Some recent versions of server operating systems (for example, Microsoft Windows Server 2012 and Windows Server 2016) include a feature that deduplicates data on the disks used by the server. Although this might be a useful space-saving tool elsewhere, it is not relevant when the server runs in a branch as part of an Edge deployment. The deduplication process tends to run once a day as a batch job and rearranges blocks in storage so that deduplication can be performed. This rearrangement of blocks can cause the product blockstore cache to wrap and evict the working set of data. It also causes the rearranged blocks to be recognized by the Edge as changed and therefore transferred needlessly across the WAN to the backend storage in the data center. Check that this feature is not enabled for servers hosted within Edge deployments. Most storage systems in modern data centers already have some form of deduplication capability; if deduplication is required, use that capability instead.
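On Windows Server 2012 and later, for instance, you can check for and disable the feature per volume from PowerShell; the drive letter is an example:
# List volumes that have Data Deduplication enabled
Get-DedupVolume
# Stop deduplication on the data volume (already-optimized data stays deduplicated until unoptimized)
Disable-DedupVolume -Volume "D:"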
Running backup software
Backup software is another category of software that can cause the Edge blockstore cache to wrap and evict the working set of data, especially during full backups. In a product deployment, run differential, incremental, synthetic full, and full backups at the data center.
Configure jumbo frames
If jumbo frames are supported by your network infrastructure, use jumbo frames between Core and storage arrays. We make the same recommendation for the Edge and any external application servers that are using LUNs from the Edge. The application server interfaces must support jumbo frames.
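On an ESXi host that uses Edge LUNs, for example, you enable jumbo frames on both the vSwitch and the VMkernel interface and then verify the path end to end; the names and address below are examples:
# Raise the MTU on the standard vSwitch and the iSCSI VMkernel port
esxcli network vswitch standard set -v vSwitch1 -m 9000
esxcli network ip interface set -i vmk1 -m 9000
# Verify a jumbo path without fragmentation (8972 = 9000 minus 28 bytes of IP/ICMP headers)
vmkping -d -s 8972 192.0.2.20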
Removing Core from Edge and re-adding Core
When a Core is removed from the Edge with the “preserve config” setting enabled, the Edge local LUNs are saved and the offline remote LUNs are removed from the Edge configuration. The Core configuration does not change, but the LUNs show as “Not connected.” In most scenarios, the reason for this procedure is replacement of the Core.
If a new Core is added to the Edge, the Edge local storage configuration is merged from the Edge to the Core and normal operations resume.
However, if for some reason the same Core is added back, you must first clear the Edge local storage information from the Core by removing any entries for the specific Edge local LUNs before performing the add operation.
As long as the Edge and Core are truly disconnected when the Edge local storage entries are removed, no Edge local storage is physically deleted. Only the entries themselves are cleared.
Once the same Core is added back, the Edge local storage information on the Edge is merged into the Core configuration. At the same time, the remote offline storage information on the Core that is mapped to the Edge is merged across to the Edge.
Failure to clear the information before re-adding the Core results in the Core rejecting the Edge connection, and both the Core and Edge log files report a “Config mismatch.”