Web Proxy
This chapter describes the web proxy feature. Web proxy transparently intercepts HTTP traffic bound to the internet and provides acceleration services such as web caching, caching of eligible YouTube video content, Secure Sockets Layer (SSL) decryption and encryption services (for example, HTTPS) to enable encrypted content caching, and logging services through audit trails.
This chapter includes the following sections:
•  Overview of the web proxy feature
•  Web proxy fundamental properties
•  Supported features
•  Configuring basic web proxy features (HTTP)
•  Advanced configurations for web proxy
•  Troubleshooting web proxy
Overview of the web proxy feature
RiOS 9.1 and later include the web proxy feature. Web proxy uses the traditional single-ended internet HTTP-caching method, enhanced by Riverbed for modern web browsing. The web proxy feature enables SteelHeads to provide a localized cache of web objects (or files). The localized cache alleviates the cost of repeated downloads of the same data. Using the web proxy feature in the branch office provides a significant overall performance increase because this traffic and content are served locally from the cache. Furthermore, multiple users accessing the same resources receive content at LAN speeds while freeing up valuable bandwidth.
RiOS has embedded features such as strip-compression and parse-and-prefetch, which provide dual-ended SteelHead deployments with HTTP and HTTPS optimization abilities. For more information about strip-compression and parse-and-prefetch, see the SteelHead Deployment Guide - Protocols.
Riverbed enhanced the traditional HTTP caching feature on RiOS by including HTTPS data caching. Use of the web proxy feature differs from traditional SteelHead optimization because you do not need a server-side SteelHead for web proxy to intercept and optimize traffic. The web proxy feature is a true single-ended interception because you need only a branch-side SteelHead to accelerate internet-bound HTTP traffic.
Note: You can use web proxy while simultaneously optimizing different internal-bound traffic using any of the other available optimizations, such as the SteelHead HTTP optimization.
Web proxy fundamental properties
To use web proxy, you need the following appliances:
•  SCC - The SCC is used to configure, manage, and maintain the web proxy feature on each SteelHead on which you’ve enabled web proxy. Additionally, you can use the SCC to centrally view and monitor the cache-hit data collected across the sites in which you’ve deployed web proxy.
•  SteelHead - The SteelHead is usually located at the branch location and hosts the configurations created on the SCC. The SteelHead provides the proxy and cache services for each independent location.
The web proxy feature is currently only supported in a physical in-path deployment or a virtual in-path deployment (using WCCP or PBR) model. Web proxy is not supported on the xx50 models, xx60 models, or SteelHead-v.
Note: Web proxy is critically dependent on DNS resolution, specifically reverse DNS lookups sourced from the Primary interface, for appropriate HTTP/HTTPS proxy services to occur. Because the SteelHead must successfully resolve the hostnames to be cached and proxied, the Primary interface of the SteelHead must be configured with a valid IP address and DNS information. In addition, the interface must be in an active state (even when it is not used by your supported deployment model). Make sure that the SteelHead DNS configuration and the Primary interface on the SteelHead are both configured and active.
You can deploy a basic web proxy running on the branch office SteelHead specifically as a transparent forward proxy. In this deployment the client connections have no knowledge of the existence of the proxy. Because of this implementation, the client machines do not require any additional configuration like a proxy auto-config (PAC) file addition or the need to change the gateway address to point at the SteelHead (or to configure a specific proxy server address in their browser).
Beginning with SCC and RiOS 9.5, Riverbed supports proxy chaining configurations to additional upstream transparent (Manual mode) or explicit (Automatic mode) proxy services (for example, Squid or Zscaler). Alternative proxy functionality, such as reverse proxy services (for example, many inbound connections being proxied to a few data center hosts), is not supported.
The SteelHead houses a separate logical data store to hold cache data for the HTTP and HTTPS content that the web proxy caches. In SCC 9.2 and later, web proxy caching is RFC 2616 compliant and persistent, in that the cache data survives a SteelHead reboot or a service restart. While the total cache data store size varies based on the model of SteelHead you deploy, the maximum single cacheable file size for SCC 9.2 and later web proxy releases is set as unlimited. Unlimited means that a single cache-eligible file can be as large as the amount of available cache.
The basic configuration for web proxy is to enable the web proxy service on the SCC and then choose the supported branch locations to which you push the configuration. You can additionally enable HTTPS optimization and define a global whitelist of HTTPS domains that you can access from the HTTPS-configured locations.
Note: HTTPS optimization assumes that you’ve configured the SCC for certificate authority (CA) service. For more information, see Configuring and Using the SCC as a Certificate Authority Service.
Supported features
The web proxy feature supports the following features:
•  IP addressing support
•  TCP port support
•  Web proxy and SteelHead SaaS
•  Video caching
IP addressing support
By default, web proxy supports proxying connections to public IPv4 addresses only (that is, addresses outside the RFC 1918 private, reserved ranges). Traffic to RFC 1918 addresses uses the SteelHead HTTP optimization and isn’t serviced by the web proxy feature without additional configuration. You can create specific in-path rules to force traffic to RFC 1918 IP addresses to use the web proxy service. For more information, see Using in-path rules.
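The address classification described above can be sketched in a few lines of Python using the standard ipaddress module. This is a minimal illustration only: the function name is_rfc1918 is hypothetical and is not part of RiOS or the SCC.

```python
import ipaddress

# The three private IPv4 ranges defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the IPv4 address falls inside an RFC 1918 range.

    Hypothetical helper: destinations for which this returns True would,
    by default, be handled by HTTP optimization rather than web proxy.
    """
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)


if __name__ == "__main__":
    for destination in ("10.1.2.3", "172.32.0.1", "8.8.8.8"):
        print(destination, is_rfc1918(destination))
```

The explicit network list is used instead of the module's is_private attribute because is_private also covers ranges that are not part of RFC 1918 (for example, loopback and link-local addresses).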
TCP port support
While TCP ports 80 (HTTP) and 443 (HTTPS) are the only ports supported by the default web proxy configuration, you can also configure nonstandard TCP ports that carry HTTP or HTTPS traffic (for example, 8080 or 8000) to use web proxy. You configure these ports on the branch office SteelHead to bypass the HTTP optimization by creating additional in-path rules. For more information, see Using in-path rules.
Web proxy and SteelHead SaaS
SteelHead SaaS doesn’t use web proxy but instead uses the HTTP optimization within the SteelHead to access the external SaaS service infrastructure. Riverbed doesn’t recommend that you configure SteelHead SaaS traffic to specifically use the web proxy feature; however, both the web proxy feature and the SteelHead SaaS can coexist on the same SteelHead. For more information, see Using in-path rules.
Video caching
Some internet video services, specifically YouTube, can take advantage of the caching features of web proxy. Other video content that’s static in nature (for example, video on-demand training and other nonstreaming video services) can potentially be cached as well, because it is usually presented to the web proxy as a cache-eligible file.
YouTube video typically relies on HTTPS. Make sure that you configure the global whitelist to include entries for both *.YouTube.com and *.googlevideo.com. Most of the actual content for YouTube is housed under *.googlevideo.com.
For more information about the whitelist, see Using the global whitelist.
YouTube video content is automatically cached when using standard internet browsers, with the following exceptions:
•  Clients using Firefox can’t use the video caching features of web proxy. Firefox makes some header manipulations that the web proxy feature can’t identify.
•  Browsers in which users have enabled the Google Quick UDP Internet Connections (QUIC) protocol with multistreaming might have performance issues.
•  Mobile browsers haven’t been validated in this initial release of web proxy.
Configuring basic web proxy features (HTTP)
This section provides a simple overview of how to configure web proxy for HTTP traffic. For complete instructions, see the SteelCentral Controller for SteelHead User’s Guide. Web proxy caching for HTTP is automatically enabled when you enable the feature.
To configure web proxy for HTTP
1. From the SCC Management Console, choose Manage > Optimization: Web Proxy.
2. Select Enable Web Proxy.
3. Click Save.
Figure: Web proxy - global configuration
4. Select Site or Site Types.
5. Click Push to send the configuration to the SteelHead.
Figure: Web proxy push
Advanced configurations for web proxy
This section provides a detailed overview of the elements of web proxy and how to configure for HTTPS traffic optimization. It includes the following topics:
•  Configuring web proxy for HTTPS
•  SSL decryption and TCP proxy for HTTPS
•  Using web proxy and certificate management
•  Using the global whitelist
•  Using in-path rules
Configuring web proxy for HTTPS
This section provides a simple overview of how to configure web proxy for HTTPS traffic. For complete instructions, see the SteelCentral Controller for SteelHead User’s Guide.
To configure web proxy for HTTPS
1. From the SCC Management Console, choose Administration > Security: Certificate Authority.
Make sure that the certificate authority is enabled and that you’ve configured the associated key. For more information, see Using web proxy and certificate management.
2. Choose Manage > Optimization: Web Proxy.
3. Select Enable Web Proxy and Enable HTTPS Optimization in the Global Configuration box.
4. Click Save.
5. Under Global HTTPS Whitelist, select Add Domain and populate the required domains.
For more information, see Using the global whitelist.
6. Choose Manage > Optimization: Web Proxy.
7. Select the Site or Site Types.
8. Click Push to send the configuration to the SteelHeads.
9. From the SteelHead Management Console, choose Optimization > Network Services: In-Path Rules.
10. Configure the required in-path rules to support your implementation.
For more information, see Using in-path rules.
SSL decryption and TCP proxy for HTTPS
Web proxy can decrypt HTTPS traffic for TLSv1.0, v1.1, and v1.2 and additionally supports TCP proxying for SSLv3 connections. For web proxy to leverage the HTTP SSL decryption feature, the SSL handshake must include the Server Name Indication (SNI). For more information about SNI, see the SteelHead Deployment Guide - Protocols.
Note: To use the SSL caching features of web proxy for HTTPS traffic, you must make sure that the remote SteelHeads are running RiOS 9.1 or later with the SSL licensing installed.
Using web proxy and certificate management
This section describes some aspects that are important to consider regarding certificate management within your environment and its appropriate configuration for proper operation.
The web proxy HTTPS feature is critically dependent on the exchange of signed certificates between the SCC and the branch office SteelHead. Figure: Certificate workflow of the web proxy feature and the following steps show the certificate workflow of the web proxy feature.
Figure: Certificate workflow of the web proxy feature
The following steps correlate to the numbers in Figure: Certificate workflow of the web proxy feature:
1. The whitelist is manually configured with the approved domain information.
2. The approved whitelist domains are pushed to the client-side SteelHead web proxy configuration.
3. Web proxy automatically sends a certificate signing request for the approved domain to the certificate authority service, which is configured on the SCC.
4. The SCC certificate authority (CA) responds with an appropriately signed server certificate.
5. Web proxy stores the server certificate and the associated license key in the SteelHead secure vault for use when a client requests the approved domain.
6. Web proxy and the CA service automatically renew the server certificate as required.
You must enable the CA service feature of the SCC to generate server certificates and decrypt authorized content before optimizing HTTPS traffic with web proxy. The SCC CA service certificates must be trusted by clients using the service. Figure: Certificate authority page shows the CA configuration option on the SCC Management Console under Administration > Security: Certificate Authority.
Figure: Certificate authority page
If you already have an existing private key and CA-signed public certificate, you can import them (in PEM format only) by cutting and pasting the certificate into the SCC CA Service configuration page.
If you do not already own certificates and keys, you can generate a private key and self-signed certificate through the SCC CA Service. Figure: PEM format shows an example of how to replace a self-signed trusted certificate under the CA.
Figure: PEM format
Select the PEM tab to view the certificate.
After you’ve configured the SCC CA service and have an SCC CA certificate created, we recommend that you follow your internal procedures to install the SCC CA certificate on your web client configurations as a trusted root certificate.
After you’ve configured the client-side SteelHead to support HTTPS web proxy, it automatically generates and renews the server certificates that the domain whitelist has allowed. Each client-side SteelHead contains its own secure vault and locally stores the generated keys and certificates within.
For more information, see Configuring and Using the SCC as a Certificate Authority Service and the SteelCentral Controller for SteelHead User’s Guide.
Using the global whitelist
You must configure the global HTTPS whitelist to contain the top-level and subdomain names for which the SCC permits the branch offices to proxy HTTPS. Choose Manage > Optimization: Web Proxy.
Be as specific as possible when you enter the whitelist domains; use the fully qualified domain name (FQDN) for each unique site requesting proxy service. In addition to using a specific FQDN, the whitelist accepts:
•  wildcard domains (for example, *.facebook.com, *.YouTube.com, *.Riverbed.com).
•  hostnames (for example, webserver.myinternaldomain.com).
Figure: Entering a name into the whitelist
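To illustrate how FQDN, hostname, and wildcard entries relate to a requested hostname, the following Python sketch matches a name against a list of whitelist entries. It is an assumption-laden illustration: the function name host_in_whitelist is hypothetical, and the shell-style, case-insensitive wildcard matching shown here is not confirmed to be how the SCC or SteelHead evaluates the whitelist.

```python
from fnmatch import fnmatchcase


def host_in_whitelist(hostname: str, whitelist: list) -> bool:
    """Return True if hostname matches any whitelist entry.

    Assumption: entries are compared case-insensitively and "*" matches
    any run of characters, as in shell-style patterns.
    """
    host = hostname.lower().rstrip(".")  # ignore case and a trailing dot
    return any(fnmatchcase(host, entry.lower()) for entry in whitelist)


if __name__ == "__main__":
    entries = [
        "*.youtube.com",
        "*.googlevideo.com",
        "webserver.myinternaldomain.com",
    ]
    for name in ("www.YouTube.com", "youtube.com"):
        print(name, host_in_whitelist(name, entries))
```

Under this matching assumption, a wildcard entry such as *.youtube.com covers subdomains but not the bare domain youtube.com, which would need its own entry.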
Using parent proxy (proxy chaining) configurations
RiOS 9.5 introduces support for web proxy configurations that must integrate into environments where additional proxy services reside upstream from the SteelHead. We refer to these upstream proxies as parent proxies. The web proxy service can now operate within a hierarchical chain of proxy servers that can pull content from the localized cache of each proxy up the chain, which is commonly referred to as proxy chaining.
There are two available configuration methods for using the parent proxy service within web proxy:
•  Manual mode - Used when clients need to access content transparently, with no knowledge of a proxy server’s existence.
•  Automatic mode - Used when clients are required to explicitly access a specific proxy server, as configured in the end-user browser or through a local client PAC file.
Note: Parent proxy can only be deployed as a manual or automatic mode configuration but not as both configurations simultaneously.
Configuring manual mode parent proxy
To enable the manual parent proxy configuration options for HTTP and HTTPS, first select Parent Proxy Configuration, and then select the Manual radio button. Enter the upstream parent proxy server hostname, FQDN, or IP address along with the specific server port in the following format:
<Parent Server>:<Service Port>
Figure: Manual mode illustrates how to configure manual mode for a parent proxy service utilizing both the HTTP and HTTPS schemes.
Figure: Manual mode
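Before entering parent proxy values, it can help to check that each one follows the <Parent Server>:<Service Port> format. The following Python sketch is one way to do that offline. The function name parse_parent_proxy and the example host name are hypothetical, and the 1 to 65535 port range is the general TCP port range rather than a limit stated for the SCC.

```python
def parse_parent_proxy(entry: str):
    """Split a "<Parent Server>:<Service Port>" string into (server, port).

    Raises ValueError if the server or the port is missing or invalid.
    """
    # Split on the last colon so that the port is always the final field.
    server, separator, port_text = entry.strip().rpartition(":")
    if not separator or not server:
        raise ValueError("expected <Parent Server>:<Service Port>: %r" % entry)
    try:
        port = int(port_text)
    except ValueError:
        raise ValueError("service port is not a number: %r" % port_text)
    if not 1 <= port <= 65535:
        raise ValueError("service port out of range: %d" % port)
    return server, port


if __name__ == "__main__":
    # "parent.example.com" is a placeholder, not a real proxy.
    print(parse_parent_proxy("parent.example.com:3128"))
```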
The ability to exclude specific domains from a manual mode parent proxy configuration is also available. This parent proxy exception option is only applicable to the manual mode server and not configurable for automatic mode. Figure: Entering the domain name exclusions illustrates how to enter domain name exclusions that should bypass the parent proxy configured.
Figure: Entering the domain name exclusions
When multiple parent proxies are configured, the parent proxy used is selected based on a combination of the traffic scheme (HTTP or HTTPS), with a limit of five parents per scheme, and the operational mode selected. Failover mode is the default and selects the configured parent proxies in order of entry. Load-balanced mode enables parent proxies to be selected round-robin based on a client IP hash. In either resiliency mode, if no configured parent is available for the requested scheme, the parent is marked as down for a five-minute interval and traffic for that scheme is blackholed (that is, dropped).
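The two resiliency modes can be sketched as follows. This Python example is a conceptual model only, not the SteelHead implementation: the function name select_parent is hypothetical, and the client IP hash (the integer value of the address modulo the number of available parents) is a stand-in, because the actual hash is not documented here.

```python
import ipaddress

MAX_PARENTS_PER_SCHEME = 5  # limit per scheme stated in this section


def select_parent(parents, available, client_ip, mode="failover"):
    """Pick a parent proxy for one scheme, or None if traffic is dropped.

    parents   -- parent proxies in order of entry
    available -- set of parents currently reachable
    client_ip -- client address, used only in load-balanced mode
    """
    if len(parents) > MAX_PARENTS_PER_SCHEME:
        raise ValueError("at most five parents per scheme")
    candidates = [p for p in parents if p in available]
    if not candidates:
        return None  # no parent available: traffic for the scheme is dropped
    if mode == "failover":
        return candidates[0]  # first available parent in order of entry
    if mode == "load-balanced":
        # Assumed hash: integer form of the client IP, modulo candidate count.
        index = int(ipaddress.ip_address(client_ip)) % len(candidates)
        return candidates[index]
    raise ValueError("unknown mode: %r" % mode)


if __name__ == "__main__":
    configured = ["a:3128", "b:3128", "c:3128"]
    print(select_parent(configured, {"b:3128", "c:3128"}, "10.0.0.1"))
```

Hashing on the client address keeps a given client on the same parent while the set of available parents is unchanged, which is the usual reason for choosing a hash over a per-connection rotation.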
Configuring automatic mode parent proxy
To enable the automatic parent proxy option for HTTP and HTTPS, first select Parent Proxy Configuration, and then select the Automatic radio button. No additional configuration is required on the web proxy for default operation. For correct operation, configure clients with a PAC file or explicit browser settings before you enable automatic mode. Figure: Automatic mode illustrates this selection in the SCC Management Console. By default, all cache-eligible HTTPS content is proxied and cached, and HTTP traffic is proxied only. To cache HTTP content under automatic mode, you need to manually add the parent proxy IP addresses through the CLI on the SteelHead using the following command:
[no] web-proxy parent automatic whitelist ip <IPv4 address of each parent proxy>
 
Figure: Automatic mode
Note: When configuring automatic or manual parent proxy modes, the SteelHead must trust the certificates issued by the parent proxy server or provider in order to properly proxy and cache HTTPS traffic when using the parent proxy configurations. For more information about adding Certificate Authorities for a proxy service on SteelHead, see the SteelHead Deployment Guide - Protocols.
Using in-path rules
For web proxy, you configure in-path rules locally on the SteelHead, or you can alternatively configure them on the SCC and push them to the SteelHeads. These rules are used on the SteelHead to determine whether traffic has optimization applied or is passed through when the SteelHead detects a connection initiated by a client. The very basic implementation of web proxy (only enabled for public IP addresses using HTTP on TCP port 80) uses the default in-path rule. Figure: Default in-path rule on the SteelHead shows settings for the default in-path rule (bottom of Figure: Default in-path rule on the SteelHead) on the SteelHead. Web proxy is set to Auto by default.
Figure: Default in-path rule on the SteelHead
For more information on in-path rules, see the SteelHead Management Console User’s Guide and the SteelHead Deployment Guide.
Consider the following selections when you configure the web proxy options in the in-path rule table:
•  Auto - All non-RFC 1918 IPv4 addresses on ports 80 and 443 matching this rule are forwarded to the web proxy service.
•  Force - Any IP address and port matching this rule (even those in RFC 1918) are forwarded to the web proxy service. The Force option is configured on a pass-through rule.
•  None - Traffic matching this rule is not forwarded to the web proxy service.
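The three options above can be summarized as a small decision function. This Python sketch is a conceptual model under stated assumptions: the names WebProxyOption and forwarded_to_web_proxy are hypothetical, the connection is assumed to have already matched the rule, and rule ordering (for example, relative to the secure pass-through rule) is not modeled.

```python
from enum import Enum


class WebProxyOption(Enum):
    AUTO = "auto"
    FORCE = "force"
    NONE = "none"


# Ports handled by the Auto option, as listed above.
AUTO_PORTS = {80, 443}


def forwarded_to_web_proxy(option, dest_is_rfc1918, dest_port):
    """Return True if a connection matching the rule goes to web proxy."""
    if option is WebProxyOption.FORCE:
        return True  # any address and port, including RFC 1918
    if option is WebProxyOption.AUTO:
        # Only public (non-RFC 1918) destinations on ports 80 and 443.
        return (not dest_is_rfc1918) and dest_port in AUTO_PORTS
    return False  # NONE: never forwarded to web proxy


if __name__ == "__main__":
    print(forwarded_to_web_proxy(WebProxyOption.AUTO, False, 80))
    print(forwarded_to_web_proxy(WebProxyOption.AUTO, True, 80))
    print(forwarded_to_web_proxy(WebProxyOption.FORCE, True, 8080))
```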
Web proxy doesn’t leverage SteelHead transparency settings, although you can select them in Auto Discover and Pass Through in-path rule configurations. If you select these settings on an in-path rule that uses either the Auto or Force web proxy option, they are ignored. When web proxy is set to Auto, eligible traffic is optimized using the web proxy feature; however, if the traffic can’t be optimized by the web proxy feature, then autodiscovery occurs and the full transparency or port transparency options are preserved.
RFC 1918 (private IP address range) traffic bypasses the web proxy and uses the SteelHead HTTP optimization unless you add an in-path rule that specifies the RFC 1918 address you want the web proxy to service. Configure this rule Type as Pass Through and select the Force option as shown in Figure: Example configuration of an in-path rule.
Figure: Example configuration of an in-path rule
To successfully optimize HTTPS connections, you must configure a new in-path rule for destination TCP port 443. You must configure this rule to negate the preconfigured secure content pass-through rule (secure PT rule) that ships with all SteelHeads. The secure PT rule takes precedence even with web proxy HTTPS optimization enabled. Figure: Adding a new rule shows a new rule added above the existing secure PT rule. Be sure to configure the rule with the Auto or Force Web proxy rule option while entering the destination port of 443.
Figure: Adding a new rule
SteelHead SaaS traffic to services such as Office365 and Salesforce.com bypasses the web proxy feature and uses the SteelHead SaaS HTTP optimization. Riverbed doesn’t recommend that you use an in-path rule with the Force option to proxy specific SteelHead SaaS-associated addresses.
In SCC 9.2 and later, you can choose to use the host and domain labels within the in-path rule configuration. These labels allow for increased flexibility when there’s a need to group IP addresses, IP ranges, FQDN, or wildcard domain names (for example, *.riverbed.com).
Use these guidelines when using host or domain labels with web proxy:
•  Implement domain and host labels to allow or deny web proxy serviceable HTTP/HTTPS traffic, based on your unique traffic needs. By default all TCP port 80 traffic is serviced by the web proxy through the default rule, and TCP port 443 always requires an inclusive in-path rule for desired domains.
•  The SSL whitelist functions as a CA encryption/decryption validator, not an HTTPS traffic policer. Creating only a host or domain label, or listing only specific HTTPS domains, is not sufficient on its own, and you might not receive the expected cached content.
•  We recommend that you order a domain label in-path rule low on the rule list, and configure that rule with a destination host label or focused destination IP range. This ordering ensures that the rule engine is not inappropriately conflicting with other rules, such as fixed-target rules or SaaS pass-through rules.
Fixed-target and SaaS pass-through rules can also leverage domain labels using the same matching domains. These fixed-target and SaaS pass-through rules might never be used because these rules occur below a higher-ordered domain label rule.
Troubleshooting web proxy
This section describes some of the common issues that may arise within the deployment and use of web proxy and outlines some procedures to validate correct configurations. It includes the following topics:
•  SCC to SteelHead communications
•  HTTP caching
•  HTTPS decryption
•  YouTube video caching
SCC to SteelHead communications
Because of the interdependency of the SCC and the branch office SteelHeads, when you use the web proxy feature, you must validate that the two appliances can communicate with each other through their HTTPS channel. You can validate this communication using the show scc command. This command shows the current state of both HTTPS connections to and from the respective SteelHead and the SCC that manages it.
HTTP caching
If the browser client is reporting Service Unavailable, or if some traffic is not showing as a web proxy connection in the Current Connections report on the SteelHead, the web server might not be accessible from the in-path interface of the client-side SteelHead. In either case, make sure that the in-path rules for those web services are correctly configured for each SteelHead requiring that access.
Additionally, make sure that web proxy service is enabled on the branch office SteelHead with the show web-proxy status command.
If you’re not seeing cache-hits register or increment, consider the following:
•  When looking at cache utilization on the SCC, understand that the SCC polling on cache utilization occurs in 60-minute intervals. Make sure that the caching is enabled with the show web-proxy cache status command.
•  Make sure that the cache-hit counters are incrementing on the SteelHead with the show web-proxy stats cache command.
•  Make sure that the proxy content you’re looking to validate as being cached is actually cacheable content as outlined in the RFC 2616 standard.
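Whether content is cacheable depends largely on the response headers sent by the web server. The following Python sketch is a deliberately simplified check, not the web proxy's actual logic: it looks only at the two RFC 2616 Cache-Control response directives (no-store and private) that prohibit a shared cache from storing a response. The function name shared_cache_may_store is hypothetical.

```python
# Directives that forbid a shared cache from storing a response (RFC 2616).
BLOCKING_DIRECTIVES = {"no-store", "private"}


def shared_cache_may_store(response_headers: dict) -> bool:
    """Return False if Cache-Control explicitly forbids shared caching.

    Simplification: other factors (request method, status code, Expires,
    validators, Vary) are ignored.
    """
    cache_control = ""
    for name, value in response_headers.items():
        if name.lower() == "cache-control":  # header names ignore case
            cache_control = value
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    return not (directives & BLOCKING_DIRECTIVES)


if __name__ == "__main__":
    print(shared_cache_may_store({"Cache-Control": "public, max-age=3600"}))
    print(shared_cache_may_store({"Cache-Control": "private, max-age=0"}))
```

A result of True means only that these two directives do not rule out caching; it does not guarantee that a given object is cached.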
HTTPS decryption
If no HTTPS content serviced by web proxy is being optimized, validate the following elements:
•  Check that HTTPS traffic has an in-path rule added on the SteelHead configured with web proxy for TCP port 443.
•  Verify that SSL has been properly enabled on the SteelHead with the following commands:
–  show web-proxy ssl
–  show web-proxy cache ssl
•  Make sure that the domain being proxied is in the HTTPS global whitelist configuration and that the configuration has been pushed to the SteelHead in question.
•  Make sure that the trusted CA certificate is the one that’s actually being presented to the client browser. You can check this certificate status by clicking on the lock icon within the URL field of most browsers.
•  If the SCC trusted certificate is not being seen on the client machines, you need to make sure that the SCC issued the certificate on the Certificate Authority page on the SCC Management Console.
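As an alternative to inspecting the lock icon in a browser, the certificate presented to a client can be checked from a client machine with a short script. The following Python sketch uses the standard ssl and socket modules. The function names, the host name, and the CA file name are placeholders chosen for illustration; the sketch assumes that you have exported the SCC CA certificate to a local PEM file.

```python
import socket
import ssl


def issuer_common_name(peer_cert: dict):
    """Extract the issuer commonName from an ssl getpeercert() dictionary."""
    for rdn in peer_cert.get("issuer", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def fetch_peer_cert(hostname, ca_file, port=443, timeout=5.0):
    """Connect with SNI, trusting only ca_file, and return the peer cert.

    The handshake raises ssl.SSLCertVerificationError if the presented
    certificate does not chain to the CA in ca_file.
    """
    context = ssl.create_default_context(cafile=ca_file)
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()


if __name__ == "__main__":
    # Offline example using a hand-built dictionary in getpeercert() form.
    sample = {"issuer": ((("commonName", "Example SCC CA"),),)}
    print(issuer_common_name(sample))
```

For example, calling issuer_common_name(fetch_peer_cert("www.example.com", "scc-ca.pem")) from a client behind the SteelHead returns the issuer name when the presented certificate chains to the CA in the file; a verification error suggests that the certificate was issued by a different CA.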
YouTube video caching
If you’re not seeing any YouTube traffic being optimized in the Current Connections report on the SCC or the video playback quality is abnormally impaired, you can investigate the following reasons:
•  If you do not see the CA certificate being used for any other proxied websites, there might be an issue with traffic not being serviced by the web proxy (for example, the port 443 in-path rule is not configured for web proxy or SSL service is not running).
•  The correct domains might not be added to the global whitelist configuration on the SCC.
•  Because YouTube content can’t present certificate warnings, check other sites listed in the global whitelist for certificate warnings to make sure that there’s not a certificate issue.
•  You can try manually adding the CA certificate from the SCC directly to the client browser for test purposes.
If you believe that videos on YouTube aren’t being serviced from the cache, you can investigate the following reasons:
•  Validate that the global whitelist configuration on the SCC includes both *.youtube.com and *.googlevideo.com. The majority of YouTube content is served from the latter.
•  Follow the troubleshooting suggestions for both HTTP and HTTPS content caching to validate.