Alarms
You can change alarm settings for the selected system settings policy in the Alarms page.
Enabling alarms is optional.
The SCC checks for alarms every 5 minutes.
Control
Description
Admission Control
Enables an alarm and sends an email notification if the SCC enters admission control. When this occurs, the SCC optimizes traffic beyond its rated capability and is unable to handle the amount of traffic passing through the WAN link. During this event, the SteelHead continues to optimize existing connections, but new connections are passed through without optimization.
•  Connection Limit - Indicates the system connection limit has been reached. Additional connections are passed through unoptimized. The alarm clears when the appliance moves out of this condition.
•  CPU - The appliance has entered admission control due to high CPU use. During this event, the appliance continues to optimize existing connections, but new connections are passed through without optimization. The alarm clears automatically when the CPU usage has decreased.
•  MAPI - The total number of MAPI optimized connections has exceeded the maximum admission control threshold. By default, the maximum admission control threshold is 85 percent of the total maximum optimized connection count for the client-side appliance. The appliance reserves the remaining 15 percent so that the MAPI admission control does not affect the other protocols. The 85 percent threshold is applied only to MAPI connections. RiOS is now passing through MAPI connections from new clients but continues to intercept and optimize MAPI connections from existing clients (including new MAPI connections from these clients). RiOS continues optimizing non-MAPI connections from all clients. The alarm clears automatically when the MAPI traffic has decreased; however, it can take one minute for the alarm to clear.
In RiOS 7.0 and later, RiOS preemptively closes MAPI sessions to reduce the connection count in an attempt to bring the appliance out of admission control by bringing the connection count below the 85 percent threshold. RiOS closes the MAPI sessions in the following order:
•  MAPI prepopulation connections
•  MAPI sessions with the largest number of connections
•  MAPI sessions with most idle connections
•  Most recently optimized MAPI sessions or oldest MAPI session
•  MAPI sessions exceeding the memory threshold
•  Memory - The appliance has entered admission control due to memory consumption. The appliance is optimizing traffic beyond its rated capability and is unable to handle the amount of traffic passing through the WAN link. During this event, the appliance continues to optimize existing connections, but new connections are passed through without optimization. No other action is necessary; the alarm clears automatically when the traffic has decreased.
•  TCP - The appliance has entered admission control due to high TCP memory use. During this event, the appliance continues to optimize existing connections, but new connections are passed through without optimization. The alarm clears automatically when the TCP memory pressure has decreased.
By default, this alarm is enabled.
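As a rough illustration of the MAPI threshold arithmetic described above, the following Python sketch computes the connection count at which MAPI admission control would begin. The function names and the 2000-connection limit are hypothetical and chosen only for illustration; the 85 percent default is the only value taken from this page.

```python
def mapi_admission_threshold(max_optimized_connections, threshold_percent=85):
    """Return the MAPI connection count at which admission control begins.

    Integer arithmetic keeps the result an exact connection count.
    """
    return max_optimized_connections * threshold_percent // 100


def mapi_in_admission_control(mapi_connections, max_optimized_connections):
    """True when optimized MAPI connections exceed the threshold."""
    return mapi_connections > mapi_admission_threshold(max_optimized_connections)


# Hypothetical appliance with a 2000-connection limit (assumed value).
limit = 2000
print(mapi_admission_threshold(limit))          # 1700
print(limit - mapi_admission_threshold(limit))  # 300 left for other protocols
```

With these assumed numbers, 1700 connections are available to MAPI and the remaining 300 stay reserved for other protocols.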
Asymmetric Routing
Enables an alarm if asymmetric routing is detected on the network. This is usually due to a failover event of an inner router or VPN.
By default, this alarm is enabled.
Connection Forwarding
Enables an alarm if the system detects a problem with a connection-forwarding neighbor. The connection-forwarding alarms are inclusive of all connection-forwarding neighbors. For example, if an appliance has three neighbors, the alarm triggers if any one of the neighbors is in error. In the same way, the alarm clears only when all three neighbors are no longer in error.
These events trigger the alarm:
•  Ack Timeout
•  Connection Failure
•  Keepalive Timeout
•  Latency Exceeded
•  Lost Connection Error
•  Lost Due To End Of Stream
•  Read Information Timeout
•  Multiple Interface
•  Peer IPv6 Incompatible
•  Single Interface
By default, this alarm is enabled.
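The inclusive behavior described above (the alarm triggers if any neighbor is in error and clears only when none are) can be sketched in Python as follows. The function and the neighbor names are hypothetical; this is an illustration of the aggregation rule, not the appliance's implementation.

```python
def connection_forwarding_alarm(neighbor_in_error):
    """Return True if the aggregate alarm should be raised.

    neighbor_in_error maps a neighbor name to True when that neighbor
    is currently in error.
    """
    return any(neighbor_in_error.values())


# Three hypothetical neighbors, one of them in error.
neighbors = {"nbr-a": False, "nbr-b": True, "nbr-c": False}
print(connection_forwarding_alarm(neighbors))  # True

# The alarm clears only after every neighbor has recovered.
neighbors["nbr-b"] = False
print(connection_forwarding_alarm(neighbors))  # False
```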
CPU Utilization
Enables an alarm and sends an email notification if the average and peak threshold for the CPU utilization is exceeded. When an alarm reaches the rising threshold, it is activated; when it reaches the lowest or reset threshold, it is reset. After an alarm is triggered, it is not triggered again until it has fallen below the reset threshold.
By default, this alarm is enabled, with a rising threshold of 90 percent and a reset threshold of 70 percent.
•  Rising Threshold - Specify the rising threshold. When an alarm reaches the rising threshold, it is activated. The default value is 90 percent.
•  Reset Threshold - Specify the reset threshold. When an alarm reaches the lowest or reset threshold, it is reset. After an alarm is triggered, it is not triggered again until it has fallen below the reset threshold. The default value is 70 percent.
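The rising and reset thresholds form a hysteresis band. The Python sketch below illustrates this behavior using the 90 percent rising and 70 percent reset defaults stated in the alarm summary. The class name and sample values are hypothetical, and the exact comparison at the boundary (activate at or above the rising threshold, reset strictly below the reset threshold) is an assumption.

```python
class ThresholdAlarm:
    """Alarm with separate rising and reset thresholds (hysteresis)."""

    def __init__(self, rising, reset):
        if reset >= rising:
            raise ValueError("reset threshold must be below rising threshold")
        self.rising = rising
        self.reset = reset
        self.active = False

    def update(self, value):
        """Feed one sample; return True if the alarm is active afterward."""
        if not self.active and value >= self.rising:
            self.active = True   # crossed the rising threshold
        elif self.active and value < self.reset:
            self.active = False  # fell below the reset threshold
        return self.active


alarm = ThresholdAlarm(rising=90, reset=70)
samples = [50, 92, 85, 95, 69, 80, 91]  # hypothetical CPU readings
print([alarm.update(s) for s in samples])
# [False, True, True, True, False, False, True]
```

The readings of 85 and 95 do not trigger the alarm a second time because the value has not yet fallen below the reset threshold.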
Data Store
•  Corruption - Enables an alarm and sends an email notification if the RiOS data store is corrupt or has become incompatible with the current configuration. To clear the RiOS data store of data, restart the optimization service and click Clear the Data Store.
If the alarm was caused by an unintended change to the configuration, change the configuration to match the old RiOS data store settings again and then restart the service (without clearing the data store) to clear the alarm.
•  Data Store Clean Required - Enables an alarm and sends an email notification if you need to clear the RiOS data store.
•  Encryption Level Mismatch - Enables an alarm and sends an email notification if a data store error, such as an encryption, header, or format error, occurs.
•  Synchronization Error - Enables an alarm if RiOS data store synchronization has failed. The RiOS data store synchronization between two SteelHeads has been disrupted and the RiOS data stores are no longer synchronized.
By default, this alarm is enabled.
Disk Full
Enables an alarm if the system partitions (not the RiOS data store) are full or almost full. For example, RiOS monitors the available space on /var that is used to hold logs, statistics, system dumps, TCP dumps, and so on.
By default, this alarm is enabled.
Domain Authentication Alert
Enables an alarm when the system is unable to communicate with the domain controller, has detected an SMB signing error, or has detected that delegation has failed. CIFS-signed and Encrypted-MAPI traffic is passed through without optimization.
By default, this alarm is enabled.
Domain Join Error
Enables an alarm if an attempt to join a Windows domain has failed. The most common cause of failing to join a domain is a significant difference in the system time on the Windows domain controller and the appliance. A domain join can also fail when the DNS server returns an invalid IP address for the domain controller.
By default, this alarm is enabled.
Duplex
Enables an alarm when an interface was not configured for duplex negotiation but has negotiated duplex mode.
By default, this alarm is enabled.
Flash Protection Failure
Enables an alarm when the USB flash drive has not been backed up because there is not enough available space in the /var filesystem directory.
By default, this alarm is enabled.
SteelFusion Blockstore
Enables an alarm when the SteelFusion blockstore device has failed.
By default, this alarm is enabled.
SteelFusion Core
Enables an alarm when the SteelFusion Core device has failed.
By default, this alarm is enabled.
SteelFusion iSCSI
Enables an alarm when the iSCSI connection to the target has failed.
By default, this alarm is enabled.
SteelFusion LUN
Enables an alarm when the connection to the LUN has failed.
By default, this alarm is enabled.
SteelFusion Snapshot
Enables an alarm when the connection to one or more of the snapshot storage arrays has failed.
By default, this alarm is enabled.
SteelFusion Uncommitted Data
Enables an alarm when SteelFusion uncommitted data has been detected.
By default, this alarm is enabled.
SteelFusion Blockstore
Enables an alarm if the system encounters any of the following issues with the SteelFusion Edge blockstore:
•  The blockstore is running out of space.
•  The blockstore is out of space.
•  The blockstore is running out of memory.
•  The blockstore could not read data that was already replicated to the DC.
•  The blockstore could not read data that is not yet replicated to the DC.
•  The blockstore fails to start due to disk errors or an incorrect configuration.
•  The Granite Edge software version is incompatible with the blockstore version on disk.
•  The blockstore could not save data to disk due to a media error.
By default, this alarm is enabled.
SteelFusion Core
Enables an alarm if the system encounters any of the following issues with the SteelFusion Core:
•  The Edge device has connected to a Granite Core that does not recognize the Edge device.
•  The Edge does not have an active connection with the Granite Core.
•  The data channel between Granite Core and the Edge is down.
•  The connection between the Granite Core and the Edge has stalled.
By default, this alarm is enabled.
SteelFusion iSCSI
Enables an alarm if the iSCSI module encounters an error.
By default, this alarm is enabled.
SteelFusion LUN
Enables an alarm if a LUN becomes unavailable.
By default, this alarm is enabled.
SteelFusion Snapshot
Enables an alarm if a snapshot fails to be committed to the SAN, or a snapshot fails to complete due to Windows timing out.
By default, this alarm is enabled.
SteelFusion Uncommitted Data
Enables an alarm if a large amount of data in the blockstore needs to be committed to SteelFusion Core.
By default, this alarm is enabled.
Hardware
•  Disk Error - Enables an alarm when one or more disks are offline. To see which disk is offline, enter the following CLI command from the system prompt:
show raid diagram
By default, this alarm is enabled.
This alarm applies only to the appliance RAID Series 3000, 5000, and 6000.
•  Fan Error - Enables an alarm and sends an email notification if a fan is failing or has failed and needs to be replaced. By default, this alarm is enabled.
•  Flash Error - Enables an alarm when the system detects an error with the flash drive hardware. By default, this alarm is enabled.
•  IPMI - Enables an alarm and sends an email notification if an Intelligent Platform Management Interface (IPMI) event is detected. (Not supported on all appliance models.)
This alarm triggers when there has been a physical security intrusion. These events trigger this alarm:
•  Chassis intrusion (physical opening and closing of the appliance case).
•  Memory errors (correctable or uncorrectable ECC memory errors).
•  Hard drive faults or predictive failures.
•  Power cycle, such as turning the power switch on or off, physically unplugging and replugging the cable, or issuing a power cycle from the power switch controller.
By default, this alarm is enabled.
•  Memory Error - Enables an alarm and sends an email notification if a memory error is detected (for example, when a system memory stick fails).
•  Other Hardware Error - Enables an alarm if a hardware error is detected. These issues trigger the hardware error alarm:
•  The appliance does not have enough disk, memory, CPU cores, or NIC cards to support the current configuration.
•  The appliance is using a memory Dual In-line Memory Module (DIMM), a hard disk, or a NIC that is not qualified by Riverbed.
•  Other hardware issues.
By default, this alarm is enabled.
•  Power Supply - Enables an alarm and sends an email notification if an inserted power supply cord does not have power, as opposed to a power supply slot with no power supply cord inserted. By default, this alarm is enabled.
•  RAID - Enables an alarm and sends an email notification if the system encounters an error with the RAID array (for example, missing drives, pulled drives, drive failures, and drive rebuilds). An audible alarm can also sound. To see if a disk has failed, enter the following CLI command from the system prompt:
show raid diagram
For drive rebuilds, if a drive is removed and then reinserted, the alarm continues to be triggered until the rebuild is complete.
Rebuilding a disk drive can take 4-6 hours.
This alarm applies only to the appliance RAID Series 3000, 5000, and 6000.
By default, this alarm is enabled.
•  SSD Write Cycle Level Exceeded - Enables an alarm if the accumulated SSD write cycles exceed the predefined 95 percent write cycle level on appliance models 7050L and 7050M. If the alarm is triggered, the administrator can swap out the disk before any problems arise.
By default, this alarm is enabled.
Licensing
Enables an alarm and sends an email notification if a license on the SCC is removed, is about to expire, has expired, or is invalid. This alarm triggers if the SCC has no MSPEC license installed for its currently configured model.
•  Appliance Unlicensed - This alarm triggers if the SCC has no BASE or MSPEC license installed for its currently configured model.
•  Autolicense critical event - This alarm triggers if the SCC autolicense has a critical event.
•  Autolicense information event - This alarm triggers if the SCC autolicense has event information.
•  Licenses Expired - This alarm triggers if one or more features has at least one license installed, but all of them are expired.
•  Licenses Expiring - This alarm triggers if the license for one or more features is going to expire within two weeks.
Note: The licenses expiring and licenses expired alarms are triggered per feature. For example: if you install two license keys for a feature, LK1-FOO-xxx (expired) and LK1-FOO-yyy (not expired), the alarms do not trigger, because the feature has one valid license.
By default, this alarm is enabled.
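The per-feature rule in the note above can be sketched in Python as follows. The function name, the feature names, and the dates are hypothetical. The sketch assumes that a license is valid through its expiry date and that the expiring alarm applies when the latest valid license for a feature expires within two weeks; this page does not spell out those details.

```python
from datetime import date, timedelta


def license_alarms(licenses_by_feature, today, warn_window=timedelta(days=14)):
    """Return (expired_features, expiring_features).

    licenses_by_feature maps a feature name to a list of expiry dates,
    one per installed license key.
    """
    expired, expiring = [], []
    for feature, expiry_dates in licenses_by_feature.items():
        if not expiry_dates:
            continue  # no license installed: neither alarm applies
        valid = [d for d in expiry_dates if d >= today]
        if not valid:
            expired.append(feature)   # every installed license has expired
        elif max(valid) - today <= warn_window:
            expiring.append(feature)  # last valid license ends soon
    return expired, expiring


today = date(2024, 1, 1)
licenses = {
    "FOO": [date(2023, 6, 1), date(2025, 1, 1)],  # one expired, one valid
    "BAR": [date(2023, 6, 1)],                    # all expired
    "BAZ": [date(2024, 1, 10)],                   # expires in 9 days
}
print(license_alarms(licenses, today))  # (['BAR'], ['BAZ'])
```

FOO triggers neither alarm because it still has one valid license, matching the LK1-FOO example in the note.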
Link State
Enables an alarm and sends an email notification if an Ethernet link is lost due to an unplugged cable or dead switch port. Depending on which link is down, the system might no longer be optimizing and a network outage could occur.
This is often caused by interface transitions on surrounding devices, such as routers or switches. This alarm also accompanies service or system restarts on the SCC.
For WAN/LAN interfaces, the alarm triggers if in-path support is enabled for that WAN/LAN pair.
By default, this alarm is disabled.
Load Balancing Alerts
Enables an alarm with either status:
•  Load Balance Service - Indicates whether the load-balancing service is properly configured.
•  Oversubscription Alert - Indicates when the total capacity of the remote SteelHeads is much greater than the total capacity of the local SteelHeads (oversubscription).
For detailed information, see the SteelHead Interceptor User’s Guide.
Local Cluster Alerts
Enables an alarm when the selected local cluster conditions are met:
•  Local SteelHead Interceptor Disconnection Alert - If a local Interceptor is disconnected from the cluster.
•  SteelHead Admission Control Alert - If a local appliance is under admission control.
•  SteelHead Capacity Alert - If a local appliance is near to or has reached capacity.
•  SteelHead Permanent Capacity Adjustment Alert - If capacity reduction has been triggered for a local appliance.
•  Version Incompatibility Alert - If version incompatibility exists among cluster appliances.
For detailed information, see the SteelHead Interceptor User’s Guide.
Memory Paging
Enables an alarm and sends an email notification if memory paging is detected. If 100 pages are swapped every couple of hours, the system is functioning properly. If thousands of pages are swapped every few minutes, contact Riverbed Support at https://support.riverbed.com.
By default, this alarm is enabled.
Neighbor Incompatibility
Enables an alarm if the system has encountered an error in reaching a SteelHead configured for connection forwarding.
By default, this alarm is enabled.
Network Bypass
Enables an alarm and sends an email notification if the system is in bypass failover mode.
By default, this alarm is enabled.
NFS v2/v4 Alarm
Enables an alarm and sends an email notification if the SCC detects that either NFSv2 or NFSv4 is in use. The SteelHead only supports NFS 3.0 and passes through all other versions.
By default, this alarm is enabled.
Optimization Service
•  Internal Error - Enables an alarm and sends an email notification if the RiOS optimization service encounters a condition that can degrade optimization performance. By default, this alarm is enabled.
•  Service Status - Enables an alarm and sends an email notification if the RiOS optimization service encounters a service condition. By default, this alarm is enabled. The message indicates the reason for the condition.
•  Unexpected Halt - Enables an alarm and sends an email notification if the RiOS optimization service halts due to a serious software error. By default, this alarm is enabled.
Process Dump Creation Error
Enables an alarm and sends an email notification if the system detects an error while trying to create a process dump. This alarm indicates an abnormal condition where RiOS cannot collect the core file after three retries. It can be caused when the /var directory is reaching capacity or other conditions. When the alarm is raised, the directory is blacklisted.
By default, this alarm is enabled.
Proxy File Service
Enables an alarm when there has been a PFS operation or configuration error.
•  Configuration - Indicates that a configuration attempt has failed. If the system detects a configuration failure, attempt the configuration again.
•  Operation - Indicates that a synchronization operation has failed. If the system detects an operation failure, attempt the operation again.
Riverbed Service Platform
(Appears when RSP is installed.) Enables an alarm and sends an email notification for general RSP problems including:
•  RSP General Alarm:
–  No memory for RSP is available.
–  An incompatible RSP image is installed.
–  Virtual machines are enabled but not currently powered on.
–  A watchdog activates for any slot that has a watchdog configured.
•  RSP License Expired Alarm - Enables an alarm if an RSP license has expired.
•  RSP License Expiring Alarm - Enables an alarm if an RSP license is due to expire within seven days.
•  RSP Service Alarm - Enables an alarm when RSP is not running.
By default, this alarm is enabled.
Proxy File Service
Indicates that there has been a PFS operation or configuration error:
•  Proxy File Service Configuration - Indicates that a configuration attempt has failed. If the system detects a configuration failure, attempt the configuration again.
•  Proxy File Service Operation - Indicates that a synchronization operation has failed. If the system detects an operation failure, attempt the operation again.
By default, this alarm is enabled.
Riverbed Service Platform
(Appears when RSP is installed.) Enables an alarm and sends an email notification for general RSP problems including:
•  RSP General Alarm:
–  No memory for RSP is available.
–  An incompatible RSP image is installed.
–  Virtual machines are enabled but not currently powered on.
–  A watchdog activates for any slot that has a watchdog configured.
•  RSP License Expiring - Enables an alarm if an RSP license is due to expire within seven days.
•  RSP License Expired - Enables an alarm if an RSP license has expired.
•  RSP Service Alarm - Enables an alarm when RSP is not running.
By default, this alarm is enabled.
Secure Vault
Enables an alarm and sends an email notification if the system encounters a problem with the secure vault:
•  Secure Vault Locked - Indicates that the secure vault is locked. To optimize SSL connections or to use RiOS data store encryption, the secure vault must be unlocked.
•  Secure Vault New Password Recommended - Indicates that the secure vault requires a new, nondefault password. Re-enter the password.
•  Secure Vault Not Initialized - Indicates that an error has occurred while initializing the secure vault. When the vault is locked, SSL traffic is not optimized and you cannot encrypt the RiOS data store.
Software Version Mismatch
Enables an alarm if there is a mismatch between software versions in the Riverbed system.
By default, this alarm is enabled.
SSL
Enables an alarm if an error is detected in your SSL configuration.
•  Non-443 SSL Servers - Indicates that during a RiOS upgrade (for example, from 5.5 to 6.0), the system has detected a preexisting SSL server certificate configuration on a port other than the default SSL port 443. SSL traffic cannot be optimized. To restore SSL optimization, you can add an in-path rule to the client-side SCC to intercept the connection and optimize the SSL traffic on the nondefault SSL server port.
After adding an in-path rule, you must clear this alarm manually by entering the following CLI command:
stats alarm non_443_ssl_servers_detected_on_upgrade clear
 
•  SSL Certificates Error (SSL CAs) - Indicates that an SSL peering certificate has failed to reenroll automatically within the Simple Certificate Enrollment Protocol (SCEP) polling interval.
•  SSL Certificates Error (SSL Peering CAs) - Indicates that an SSL peering certificate has failed to reenroll automatically within the Simple Certificate Enrollment Protocol (SCEP) polling interval.
•  SSL Certificates Expiring - Indicates that an SSL certificate is about to expire.
•  SSL Certificates SCEP - Indicates that an SSL certificate has failed to reenroll automatically within the SCEP polling interval.
By default, this alarm is enabled.
Storage Profile Switch Failed
Enables an alarm when an error occurs while repartitioning the disk drives during a storage profile switch. A profile switch changes the disk space allocation on the drives, clears the SteelFusion and VSP data stores, and repartitions the data stores to the appropriate sizes.
By default, this alarm is enabled.
System Detail Report
Enables an alarm if a system component has encountered a problem.
By default, this alarm is disabled (RiOS 7.0.3 and later).
Temperature
•  Critical Temperature - Enables an alarm and sends an email notification if the CPU temperature exceeds the rising threshold. When the CPU returns to the reset threshold, the critical alarm is cleared. The default value for the rising threshold temperature is 70º C; the default reset threshold temperature is 67º C.
•  Warning Temperature - Enables an alarm and sends an email notification if the CPU temperature approaches the rising threshold. When the CPU returns to the reset threshold, the warning alarm is cleared.
•  Rising Threshold - Specifies the rising threshold. The alarm activates when the temperature exceeds the rising threshold. The default value is 70º C.
•  Reset Threshold - Specifies the reset threshold. The alarm clears when the temperature falls below the reset threshold. The default value is 67º C.
After the alarm triggers, it cannot trigger again until after the temperature falls below the reset threshold and then exceeds the rising threshold again.
Virtual Services Platform
Enables an alarm and sends an email notification when any child alarm activates for general VSP problems including:
•  ESXi Communication Failed - Enables an alarm that triggers if RiOS cannot communicate with ESXi or the ESXi password is not synchronized with RiOS. This alarm is enabled by default. The polling interval is 10 seconds.
•  ESXi Disk Creation Failed - Enables an alarm that triggers if the ESXi disk creation fails during the VSP setup. This is a critical alarm that is enabled by default. The polling interval is 10 seconds.
•  ESXi Initial Config Failed - Enables an alarm that triggers if the ESXi initial configuration fails. This is a critical alarm that is enabled by default.
•  ESXi License - Enables an alarm and sends an email notification if the ESXi license is removed, is about to expire, has expired, or is a trial version.
–  ESXi License Expired - Enables an alarm when the ESXi license has expired.
–  ESXi License Expiring - Enables an alarm when the ESXi license is going to expire within two weeks.
–  ESXi Using Trial License - Enables an alarm when ESXi is using a trial license.
•  ESXi Memory Overcommitted - Enables an alarm that triggers if the total memory assigned to powered-on VMs is more than the total memory available to ESXi for the VMs. This alarm is enabled by default. The polling interval is 30 minutes.
•  ESXi Not Set Up - Enables an alarm that triggers if this is a freshly installed appliance and ESXi has not yet been set up. Complete the initial configuration wizard to enable VSP for the first time. The alarm clears after ESXi installation begins. This alarm is enabled by default. The polling interval is 10 seconds.
•  ESXi Version Unsupported - Enables an alarm that triggers if the appliance does not support the running ESXi version. This alarm is enabled by default. The polling interval is 10 seconds.
•  ESXi vSwitch MTU larger than 1500 - Enables an alarm that triggers if a vSwitch with an uplink or a vmknic interface is configured with the maximum transmission unit (MTU) larger than 1500. Jumbo frames larger than 1500 are not supported. This alarm is enabled by default. The polling interval is 10 seconds.
•  Virtual CPU Utilization - Enables an alarm that triggers if the CPU utilization for the virtualization cores has exceeded an acceptable threshold. CPU utilization is sampled only for the physical CPU cores or cores available for virtualization, not for the CPU cores used by RiOS. By default, this alarm is disabled. The polling interval is 15 seconds.
•  VSP General Alarm - Enables an alarm when VSP has encountered a problem.
•  VSP Service Alarm - Enables an alarm when VSP service has encountered a problem.
•  VSP Service Not Running - Enables an alarm when the virtualization service is not running. The email notification indicates whether the process stopped because the VSP service was disabled, restarted, or crashed. This is a critical alarm that is enabled by default. The polling interval is 10 seconds.
•  VSP Unsupported VM Count - Enables an alarm when the number of virtual machines powered on exceeds 5. This alarm is enabled by default. The polling interval is 30 minutes.
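As an illustration of two of the conditions listed above (ESXi Memory Overcommitted and VSP Unsupported VM Count), the following Python sketch evaluates both checks against a list of virtual machines. The function name, the tuple layout, and the memory figures are hypothetical; only the limit of 5 powered-on virtual machines comes from this page.

```python
MAX_SUPPORTED_POWERED_ON_VMS = 5  # from the VSP Unsupported VM Count entry


def vsp_checks(vms, esxi_vm_memory_mb):
    """Evaluate two VSP conditions.

    vms is a list of (name, memory_mb, powered_on) tuples.
    esxi_vm_memory_mb is the memory available to ESXi for the VMs.
    """
    powered_on = [vm for vm in vms if vm[2]]
    assigned_mb = sum(memory_mb for _, memory_mb, _ in powered_on)
    return {
        "memory_overcommitted": assigned_mb > esxi_vm_memory_mb,
        "unsupported_vm_count": len(powered_on) > MAX_SUPPORTED_POWERED_ON_VMS,
    }


# Hypothetical inventory: two powered-on VMs and one powered-off VM.
vms = [("vm1", 4096, True), ("vm2", 8192, True), ("vm3", 8192, False)]
print(vsp_checks(vms, esxi_vm_memory_mb=8192))
# {'memory_overcommitted': True, 'unsupported_vm_count': False}
```

Only powered-on virtual machines count toward either check, so vm3 is ignored in this example.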
Virtual Services Platform
Enables an alarm and sends an email notification when any child alarm activates for general VSP problems including:
•  VSP General Alarm - Enables an alarm when the virtualization service general alarm is not running. This is a critical alarm that is enabled by default.
•  VSP Service Alarm - Enables an alarm when virtualization service alarm is not running. This alarm is enabled by default.