Accessing the SteelHead enterprise MIB
The SteelHead enterprise MIB monitors device status and peers. It provides network statistics for seamless integration into network management systems such as Hewlett Packard OpenView Network Node Manager, PRTG, and other SNMP browser tools.
For details on configuring and using these network monitoring tools, consult their product documentation.
The following guidelines describe how to download and access the SteelHead enterprise MIB using common MIB browsing utilities:
You can download the SteelHead enterprise MIB file (STEELHEAD-MIB.txt) from the Help page of the Management Console or from the Riverbed Support site at https://support.riverbed.com and load it into any MIB browser utility.
Some utilities might expect a file type other than a text file. If this occurs, change the file extension to the type required by the utility you have chosen.
Some utilities assume that the root is mib-2 by default. If the utility sees a new node, such as enterprises, it might look under mib-2.enterprises. If this occurs, use .iso.org.dod.internet.private.enterprises.rbt as the root.
Some command-line browsers might not load all MIB files by default. If this occurs, find the appropriate command option to load the STEELHEAD-MIB.txt file. For example, for Net-SNMP browsers, use snmpwalk -m ALL.
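For browsers that expect a numeric root, the symbolic root above can be translated to dotted-decimal form. The following is a minimal Python sketch, assuming the standard SMI arc numbers for iso through enterprises and the Riverbed enterprise number 17163 that appears in the trap OIDs later in this reference. The function name to_numeric_oid is illustrative only and isn't part of any Riverbed or Net-SNMP tool.

```python
# Standard SMI arc numbers for the path down to "enterprises".
# rbt = 17163 is taken from the trap OIDs listed in this reference.
ARC_NUMBERS = {
    "iso": 1,
    "org": 3,
    "dod": 6,
    "internet": 1,
    "private": 4,
    "enterprises": 1,
    "rbt": 17163,
}


def to_numeric_oid(symbolic_oid: str) -> str:
    """Translate a dotted symbolic OID into dotted-decimal form.

    Numeric labels pass through unchanged; unknown labels raise KeyError.
    """
    labels = [label for label in symbolic_oid.split(".") if label]
    numbers = [
        label if label.isdigit() else str(ARC_NUMBERS[label])
        for label in labels
    ]
    return "." + ".".join(numbers)


print(to_numeric_oid(".iso.org.dod.internet.private.enterprises.rbt"))
# prints .1.3.6.1.4.1.17163
```

Labels below rbt (such as products or steelhead) aren't in the lookup table because this reference doesn't give their numbers; resolve them from the STEELHEAD-MIB.txt file.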
Retrieving optimized traffic statistics by port
When you perform an snmpwalk on the SteelHead MIB object bwPortTable to display a table of statistics for optimized traffic by port, the command retrieves only the monitored ports. The monitored ports include the default TCP ports and any ports you add. To view the monitored ports that this object returns, choose System Settings > Monitored Ports or enter the following CLI command at the system prompt:
show stats settings bandwidth ports
To retrieve statistics for an individual port, perform an snmpget for that port, as in the following example:
.iso.org.dod.internet.private.enterprises.rbt.products.steelhead.statistics.bandwidth.bandwidthPerPort.bwPortTable.bwPortEntry.bwPortOutLan.port_number
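As an illustration of the per-port lookup, the Python sketch below builds the object path for one port and assembles a Net-SNMP snmpget command line. It assumes SNMP v2c, a hypothetical host name (steelhead.example.com), a hypothetical community string (public), and that the STEELHEAD-MIB.txt file is on the Net-SNMP MIB search path so that -m ALL can resolve the symbolic names. It also assumes port 443 is among the monitored ports.

```python
import shlex

# Symbolic path to the bwPortOutLan column, as given in this reference.
BW_PORT_OUT_LAN = (
    ".iso.org.dod.internet.private.enterprises.rbt.products.steelhead."
    "statistics.bandwidth.bandwidthPerPort.bwPortTable.bwPortEntry.bwPortOutLan"
)


def port_oid(port_number: int) -> str:
    """Return the bwPortOutLan instance path for one TCP port."""
    if not 1 <= port_number <= 65535:
        raise ValueError(f"not a valid TCP port: {port_number}")
    return f"{BW_PORT_OUT_LAN}.{port_number}"


def snmpget_command(host: str, community: str, port_number: int) -> list:
    """Assemble the snmpget argument list (SNMP v2c is an assumption)."""
    return [
        "snmpget", "-v", "2c", "-c", community, "-m", "ALL",
        host, port_oid(port_number),
    ]


# Hypothetical host and community string, shown for illustration only.
print(shlex.join(snmpget_command("steelhead.example.com", "public", 443)))
```

Passing the argument list to subprocess.run avoids shell quoting problems; the printed form is only for copying into a terminal.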
SNMP traps
Every SteelHead supports SNMP traps and email alerts for conditions that require attention or intervention. Most events, but not all, trigger an alarm and send the related trap. For most events, when the condition clears, the system clears the alarm and also sends a clear trap. The clear traps are useful in determining when an event has been resolved.
This section describes the SNMP traps. It doesn’t list the corresponding clear traps.
You can view SteelHead health at the top of each Management Console page, by entering the CLI show info command, and through SNMP (health, systemHealth).
The SteelHead tracks key hardware and software metrics and alerts you of any potential problems so that you can quickly discover and diagnose issues. The health of an appliance falls into one of the following states:
Healthy—The SteelHead is functioning and optimizing traffic.
Needs Attention—Accompanies a healthy state to indicate management-related issues not affecting the ability of the SteelHead to optimize traffic.
Degraded—The SteelHead is optimizing traffic but the system has detected an issue.
Admission Control—The SteelHead is optimizing traffic but has reached its connection limit.
Critical—The SteelHead might or might not be optimizing traffic; you must address a critical issue.
This table summarizes the SNMP traps sent from the system to configured trap receivers and their effect on the SteelHead health state. Each entry lists four fields in this order: Trap and OID, SteelHead state, Text, Description.
procCrash
(enterprises.17163.1.1.4.0.1)
Healthy
A procCrash trap signifies that a process managed by PM has crashed and left a core file. The variable sent with the notification indicates which process crashed.
A process has crashed and subsequently been restarted by the system. The trap contains the name of the process that crashed. A system snapshot associated with this crash has been created on the appliance and is accessible via the CLI or the Management Console. Support might need this information to determine the cause of the crash. No other action is required on the appliance as the crashed process is automatically restarted.
procExit
(enterprises.17163.1.1.4.0.2)
Healthy
A procExit trap signifies that a process managed by PM has exited unexpectedly, but not left a core file. The variable sent with the notification indicates which process exited.
A process has unexpectedly exited and been restarted by the system. The trap contains the name of the process. The process might have exited automatically or due to other process failures on the appliance. Review the release notes for known issues related to this process exit. If none exist, contact Support to determine the cause of this event. No other action is required on the appliance as the crashed process is automatically restarted.
cpuUtil
(enterprises.17163.1.1.4.0.3)
Degraded
The average CPU utilization in the past minute has gone above the acceptable threshold.
Average CPU utilization has exceeded an acceptable threshold. If CPU utilization spikes are frequent, it might be because the system is undersized. Sustained CPU load can be symptomatic of more serious issues. Consult the CPU Utilization report to gauge how long the system has been loaded and also monitor the amount of traffic currently going through the appliance. A one-time spike in CPU is normal but we recommend reporting extended high CPU utilization to Support. No other action is necessary as the alarm automatically clears.
pagingActivity
(enterprises.17163.1.1.4.0.4)
 
Degraded
The system has been paging excessively (thrashing).
The system is running low on memory and has begun swapping memory pages to disk. This event can be triggered during a software upgrade while the optimization service is still running but there can be other causes. If this event triggers at any other time, generate a debug sysdump and send it to Support. No other action is required as the alarm automatically clears.
smartError
(enterprises.17163.1.1.4.0.5)
 
N/A
This alarm is deprecated.
N/A
peerVersionMismatch
(enterprises.17163.1.1.4.0.6)
Degraded
Detected a peer with a mismatched software version.
The appliance has encountered another appliance that is running an incompatible version of system software. Refer to the CLI, Management Console, or the SNMP peer table to determine which appliance is causing the conflict. Connections with that peer will not be optimized; connections with other peers running compatible RiOS versions are unaffected. To resolve the problem, upgrade your system software. No other action is required as the alarm automatically clears.
bypassMode
(enterprises.17163.1.1.4.0.7)
Critical
The appliance has entered bypass (failthru) mode.
The appliance has entered bypass mode and is now passing through all traffic unoptimized. This error is generated if the optimization service locks up or crashes. It can also be generated when the system is first powered on or powered off. If this trap is generated on a system that was previously optimizing and is still running, contact Support.
raidError
(enterprises.17163.1.1.4.0.8)
Deprecated
An error has been generated by the RAID array.
A drive has failed in a RAID array. Consult the CLI or Management Console to determine the location of the failed drive. Contact Support for assistance with installing a new drive, a RAID rebuild, or drive reseating. The appliance continues to optimize during this event. After the error is corrected, the alarm automatically clears.
Applicable to models 3010, 3510, 3020, 3520, 5010, 5520, 6020, and 6120 only.
storeCorruption
(enterprises.17163.1.1.4.0.9)
Critical
The data store is corrupted.
Indicates that the RiOS data store is corrupt or has become incompatible with the current configuration. To clear the RiOS data store of data, choose Administration > Maintenance: Services, select Clear Data Store, and click Restart to restart the optimization service.
If the alarm was triggered by an unintended change to the configuration, change the configuration to match the previous RiOS data store settings. Then restart the optimization service without clearing the data store to reset the alarm.
Typical configuration changes that require an optimization restart with a clear RiOS data store are enabling enhanced peering or changing the data store encryption.
admissionMemError
(enterprises.17163.1.1.4.0.10)
Admission Control
Admission control memory alarm has been triggered.
The appliance has entered admission control due to memory consumption. The appliance is optimizing traffic beyond its rated capability and is unable to handle the amount of traffic passing through the WAN link. During this event, the appliance continues to optimize existing connections, but new connections are passed through without optimization. No other action is necessary as the alarm automatically clears when the traffic has decreased.
admissionConnError
(enterprises.17163.1.1.4.0.11)
Admission Control
Admission control connections alarm has been triggered.
The appliance has entered admission control due to the number of connections and is unable to handle the amount of connections going over the WAN link. During this event, the appliance continues to optimize existing connections, but new connections are passed through without optimization. No other action is necessary as the alarm automatically clears when the traffic has decreased.
haltError
(enterprises.17163.1.1.4.0.12)
Critical
The service is halted due to a software error.
The optimization service has halted due to a serious software error. Check whether a core dump or a system dump was created. If so, retrieve it and contact Support immediately.
serviceError
(enterprises.17163.1.1.4.0.13)
Degraded
There has been a service error. Consult the log file.
The optimization service has encountered a condition which might degrade optimization performance. Consult the system log for more information. No other action is necessary.
scheduledJobError
(enterprises.17163.1.1.4.0.14)
Healthy
A scheduled job has failed during execution.
A scheduled job on the system (for example, a software upgrade) has failed. To determine which job failed, use the CLI or the Management Console.
confModeEnter
(enterprises.17163.1.1.4.0.15)
Healthy
A user has entered configuration mode.
A user on the system has entered a configuration mode from either the CLI or the Management Console. A log in to the Management Console by user admin sends this trap as well. This is for notification purposes only; no other action is necessary.
confModeExit
(enterprises.17163.1.1.4.0.16)
Healthy
A user has exited configuration mode.
A user on the system has exited configuration mode from either the CLI or the Management Console. A log out of the Management Console by user admin sends this trap as well. This is for notification purposes only; no other action is necessary.
linkError
(enterprises.17163.1.1.4.0.17)
Degraded
An interface on the appliance has lost its link.
The system has lost one of its Ethernet links, typically due to an unplugged cable or dead switch port. Check the physical connectivity between the SteelHead and its neighbor device. Investigate this alarm as soon as possible. Depending on what link is down, the system might no longer be optimizing and a network outage could occur.
This is often caused by interface transitions on surrounding devices, such as routers or switches. This alarm also accompanies service or system restarts on the SteelHead.
nfsV2V4
(enterprises.17163.1.1.4.0.18)
Degraded
NFS v2/v4 alarm notification.
The SteelHead has detected that either NFSv2 or NFSv4 is in use. The SteelHead only supports NFSv3 and passes through all other versions. Check that the clients and servers are using NFSv3 and reconfigure if necessary.
powerSupplyError
(enterprises.17163.1.1.4.0.19)
Degraded
A power supply on the appliance has failed (not supported on all models).
A redundant power supply on the appliance has failed and needs to be replaced. Contact Support for an RMA replacement as soon as practically possible.
asymRouteError
(enterprises.17163.1.1.4.0.20)
Needs Attention
Asymmetric routes detected, certain connections might not be optimized because of this.
Asymmetric routing has been detected on the network. This is likely due to a failover event of an inner router or VPN. If so, no action needs to be taken. If not, contact Support for further troubleshooting assistance.
fanError
(enterprises.17163.1.1.4.0.21)
Degraded
A fan has failed on this appliance (not supported on all models).
A fan is failing or has failed and needs to be replaced. Contact Support for an RMA replacement as soon as practically possible.
memoryError
(enterprises.17163.1.1.4.0.22)
Degraded
A memory error has been detected on the appliance (not supported on all models).
A memory error has been detected. A system memory stick might be failing. Try reseating the memory first. If the problem persists, contact Support for an RMA replacement as soon as practically possible.
ipmi
(enterprises.17163.1.1.4.0.23)
Degraded
An IPMI event has been detected on the appliance. Check the details in the alarm report on the Web UI (not supported on all models).
An Intelligent Platform Management Interface (IPMI) event has been detected. Check the Alarm Status page for more detail. You can also view the IPMI events on the SteelHead, by entering the CLI command:
show hardware error-log all
configChange
(enterprises.17163.1.1.4.0.24)
Healthy
A change has been made to the system configuration.
A configuration change has been detected. Check the log files around the time of this trap to determine what changes were made and whether they were authorized.
datastoreWrapped
(enterprises.17163.1.1.4.0.25)
Healthy
The datastore has wrapped around.
The RiOS data store on the SteelHead went through an entire cycle and is removing data to make space for new data. This is normal behavior unless it wraps too quickly, which might indicate that the RiOS data store is undersized. If a message is received every seven days or less, investigate traffic patterns and RiOS data store sizing.
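The seven-day guideline can be checked against the arrival times of datastoreWrapped traps. The Python sketch below is one way to do so, assuming you already collect the trap timestamps at your trap receiver; the function name and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta

# Guideline from this reference: investigate if wraps arrive
# every seven days or less.
WRAP_REVIEW_INTERVAL = timedelta(days=7)


def wraps_too_quickly(trap_times) -> bool:
    """Return True if any two consecutive wraps are seven days or less apart."""
    ordered = sorted(trap_times)
    return any(
        later - earlier <= WRAP_REVIEW_INTERVAL
        for earlier, later in zip(ordered, ordered[1:])
    )


# Hypothetical trap arrival times.
print(wraps_too_quickly([datetime(2024, 1, 1), datetime(2024, 1, 5)]))
# prints True
```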
temperatureWarning
(enterprises.17163.1.1.4.0.26)
Degraded
The system temperature has exceeded the threshold.
The appliance temperature is a configurable notification. By default, this notification is set to trigger when the appliance reaches 70 degrees Celsius. Raise the alarm trigger temperature if it is normal for the SteelHead to get that hot, or reduce the temperature of the SteelHead.
temperatureCritical
(enterprises.17163.1.1.4.0.27)
Critical
The system temperature has reached a critical stage.
This trap/alarm triggers a critical state on the appliance. This alarm occurs when the appliance temperature reaches 90 degrees Celsius. The temperature value isn’t user-configurable. Reduce the appliance temperature.
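The relationship between the two temperature traps can be sketched as follows. This Python example assumes the default 70 degree warning threshold and the fixed 90 degree critical threshold described here, and it assumes each trap fires when the temperature reaches its threshold (greater than or equal to). The function name is illustrative only.

```python
DEFAULT_WARNING_CELSIUS = 70  # configurable on the appliance
CRITICAL_CELSIUS = 90         # not user-configurable


def temperature_trap(temp_celsius, warning_threshold=DEFAULT_WARNING_CELSIUS):
    """Return the trap name expected at this temperature, or None."""
    if temp_celsius >= CRITICAL_CELSIUS:
        return "temperatureCritical"
    if temp_celsius >= warning_threshold:
        return "temperatureWarning"
    return None


print(temperature_trap(72))
# prints temperatureWarning
```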
cfConnFailure
(enterprises.17163.1.1.4.0.28)
Degraded
Unable to establish connection with the specified neighbor.
The connection can’t be established with a connection-forwarding neighbor. This alarm automatically clears the next time all neighbors connect successfully.
cfConnLostEos
(enterprises.17163.1.1.4.0.29)
Degraded
Connection lost since end of stream was received from the specified neighbor.
The connection has been closed by the connection-forwarding neighbor. This alarm automatically clears the next time all neighbors connect successfully.
cfConnLostErr
(enterprises.17163.1.1.4.0.30)
Degraded
Connection lost due to an error communicating with the specified neighbor.
The connection has been lost with the connection-forwarding neighbor due to an error. This alarm automatically clears the next time all neighbors connect successfully.
cfKeepaliveTimeout
(enterprises.17163.1.1.4.0.31)
Degraded
Connection lost due to lack of keep-alives from the specified neighbor.
The connection-forwarding neighbor has not responded to a keepalive message within the time-out period, indicating that the connection has been lost. This alarm automatically clears when all neighbors of the SteelHead are responding to keepalive messages within the time-out period.
cfAckTimeout
(enterprises.17163.1.1.4.0.32)
Degraded
Connection lost due to lack of ACKs from the specified neighbor.
The connection has been lost because requests have not been acknowledged by a connection-forwarding neighbor within the set time-out threshold. This alarm automatically clears the next time all neighbors receive an ACK from this neighbor and the latency of that acknowledgment is less than the set time-out threshold.
cfReadInfoTimeout
(enterprises.17163.1.1.4.0.33)
Degraded
Timeout reading info from the specified neighbor.
The SteelHead has timed out while waiting for an initialization message from the connection-forwarding neighbor. This alarm automatically clears when the SteelHead is able to read the initialization message from all of its neighbors.
cfLatencyExceeded
(enterprises.17163.1.1.4.0.34)
Degraded
Connection forwarding latency with the specified neighbor has exceeded the threshold.
The amount of latency between connection-forwarding neighbors has exceeded the specified threshold. The alarm automatically clears when the latency falls below the specified threshold.
sslPeeringSCEPAutoReenrollError
(enterprises.17163.1.1.4.0.35)
Needs Attention
There is an error in the automatic re-enrollment of the SSL peering certificate.
An SSL peering certificate has failed to re-enroll with the Simple Certificate Enrollment Protocol (SCEP).
crlError
(enterprises.17163.1.1.4.0.36)
Needs Attention
CRL polling fails.
The polling for SSL peering CAs has failed to update the Certificate Revocation List (CRL) within the specified polling period. This alarm automatically clears when the CRL is updated.
datastoreSyncFailure
(enterprises.17163.1.1.4.0.37)
Degraded
Data store sync has failed.
The RiOS data store synchronization between two SteelHeads has been disrupted and the RiOS data stores are no longer synchronized.
secureVaultNeedsUnlock
(enterprises.17163.1.1.4.0.38)
Needs Attention
SSL acceleration and the secure data store can’t be used until the secure vault has been unlocked.
The secure vault is locked. SSL traffic isn’t being optimized and the RiOS data store can’t be encrypted. Check the Alarm Status page for more details. The alarm clears when the secure vault is unlocked.
secureVaultNeedsRekey
(enterprises.17163.1.1.4.0.39)
Needs Attention
If you wish to use a nondefault password for the secure vault, the password must be rekeyed.
The secure vault password needs to be verified or reset. Initially, the secure vault has a default password known only to the RiOS software so the SteelHead can automatically unlock the vault during system startup.
For details, check the Alarm Status page.
The alarm clears when you verify the default password or reset the password.
secureVaultInitError
(enterprises.17163.1.1.4.0.40)
Critical
An error was detected while initializing the secure vault. Contact Support.
An error occurred while initializing the secure vault after a RiOS software version upgrade. Contact Support.
configSave
(enterprises.17163.1.1.4.0.41)
Healthy
The current appliance configuration has been saved.
A configuration has been saved either by entering the write memory CLI command or by clicking Save to Disk in the Management Console. This message is for security notification purposes only; no other action is necessary.
tcpDumpStarted
(enterprises.17163.1.1.4.0.42)
Healthy
A TCP dump has been started.
A user has started a TCP dump on the SteelHead by entering a tcpdump or tcpdump-x command from the CLI. This message is for security notification purposes only; no other action is necessary.
tcpDumpScheduled
(enterprises.17163.1.1.4.0.43)
Healthy
A TCP dump has been scheduled.
A user has started a TCP dump on the SteelHead by entering a tcpdump or tcpdump-x command with a scheduled start time from the CLI. This message is for security notification purposes only; no other action is necessary.
newUserCreated
(enterprises.17163.1.1.4.0.44)
Healthy
A new user has been created.
A new role-based management user has been created using the CLI or the Management Console. This message is for security notification purposes only; no other action is necessary.
diskError
(enterprises.17163.1.1.4.0.45)
Degraded
Disk error has been detected.
A disk error has been detected. A disk might be failing. Try reseating the disk first. If the problem persists, contact Support.
wearWarning
(enterprises.17163.1.1.4.0.46)
Degraded
Accumulated SSD write cycles passed predefined level.
Triggers on SteelHead models using Solid State Disks (SSDs).
An SSD has reached 95 percent of its write cycle limit. Contact Support.
cliUserLogin
(enterprises.17163.1.1.4.0.47)
Healthy
A user has just logged-in via CLI.
A user has logged in to the SteelHead using the command-line interface. This message is for security notification purposes only; no other action is necessary.
cliUserLogout
(enterprises.17163.1.1.4.0.48)
Healthy
A CLI user has just logged-out.
A user has logged out of the SteelHead command-line interface by entering the Quit command or pressing ^D. This message is for security notification purposes only; no other action is necessary.
webUserLogin
(enterprises.17163.1.1.4.0.49)
Healthy
A user has just logged-in via the Web UI.
A user has logged in to the SteelHead using the Management Console. This message is for security notification purposes only; no other action is necessary.
webUserLogout
(enterprises.17163.1.1.4.0.50)
Healthy
A user has just logged-out via the Web UI.
A user has logged out of the SteelHead using the Management Console. This message is for security notification purposes only; no other action is necessary.
trapTest
(enterprises.17163.1.1.4.0.51)
Healthy
Trap Test
An SNMP trap test has occurred on the SteelHead. This message is informational and no action is necessary.
admissionCpuError
(enterprises.17163.1.1.4.0.52)
 
Admission Control
Optimization service is experiencing high CPU utilization.
The appliance has entered admission control due to high CPU use. During this event, the appliance continues to optimize existing connections, but new connections are passed through without optimization. No other action is necessary as the alarm automatically clears when the CPU usage has decreased.
admissionTcpError
(enterprises.17163.1.1.4.0.53)
Admission Control
Optimization service is experiencing high TCP memory pressure.
The appliance has entered admission control due to high TCP memory use. During this event, the appliance continues to optimize existing connections, but new connections are passed through without optimization. No other action is necessary as the alarm automatically clears when the TCP memory pressure has decreased.
systemDiskFullError
(enterprises.17163.1.1.4.0.54)
Degraded
One or more system partitions is full or almost full.
The alarm clears when the system partitions fall below usage thresholds.
domainJoinError
(enterprises.17163.1.1.4.0.55)
Degraded
An attempt to join a domain failed.
An attempt to join a Windows domain has failed.
The number one cause of failing to join a domain is a significant difference in the system time on the Windows domain controller and the SteelHead. When the time on the domain controller and the SteelHead don’t match, this error message appears:
lt-kinit: krb5_get_init_creds: Clock skew too great
We recommend using NTP time synchronization to synchronize the client and server clocks. It is critical that the SteelHead time is the same as the time on the Active Directory controller. Sometimes an NTP server is down or inaccessible, in which case there can be a time difference. You can also disable NTP if it isn’t being used and manually set the time. You must also verify that the time zone is correct.
A domain join can fail when the DNS server returns an invalid IP address for the domain controller. When a DNS misconfiguration occurs during an attempt to join a domain, these error messages appear:
Failed to join domain: failed to find DC for domain <domain name>
Failed to join domain : No Logon Servers
Additionally, the domain join alarm triggers and messages similar to the following appear in the logs:
Oct 13 14:47:06 bravo-sh81 rcud[10014]: [rcud/main/.ERR] - {- -} Failed to join domain: failed to find DC for domain GEN-VCS78DOM.COM
When you encounter this error, go to the Networking > Networking: Host Settings page and verify that the DNS settings are correct.
To verify the time settings, go to the Administration > System Settings: Date/Time page.
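When scanning logs for domain join failures, the messages above can be sorted into the two causes described here. The Python sketch below matches only the message fragments quoted in this section; the function name and the returned labels are hypothetical.

```python
def diagnose_domain_join(message: str) -> str:
    """Classify a domain join error message as 'time', 'dns', or 'unknown'."""
    text = message.lower()
    if "clock skew too great" in text:
        # Check Administration > System Settings: Date/Time.
        return "time"
    if "failed to find dc for domain" in text or "no logon servers" in text:
        # Check Networking > Networking: Host Settings.
        return "dns"
    return "unknown"


log_line = (
    "Oct 13 14:47:06 bravo-sh81 rcud[10014]: [rcud/main/.ERR] - {- -} "
    "Failed to join domain: failed to find DC for domain GEN-VCS78DOM.COM"
)
print(diagnose_domain_join(log_line))
# prints dns
```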
certsExpiringError
(enterprises.17163.1.1.4.0.56)
Needs Attention
Some x509 certificates may be expiring.
The service has detected some x.509 certificates used for Network Administration Access to the SteelHead that are close to their expiration dates. The alarm clears when the x.509 certificates are updated.
licenseError
(enterprises.17163.1.1.4.0.57)
Critical
The main SteelHead license has expired, been removed, or become invalid.
A license on the SteelHead has been removed, has expired, or is invalid. The alarm clears when a valid license is added or updated.
hardwareError
(enterprises.17163.1.1.4.0.58)
Either Critical or Degraded, depending on the state
Hardware error detected.
Indicates that the system has detected a problem with the SteelHead hardware. These issues trigger the hardware error alarm:
the SteelHead doesn’t have enough disk, memory, CPU cores, or NIC cards to support the current configuration
the SteelHead is using a memory Dual In-line Memory Module (DIMM), a hard disk, or a NIC that isn’t qualified by Riverbed
other hardware issues
The alarm clears when you add the necessary hardware, remove the unqualified hardware, or resolve other hardware issues.
sysdetailError
(enterprises.17163.1.1.4.0.59)
Needs Attention
Error is found in System Detail Report.
A top-level module on the system detail report is in error. For details, choose Reports > Diagnostics: System Details.
admissionMapiError
(enterprises.17163.1.1.4.0.60)
Degraded
New MAPI connections will be passed through due to high connection count.
The total number of MAPI optimized connections has exceeded the maximum admission control threshold. By default, the maximum admission control threshold is 85 percent of the total maximum optimized connection count for the client-side SteelHead. The SteelHead reserves the remaining 15 percent so the MAPI admission control doesn’t affect the other protocols. The 85 percent threshold is applied only to MAPI connections.
RiOS is now passing through MAPI connections from new clients but continues to intercept and optimize MAPI connections from existing clients (including new MAPI connections from these clients).
RiOS continues optimizing non-MAPI connections from all clients.
This alarm is disabled by default.
The alarm automatically clears when the MAPI traffic has decreased; however, it can take one minute for the alarm to clear.
RiOS pre-emptively closes MAPI sessions to reduce the connection count in an attempt to bring the SteelHead out of admission control by bringing the connection count below the 85 percent threshold. RiOS closes the MAPI sessions in this order:
MAPI prepopulation connections
MAPI sessions with the largest number of connections
MAPI sessions with most idle connections
The oldest MAPI session
MAPI sessions exceeding the memory threshold
MAPI admission control can’t solve a general SteelHead Admission Control Error (enterprises.17163.1.1.4.0.11); however, it can help to prevent it from occurring.
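The 85 percent threshold is simple arithmetic on the appliance's maximum optimized connection count. The Python sketch below works through it, assuming a hypothetical appliance rated for 2000 optimized connections and assuming the result is rounded down; neither detail comes from this reference.

```python
MAPI_ADMISSION_PERCENT = 85  # default threshold described in this reference


def mapi_admission_threshold(max_optimized_connections: int,
                             percent: int = MAPI_ADMISSION_PERCENT) -> int:
    """Return the MAPI connection count at which admission control begins."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return max_optimized_connections * percent // 100


def mapi_over_threshold(mapi_connections: int,
                        max_optimized_connections: int) -> bool:
    """Return True once MAPI connections exceed the threshold."""
    return mapi_connections > mapi_admission_threshold(max_optimized_connections)


# Hypothetical appliance rated for 2000 optimized connections.
print(mapi_admission_threshold(2000))
# prints 1700
```

With these numbers, 1700 connections are available to MAPI and the remaining 300 stay reserved for other protocols.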
neighborIncompatibility
(enterprises.17163.1.1.4.0.61)
Degraded
Serial cascade misconfiguration has been detected.
Check your automatic peering configuration. Restart the optimization service to clear the alarm.
flashError
(enterprises.17163.1.1.4.0.62)
Needs Attention
Flash hardware error detected.
At times, the USB flash drive that holds the system images might become unresponsive; the SteelHead continues to function normally. When this alarm triggers, you can’t perform a software upgrade, as the system is unable to write a new upgrade image to the flash drive without first power cycling the system.
To reboot the appliance, go to the Administration > Maintenance: Reboot/Shutdown page or enter the CLI reload command to automatically power cycle the SteelHead and restore the flash drive to its proper function.
On desktop SteelHead x50 and x55 models, you must physically power cycle the appliance (push the power button or pull the power cord).
lanWanLoopError
(enterprises.17163.1.1.4.0.63)
Critical
LAN-WAN loop detected. System will not optimize new connections until this error is cleared.
A LAN-WAN network loop has been detected between the LAN and WAN interfaces on a SteelHead (Virtual Edition). This can occur when you connect the LAN and WAN virtual NICs to the same vSwitch or physical NIC. This alarm triggers when a SteelHead (Virtual Edition) starts up, and clears after you connect each LAN and WAN virtual interface to a distinct virtual switch and physical NIC (through the vSphere Networking tab) and then reboot the SteelHead (Virtual Edition).
optimizationServiceStatusError
(enterprises.17163.1.1.4.0.64)
Critical
Optimization service currently not optimizing any connections.
The optimization service has encountered an optimization service condition. The message indicates the reason for the condition:
optimization service isn’t running
This message appears after a configuration file error. For more information, review the SteelHead logs.
in-path optimization isn’t enabled
This message appears if an in-path setting is disabled for an in-path SteelHead. For more information, review the SteelHead logs.
optimization service is initializing
This message appears after a reboot. The alarm clears on its own; no other action is necessary. For more information, review the SteelHead logs.
optimization service isn’t optimizing
This message appears after a system crash. For more information, review the SteelHead logs.
optimization service is disabled by user
This message appears after entering the CLI command no service enable or shutting down the optimization service from the Management Console. For more information, review the SteelHead logs.
optimization service is restarted by user
This message appears after the optimization service is restarted from either the CLI or Management Console. You might want to review the SteelHead logs for more information.
upgradeFailure
(enterprises.17163.1.1.4.0.65)
Needs Attention
Upgrade failed and the system is running the previous image.
A RiOS upgrade has failed and the SteelHead is running the previous RiOS version. Check the banner message in the Management Console to view more information. The banner message displays which upgrade failed along with the RiOS version the SteelHead has reverted to and is currently running.
Check that the upgrade image is correct for your SteelHead.
Verify that the upgrade image isn’t corrupt. You can use the MD5 checksum tool provided on the Riverbed Support site for the verification.
After you have confirmed that the image isn’t corrupt, upgrade the RiOS software again. If the upgrade continues to fail, contact Riverbed Support.
licenseExpiring
(enterprises.17163.1.1.4.0.66)
Needs Attention
One or more licensed features will expire within the next two weeks.
One or more feature licenses are scheduled to expire within two weeks. To see which licenses are about to expire, choose Administration > Maintenance: Licenses and look at the Status column.
This alarm is triggered per feature. Suppose you installed two license keys for a feature, LK1-FOO-xxx, which is going to expire in two weeks, and LK1-FOO-yyy, which isn’t expired. Because one license for the feature is valid, the alarm doesn’t trigger.
licenseExpired
(enterprises.17163.1.1.4.0.67)
Degraded
One or more licensed features have expired.
One or more feature licenses have expired. Choose Administration > Maintenance: Licenses and look at the Status column to see which licenses have expired.
This alarm is triggered per feature. Suppose you installed two license keys for a feature, LK1-FOO-xxx (expired), and LK1-FOO-yyy (not expired). Because one license for the feature is valid, the alarm doesn’t trigger.
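The per-feature rule that licenseExpiring and licenseExpired share can be sketched as follows. This is an illustration of the rule as described here, not RiOS code. The data layout (feature name paired with an expiry date, with None meaning no expiry) and the exact boundary handling are assumptions.

```python
from datetime import date, timedelta

# "Within the next two weeks", per the alarm text.
EXPIRY_WARNING = timedelta(weeks=2)


def feature_alarms(licenses, today):
    """Return (expiring_features, expired_features) as two sets.

    licenses is a list of (feature, expiry_date) pairs. An expiry_date of
    None is assumed to mean the license never expires.
    """
    by_feature = {}
    for feature, expiry in licenses:
        by_feature.setdefault(feature, []).append(expiry)

    expiring, expired = set(), set()
    for feature, expiries in by_feature.items():
        if all(e is not None and e < today for e in expiries):
            # Every license for the feature has expired.
            expired.add(feature)
        elif all(e is not None and e <= today + EXPIRY_WARNING for e in expiries):
            # No license for the feature stays valid past the warning window.
            expiring.add(feature)
        # Otherwise at least one license remains valid, so neither alarm triggers.
    return expiring, expired
```

With two keys for the same feature, one expired and one valid, the feature appears in neither set, which matches the LK1-FOO example above.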
clusterDisconnectedSHAlertError
(enterprises.17163.1.1.4.0.68)
Degraded
A cluster SteelHead has been reported as disconnected.
Choose Networking > Network Integration: Connection Forwarding and verify the configuration for both this SteelHead and the neighbor SteelHead. Verify that the neighbor is reachable from this SteelHead.
Next, check that the optimization service is running on both SteelHeads.
This error clears when the configuration is valid.
smbAlert
(enterprises.17163.1.1.4.0.69)
Needs Attention
Domain authentication alert.
The optimization service has detected a failure with domain controller communication or a delegate user.
Confirm that the SteelHead residing in the data center is properly joined to the domain by choosing Networking > Windows Domain.
To view useful debugging information in RiOS 7.0 or later, enter these CLI commands:
show protocol domain-auth test join
show alarm smb_alert
Verify that a delegate user has been added to the SteelHead and is configured with the appropriate privileges.
linkDuplex
(enterprises.17163.1.1.4.0.70)
Degraded
An interface on the appliance is in half-duplex mode
Indicates that an interface was not configured for half-duplex negotiation but has negotiated half-duplex mode. Half-duplex significantly limits the optimization service results.
Choose Networking > Networking: Base Interfaces and examine the SteelHead link configuration. Next, examine the peer switch user interface to check its link configuration. If the configuration on one side is different from the other, traffic is sent at different rates on each side, causing many collisions.
To troubleshoot, change both interfaces to automatic duplex negotiation. If the interfaces don’t support automatic duplex, configure both ends for full duplex.
linkIoErrors
(enterprises.17163.1.1.4.0.71)
Degraded
An interface on the appliance is suffering I/O errors
Indicates that the error rate on an interface has exceeded 0.1 percent while either sending or receiving packets. This threshold is based on the observation that even a small link error rate reduces TCP throughput significantly. A properly configured LAN connection should experience few errors. The alarm clears when the error rate drops below 0.05 percent.
To troubleshoot, try a new cable and a different switch port. Another possible cause is electromagnetic noise nearby.
You can change the default alarm thresholds by entering the alarm link_errors err-threshold xxxxxx CLI command at the system prompt. For details, see the Riverbed Command-Line Interface Reference Manual.
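The trigger and clear thresholds for linkIoErrors form a hysteresis band: the alarm raises above 0.1 percent and clears only below 0.05 percent. The following sketch illustrates that behavior with the default values stated above. The function name and the packet-counter inputs are assumptions, and the actual sampling method RiOS uses is not described here.

```python
TRIGGER_RATE = 0.001  # 0.1 percent: the alarm raises above this error rate
CLEAR_RATE = 0.0005   # 0.05 percent: the alarm clears below this error rate


def update_link_alarm(alarm_active, error_packets, total_packets):
    """Return the new alarm state for one direction (send or receive).

    Between the two thresholds the previous state is kept, which prevents
    the alarm from flapping when the error rate hovers near one value.
    """
    if total_packets == 0:
        # No traffic in this interval, so there is nothing to evaluate.
        return alarm_active
    rate = error_packets / total_packets
    if not alarm_active and rate > TRIGGER_RATE:
        return True
    if alarm_active and rate < CLEAR_RATE:
        return False
    return alarm_active
```

Because the alarm applies while either sending or receiving, a caller would evaluate each direction separately and treat the interface as alarmed if either direction is active.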
storageProfSwitchFailed
(enterprises.17163.1.1.4.0.73)
Either Critical or Needs Attention, depending on the state
Storage profile switch has failed.
An error has occurred while repartitioning the disk drives during a storage profile switch. A profile switch changes the disk space allocation on the drives, clears the SteelFusion and VSP data stores, and repartitions the data stores to the appropriate sizes.
You switch a storage profile by entering the disk-config layout CLI command at the system prompt or by choosing Administration > System Settings: Disk Management on an EX or EX+SteelFusion SteelHead and selecting a storage profile.
A profile switch can fail for these reasons:
RiOS can’t validate the profile.
The profile contains an invalid upgrade or downgrade.
RiOS can’t clean up the existing VMDKs. During cleanup, RiOS uninstalls all slots and deletes all backups and packages.
When you encounter this error, switch the storage profile again. If the switch succeeds, the error clears. If it fails, RiOS reverts the SteelHead to the previous storage profile.
If RiOS is unable to revert the SteelHead to the previous storage profile, the alarm status becomes critical.
If RiOS successfully reverts the SteelHead to the previous storage profile, the alarm status becomes needs attention.
clusterIpv6IncompatiblePeerError
(enterprises.17163.1.1.4.0.74)
Degraded
A cluster SteelHead has been reported as IPv6 incompatible.
The optimization service has encountered a peer SteelHead IPv6 incompatibility. The message indicates the reason for the condition:
Not all local inpath interfaces configured for IPv6
This message indicates that the peer SteelHead is IPv6 capable and its IP address configuration is correct, but the IP address configuration on the local SteelHead doesn’t match the configuration on the peer SteelHead. The mismatch means that there’s at least one relay on the local appliance that isn’t IPv4 or IPv6 capable. An IPv4 address is necessary for routing between neighbors and an IPv6 address is necessary for v6 optimization.
Not all peer inpath interfaces configured for IPv6
This message indicates that the local SteelHead is IPv6 capable and its IP address configuration is correct, but the IP address configuration on the peer SteelHead doesn’t match the configuration on the local SteelHead. The mismatch means that there’s at least one relay on the peer that isn’t IPv4 or IPv6 capable. An IPv4 address is necessary for routing between neighbors and an IPv6 address is necessary for v6 optimization.
Cluster IPv6 Incompatible
Indicates that a connection-forwarding neighbor is running a RiOS version that is incompatible with IPv6. Neighbors must be running RiOS 8.5 or later. The SteelHead neighbors pass through IPv6 connections when this alarm triggers.
flashProtectionFailed
(enterprises.17163.1.1.4.0.75)
Critical
Flash disk hasn't been backed up due to not enough free space on /var filesystem.
Indicates that the USB flash drive has not been backed up because there isn’t enough available space in the /var filesystem.
Examine the /var directory to see if it is storing an excessive number of snapshots, system dumps, or TCP dumps that you could delete. You could also delete any RiOS images that you no longer use.
datastoreNeedClean
(enterprises.17163.1.1.4.0.76)
Critical
The data store needs to be cleaned.
You need to clear the RiOS data store. To clear the data store, choose Administration > Maintenance: Services and select the Clear Data Store check box before restarting the appliance.
Clearing the data store degrades performance until the system repopulates the data.
pathSelectionPathDown
(enterprises.17163.1.1.4.0.77)
Degraded
Path Selection - A path has gone down.
Indicates that one of the predefined paths for a connection is unavailable because it has exceeded either the timeout value for path latency or the threshold for observed packet loss.
When a path fails, the SteelHead directs traffic through another available path. When the original path comes back up, the SteelHead redirects the traffic back to it.
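The failover and fallback behavior described for pathSelectionPathDown can be modeled roughly as follows. This is a sketch under stated assumptions: the PathProbe fields, the threshold parameters, and the preference ordering are hypothetical, and returning None when no path is available is a placeholder rather than a description of what the SteelHead does in that case.

```python
from dataclasses import dataclass


@dataclass
class PathProbe:
    name: str
    latency_ms: float    # most recent observed path latency
    loss_percent: float  # most recent observed packet loss


def path_is_up(probe, latency_timeout_ms, loss_threshold_percent):
    """A path counts as down when it exceeds either limit."""
    return (probe.latency_ms <= latency_timeout_ms
            and probe.loss_percent <= loss_threshold_percent)


def select_path(probes, latency_timeout_ms, loss_threshold_percent):
    """Return the name of the first available path, in order of preference.

    Re-evaluating on every probe cycle gives the fallback behavior: once the
    preferred path recovers, it is selected again.
    """
    for probe in probes:
        if path_is_up(probe, latency_timeout_ms, loss_threshold_percent):
            return probe.name
    return None
```

Calling select_path with the preferred path listed first returns the backup path while the preferred one exceeds a limit, and returns the preferred path again after its probes recover.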
clusterNeighborIncompatibleError
(enterprises.17163.1.1.4.0.80)
Degraded
At least one node in the cluster is incompatible.
The optimization service has encountered a neighbor incompatibility. The message indicates one of these conditions:
A cluster neighbor is running a RiOS version that doesn’t support the connection between neighbors. Neighbors must be running RiOS 8.6.x or later.
A connection-forwarding neighbor in a SteelHead Interceptor cluster has path selection enabled while path selection isn’t enabled on another appliance in the cluster.
secureTransportControllerUnreachable
(enterprises.17163.1.1.4.0.81)
 
SteelHead cannot connect to Secure Transport controller.
Indicates that a peer SteelHead is no longer connected to the secure transport controller. The controller is a SteelHead that typically resides in the data center and manages the control channel and operations required for secure transport between SteelHead peers. The control channel between the SteelHeads uses SSL to secure the connection between the peer SteelHead and the secure transport controller.
The peer SteelHead is no longer connected to the secure transport controller for one of these reasons:
The connectivity between the peer SteelHead and the secure transport controller is lost.
The SSL for the connection isn’t configured correctly.
secureTransportRegistrationFailed
(enterprises.17163.1.1.4.0.82)
 
SteelHead cannot register with Secure Transport controller.
Indicates that the peer SteelHead isn’t registered with the secure transport controller and the controller doesn’t recognize it as a member of the secure transport group.
pathSelectionPathProbingError
(enterprises.17163.1.1.4.0.83)
Needs Attention
Path Selection - At least one path has probing error.
Indicates that a path selection monitoring probe for a predefined path has received a probe response from an unexpected relay or interface.
webProxyConfigAlarm
(enterprises.17163.1.1.4.0.84)
Degraded
Web Proxy Service Configuration Alarm.
Indicates that there’s a problem with the web proxy service configuration.
webProxyServiceAlarm
(enterprises.17163.1.1.4.0.85)
Degraded
Web Proxy Service Status Alarm.
Indicates that there’s a problem with the web proxy service.
portalUnReachableAlarm
(enterprises.17163.1.1.4.0.86)
Healthy
The Cloud Portal is unreachable from the SteelHead.
Check whether the Cloud Portal is reachable from the SteelHead. If the portal isn’t reachable, check the network settings on the SteelHead.
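When a trap receiver reports only a numeric OID, a small lookup table can translate it to the trap names listed in this section. The following sketch covers only the traps numbered 65 through 86 above. The function name is hypothetical, and the table is built from the OIDs shown in this section rather than from the MIB file itself, so load STEELHEAD-MIB.txt for an authoritative mapping.

```python
# iso.org.dod.internet.private.enterprises is 1.3.6.1.4.1, so
# enterprises.17163.1.1.4.0.N expands to the prefix below followed by N.
TRAP_PREFIX = "1.3.6.1.4.1.17163.1.1.4.0."

TRAP_NAMES = {
    65: "upgradeFailure",
    66: "licenseExpiring",
    67: "licenseExpired",
    68: "clusterDisconnectedSHAlertError",
    69: "smbAlert",
    70: "linkDuplex",
    71: "linkIoErrors",
    73: "storageProfSwitchFailed",
    74: "clusterIpv6IncompatiblePeerError",
    75: "flashProtectionFailed",
    76: "datastoreNeedClean",
    77: "pathSelectionPathDown",
    80: "clusterNeighborIncompatibleError",
    81: "secureTransportControllerUnreachable",
    82: "secureTransportRegistrationFailed",
    83: "pathSelectionPathProbingError",
    84: "webProxyConfigAlarm",
    85: "webProxyServiceAlarm",
    86: "portalUnReachableAlarm",
}


def trap_name(oid):
    """Return the trap name for an OID, or None if it is not in the table.

    Accepts a fully numeric OID (with or without a leading dot) or the
    "enterprises.17163..." form used in this section.
    """
    normalized = oid.strip().lstrip(".")
    if normalized.startswith("enterprises."):
        normalized = "1.3.6.1.4.1." + normalized[len("enterprises."):]
    if not normalized.startswith(TRAP_PREFIX):
        return None
    suffix = normalized[len(TRAP_PREFIX):]
    if not suffix.isdigit():
        return None
    return TRAP_NAMES.get(int(suffix))
```

For example, trap_name("enterprises.17163.1.1.4.0.65") returns "upgradeFailure", and an OID outside the table returns None.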