Control | Description |
Admission Control | Enables an alarm and sends an email notification if the appliance enters admission control. When this occurs, the appliance optimizes traffic beyond its rated capability and is unable to handle the amount of traffic passing through the WAN link. During this event, the appliance continues to optimize existing connections, but new connections are passed through without optimization. • Connection Limit - Indicates the system connection limit has been reached. Additional connections are passed through unoptimized. The alarm clears when the appliance moves out of this condition. • CPU - The appliance has entered admission control due to high CPU use. During this event, the appliance continues to optimize existing connections, but new connections are passed through without optimization. The alarm clears automatically when the CPU usage has decreased. • MAPI - The total number of MAPI optimized connections has exceeded the maximum admission control threshold. By default, the maximum admission control threshold is 85 percent of the total maximum optimized connection count for the client-side appliance. The appliance reserves the remaining 15 percent so that the MAPI admission control does not affect the other protocols. The 85 percent threshold is applied only to MAPI connections. RiOS is now passing through MAPI connections from new clients but continues to intercept and optimize MAPI connections from existing clients (including new MAPI connections from these clients). RiOS continues optimizing non-MAPI connections from all clients. The alarm clears automatically when the MAPI traffic has decreased; however, it can take one minute for the alarm to clear. RiOS preemptively closes MAPI sessions to reduce the connection count in an attempt to bring the appliance out of admission control by bringing the connection count below the 85 percent threshold.
RiOS closes the MAPI sessions in this order: • MAPI prepopulation connections • MAPI sessions with the largest number of connections • MAPI sessions with most idle connections • Most recently optimized MAPI sessions or oldest MAPI session • MAPI sessions exceeding the memory threshold • Memory - The appliance has entered admission control due to memory consumption. The appliance is optimizing traffic beyond its rated capability and is unable to handle the amount of traffic passing through the WAN link. During this event, the appliance continues to optimize existing connections, but new connections are passed through without optimization. No other action is necessary; the alarm clears automatically when the traffic has decreased. • TCP - The appliance has entered admission control due to high TCP memory use. During this event, the appliance continues to optimize existing connections, but new connections are passed through without optimization. The alarm clears automatically when the TCP memory pressure has decreased. By default, this alarm is enabled. |
Application Consistent Snapshot | Enables an alarm and sends an email notification when an application-consistent snapshot failed to be committed to the Core, or a snapshot failed to complete. Application consistent snapshots are scheduled using the Core snapshot scheduler. A snapshot is application consistent if, in addition to being write-order consistent, it includes data from running applications that complete their operations and flush their buffers to disk. This error triggers when there are problems interacting with servers (ESXi or Windows). The first interaction with servers is to prepare for a snapshot (where the server gets filesystems or a VM in a consistent state), and the second is to resume after the snapshot is taken (the server can clean up, stop logging changes, and so on). Errors can also occur due to misconfigurations on either side, local issues on the servers (high load, timeouts, reboots), networking problems, and so on. By default, this alarm is enabled. |
Asymmetric Routing | Enables an alarm if asymmetric routing is detected on the network. Asymmetric routing is usually due to a failover event of an inner router or VPN. By default, this alarm is enabled. |
Blockstore | Enables an alarm and sends an email notification if the system encounters any of these issues with the Edge blockstore: • The blockstore is running out of space. • The blockstore is out of space. • The blockstore is running out of memory. • The blockstore could not read data that was already replicated to the Core. • The blockstore could not read data that is not yet replicated to the Core. • The blockstore fails to start due to disk errors or an incorrect configuration. • The Edge software version is incompatible with the blockstore version on disk. • The standby Edge software version in a high-availability appliance pair is incompatible with the active Edge software version. • On appliances with read cache solid state disks (SSDs), the read cache fails to start. • The blockstore could not save data to disk due to a media error. By default, this alarm is enabled. |
Connection Forwarding | Enables an alarm if the system detects a problem with a connection-forwarding neighbor. The connection-forwarding alarms are inclusive of all connection-forwarding neighbors. For example, if an appliance has three neighbors, the alarm triggers if any one of the neighbors is in error. In the same way, the alarm clears only when all three neighbors are no longer in error. • Cluster Neighbor Incompatible - Enables an alarm and sends an email notification if a connection-forwarding neighbor is running a RiOS version that is incompatible with IPv6, or if the IP address configuration between neighbors does not match. Neighbors must be running RiOS 8.5 or later. • Multiple Interface - Enables an alarm and sends an email notification if the connection to a SteelHead in a connection forwarding cluster is lost. • Single Interface - Enables an alarm and sends an email notification if the connection to a SteelHead connection-forwarding neighbor is lost. By default, this alarm is enabled. |
CPU Utilization | Enables an alarm and sends an email notification if the average and peak threshold for the CPU utilization is exceeded. When an alarm reaches the rising threshold, it is activated; when it reaches the lowest or reset threshold, it is reset. After an alarm is triggered, it is not triggered again until it has fallen below the reset threshold. By default, this alarm is enabled. • Rising Threshold - Specify the rising threshold. When an alarm reaches the rising threshold, it is activated. The default value is 90 percent. • Reset Threshold - Specify the reset threshold. When an alarm reaches the lowest or reset threshold, it is reset. After an alarm is triggered, it is not triggered again until it has fallen below the reset threshold. The default value is 70 percent. |
Data Store | • Corruption - Enables an alarm and sends an email notification if the RiOS data store is corrupt or has become incompatible with the current configuration. To clear the RiOS data store of data, restart the optimization service and click Clear the Data Store. If the alarm was caused by an unintended change to the configuration, the configuration can be changed to match the old data store settings again and then a service restart (without clearing) will clear the alarm. Typical configuration changes that require a restart clear are changes to the data store encryption (choose Optimization > Data Replication: Data Store) or enabling extended peer table (choose Optimization > Network Services: Peering Rules). • Data Store Clean Required - Enables an alarm and sends an email notification if you need to clear the RiOS data store. • Encryption Level Mismatch - Enables an alarm and sends an email notification if a data store error such as an encryption, header, or format error occurs. • Synchronization Error - Enables an alarm if RiOS data store synchronization has failed. The RiOS data store synchronization between two appliances has been disrupted and the RiOS data stores are no longer synchronized. By default, this alarm is enabled. |
Disk Full | Enables an alarm if the system partitions (not the RiOS data store) are full or almost full. For example, RiOS monitors the available space on /var, which is used to hold logs, statistics, system dumps, TCP dumps, and so on. By default, this alarm is enabled. |
Domain Authentication Alert | Enables an alarm when the system is either unable to communicate with the domain controller, or has detected an SMB signing error, or that delegation has failed. CIFS-signed and Encrypted-MAPI traffic is passed through without optimization. By default, this alarm is enabled. |
Domain Join Error | Enables an alarm if an attempt to join a Windows domain has failed. The most common cause of a failed domain join is a significant difference between the system time on the Windows domain controller and the system time on the appliance. A domain join can also fail when the DNS server returns an invalid IP address for the domain controller. By default, this alarm is enabled. |
Edge HA Service | Enables an alarm and sends an email notification if only one of the appliances in a high availability (HA) SteelFusion Edge pair is actively serving storage data (the active peer). The two appliances maintain a heartbeat protocol between them, so that if the active peer goes down, the standby peer can take over servicing the exports. If the standby peer goes down, the active peer continues servicing the exports after raising this alarm and sending an email that the appliance is degraded. The email contains the IP address of the peer appliance. When the appliance is degraded, after a failed peer resumes, it resynchronizes with the other peer in the HA pair to receive any data that was written since the time of the failure. After the peer receives all the written data, the HA resumes and any future writes are reflected to both peers. By default, this alarm is enabled. |
Hardware | These alarms report issues with the SteelFusion Edge RiOS node hardware. • Disk Error - Enables an alarm when one or more disks is offline. To see which disk is offline, enter the show raid diagram command from the system prompt. By default, this alarm is enabled. This alarm applies only to the SteelHead RAID Series 3000, 5000, and 6000. • Fan Error - Enables an alarm and sends an email notification if a fan is failing or has failed and needs to be replaced. By default, this alarm is enabled. • Flash Error - Enables an alarm when the system detects an error with the flash drive hardware. By default, this alarm is enabled. • IPMI - Indicates an Intelligent Platform Management Interface (IPMI) event. This alarm triggers when there has been a physical security intrusion. These events trigger this alarm: • Chassis intrusion (physical opening and closing of the appliance case) • Memory errors (correctable or uncorrectable ECC memory errors) • Hard drive faults or predictive failures • Power supply status or predictive failure By default, this alarm is enabled. • Management Disk Size Error - Enables an alarm if the size of the management disk is too small to support the virtual appliance model. • Memory Error - Enables an alarm and sends an email notification if a memory error is detected: for example, when a system memory stick fails. • Other Hardware Error - Enables an alarm if a hardware error is detected. These issues trigger the hardware error alarm: • The appliance does not have enough disk, memory, CPU cores, or NICs to support the current configuration. • The appliance is using a memory Dual In-line Memory Module (DIMM), a hard disk, or a NIC that is not qualified by Riverbed. • DIMMs are plugged into the appliance but RiOS cannot recognize them because: – a DIMM is in the wrong slot. You must plug DIMMs into the black slots first and then use the blue slots when all of the black slots are in use. —or— – a DIMM is broken and you must replace it. |
• Safety Valve: disk access exceeds response times - Enables an alarm when the appliance is experiencing increased disk access time and has started the safety valve disk bypass mechanism that switches connections into SDR-A. SDR-A performs data reduction in memory until the disk access latency falls below the safety valve activation threshold. Disk access time can exceed the safety valve activation threshold for several reasons: the appliance might be undersized for the amount of traffic it is required to optimize, a larger than usual amount of traffic is being optimized temporarily, or a disk is experiencing hardware issues such as sector errors, failing mechanicals, or RAID disk rebuilding. You configure the safety valve activation threshold and timeout using these CLI commands: datastore safety-valve threshold and datastore safety-valve timeout. For details, see the Riverbed Command-Line Interface Reference Manual. • Other hardware issues By default, this alarm is enabled. • Power Supply - Enables an alarm and sends an email notification if an inserted power supply cord does not have power, as opposed to a power supply slot with no power supply cord inserted. By default, this alarm is enabled. • RAID - Indicates an error with the RAID array (for example, missing drives, pulled drives, drive failures, and drive rebuilds). An audible alarm might also sound. To see if a disk has failed, enter the show raid diagram CLI command from the system prompt. For drive rebuilds, if a drive is removed and then reinserted, the alarm continues to be triggered until the rebuild is complete. Rebuilding a disk drive can take 4 to 6 hours. This alarm applies only to the RAID Series 3000, 5000, and 6000. • SSD Write Cycle Level Exceeded - Enables an alarm if the accumulated SSD write cycles exceed a predefined write cycle level of 95 percent on SteelHead models 7050L and 7050M. If the alarm is triggered, the administrator can swap out the disk before any problems arise.
By default, this alarm is enabled. |
Hypervisor Hardware | Enables an alarm when a problem occurs with the SteelFusion Edge Hypervisor node hardware. The hypervisor hardware affects virtualization on the appliance. These issues trigger the hypervisor hardware alarm: • Hardware Management Connection - Enables an alarm and sends an email notification when RiOS loses IP connectivity or cannot authenticate the connection to the hypervisor motherboard controller. • Hardware Management Controller Unauthenticated User - Enables an alarm and sends an email notification when RiOS does not recognize the password used to access the hardware management controller. • Memory - Enables an alarm and sends an email notification if a memory error is detected: for example, when a system memory stick fails. • Other Hardware - Enables an alarm if a hardware error is detected. These issues trigger the hardware error alarm: • The hypervisor hardware is using a memory Dual In-line Memory Module (DIMM), a hard disk, or a NIC that is not qualified. • The hypervisor hardware has detected a RiOS NIC. The hypervisor does not support RiOS NICs. • DIMMs are plugged into the hypervisor hardware but the hypervisor cannot recognize them because: – a DIMM is in the wrong slot. You must plug DIMMs into the black slots first and then use the blue slots when all of the black slots are in use. —or— – a DIMM is broken and you must replace it. • Power - Enables an alarm and sends an email notification if the hypervisor loses power unexpectedly. • Temperature - Enables an alarm and sends an email notification if a hypervisor CPU, board, or platform controller hub (PCH) temperature exceeds the rising threshold. When the CPU, board, or PCH returns to the reset threshold, the critical alarm clears (after polling for 30 seconds). If the appliance has more than one CPU, the alarm displays both CPUs. The default values are maintained by the motherboard. |
Inbound QoS WAN Bandwidth Configuration | Enables an alarm and sends an email notification if the inbound QoS WAN bandwidth for one or more of the interfaces is set incorrectly. You must configure the WAN bandwidth to be less than or equal to the interface bandwidth link rate. This alarm triggers when the system encounters one of these conditions: • An interface is connected and the WAN bandwidth is set higher than its bandwidth link rate: for example, if the bandwidth link rate is 1536 kbps, and the WAN bandwidth is set to 2000 kbps. • A nonzero WAN bandwidth is set and QoS is enabled on an interface that is disconnected; that is, the bandwidth link rate is 0. • A previously disconnected interface is reconnected, and its previously configured WAN bandwidth was set higher than the bandwidth link rate. The Management Console refreshes the alarm message to inform you that the configured WAN bandwidth is set higher than the interface bandwidth link rate. While this alarm appears, the appliance puts existing connections into the default class. The alarm clears when you configure the WAN bandwidth to be less than or equal to the bandwidth link rate or reconnect an interface configured with the correct WAN bandwidth. By default, this alarm is enabled. |
Licensing | Enables an alarm and sends an email notification if a license is removed, is about to expire, has expired, or is invalid. This alarm triggers if the appliance has no MSPEC license installed for its currently configured model. • Appliance Unlicensed - This alarm triggers if the appliance has no BASE or MSPEC license installed for its currently configured model. For details about updating licenses, see Managing licenses and model upgrades. • Autolicense Critical Event - This alarm triggers on a SteelHead (virtual edition) appliance when the Riverbed Licensing Portal cannot respond to a license request with valid licenses. The Licensing Portal cannot issue a valid license for one of these reasons: – A newer SteelHead (virtual edition) appliance is already using the token, so you cannot use it on the SteelHead (virtual edition) appliance displaying the critical alarm. Every time the SteelHead (virtual edition) appliance attempts to refetch a license token, the alarm retriggers. – The token has been redeemed too many times. Every time the SteelHead (virtual edition) appliance attempts to refetch a license token, the alarm retriggers. • Autolicense Informational Event - This alarm triggers if the Riverbed Licensing Portal has information regarding the licenses for a SteelHead (virtual edition) appliance. For example, the SteelHead (virtual edition) appliance displays this alarm when the portal returns licenses that are associated with a token that has been used on a different SteelHead (virtual edition) appliance. • Licenses Expired - This alarm triggers if one or more features has at least one license installed, but all of them are expired. • Licenses Expiring - This alarm triggers if the license for one or more features is going to expire within two weeks. Note: The licenses expiring and licenses expired alarms are triggered per feature. 
For example: if you install two license keys for a feature, LK1-FOO-xxx (expired) and LK1-FOO-yyy (not expired), the alarms do not trigger, because the feature has one valid license. By default, this alarm is enabled. |
Link Duplex | Enables an alarm and sends an email notification when an interface was not configured for half-duplex negotiation but has negotiated half-duplex mode. Half-duplex significantly limits the optimization service results. The alarm displays which interface is triggering the duplex alarm. By default, this alarm is enabled. You can enable or disable the alarm for a specific interface. To enable or disable an alarm, choose Administration > System Settings: Alarms and select or clear the check box next to the link name. |
Link I/O Errors | Enables an alarm and sends an email notification when the link error rate exceeds 0.1 percent while either sending or receiving packets. This threshold is based on the observation that even a small link error rate reduces TCP throughput significantly. A properly configured LAN connection experiences very few errors. The alarm clears when the rate drops below 0.05 percent. You can change the default alarm thresholds by entering the alarm link_io_errors err-threshold <threshold-value> command at the system prompt. For details, see the Riverbed Command-Line Interface Reference Manual. By default, this alarm is enabled. You can enable or disable the alarm for a specific interface. For example, you can disable the alarm for a link after deciding to tolerate the errors. To enable or disable an alarm, choose Administration > System Settings: Alarms and select or clear the check box next to the link name. |
Link State | Enables an alarm and sends an email notification if an Ethernet link is lost due to an unplugged cable or dead switch port. Depending on which link is down, the system might no longer be optimizing and a network outage could occur. This condition is often caused by interface transitions on surrounding devices, such as routers or switches. This alarm also accompanies service or system restarts on the appliance. For WAN/LAN interfaces, the alarm triggers if in-path support is enabled for that WAN/LAN pair. By default, this alarm is disabled. You can enable or disable the alarm for a specific interface. To enable or disable an alarm, choose Administration > System Settings: Alarms and select or clear the check box next to the link name. |
Memory Paging | Enables an alarm and sends an email notification if memory paging is detected. If 100 pages are swapped every couple of hours, the system is functioning properly. If thousands of pages are swapped every few minutes, contact Riverbed Support at https://support.riverbed.com. By default, this alarm is enabled. |
Neighbor Incompatibility | Enables an alarm if the system has encountered an error in reaching an appliance configured for connection forwarding. By default, this alarm is enabled. |
Network Bypass | Enables an alarm and sends an email notification if the system is in bypass failover mode. By default, this alarm is enabled. |
NFS V2/V4 alarm | Enables an alarm and sends an email notification if the appliance detects that either NFSv2 or NFSv4 is in use. The appliance only supports NFSv3 and passes through all other versions. By default, this alarm is enabled. |
Optimization Service | • Internal Error - Enables an alarm and sends an email notification if the RiOS optimization service encounters a condition that might degrade optimization performance. By default, this alarm is enabled. Go to the Administration > Maintenance: Services page and restart the optimization service. • Service Status - Enables an alarm and sends an email notification if the RiOS optimization service encounters a service condition. By default, this alarm is enabled. The message indicates the reason for the condition. These conditions trigger this alarm: • Configuration errors. • An appliance reboot. • A system crash. • An optimization service restart. • A user enters the no service enable command or shuts down the optimization service from the Management Console. • A user restarts the optimization service from either the Management Console or CLI. • Unexpected Halt - Enables an alarm and sends an email notification if the RiOS optimization service halts due to a serious software error. By default, this alarm is enabled. |
Outbound QoS WAN Bandwidth Configuration | Enables an alarm and sends an email notification if the outbound QoS WAN bandwidth for one or more of the interfaces is set incorrectly. You must configure the WAN bandwidth to be less than or equal to the interface bandwidth link rate. This alarm triggers when the system encounters one of these conditions: • An interface is connected and the WAN bandwidth is set higher than its bandwidth link rate: for example, if the bandwidth link rate is 100 Mbps, and the WAN bandwidth is set to 200 Mbps. • A nonzero WAN bandwidth is set and QoS is enabled on an interface that is disconnected; that is, the bandwidth link rate is 0. • A previously disconnected interface is reconnected, and its previously configured WAN bandwidth was set higher than the bandwidth link rate. The Management Console refreshes the alarm message to inform you that the configured WAN bandwidth is set higher than the interface bandwidth link rate. While this alarm appears, the system puts existing connections into the default class. The alarm clears when you configure the WAN bandwidth to be less than or equal to the bandwidth link rate or reconnect an interface configured with the correct WAN bandwidth. By default, this alarm is enabled. |
Path Selection Path Down | Enables an alarm and sends an email notification if the system detects that one of the predefined uplinks for a connection is unavailable. The uplink has exceeded either the timeout value for uplink latency or the threshold for observed packet loss. When an uplink fails, the system directs traffic through another available uplink. When the original uplink comes back up, the system redirects the traffic back to it. By default, this alarm is enabled. |
Path Selection Path Probing Error | Enables an alarm and sends an email notification if a path selection monitoring probe for a predefined uplink has received a probe response from an unexpected relay or interface. By default, this alarm is enabled. |
Process Dump Creation Error | Enables an alarm and sends an email notification if the system detects an error while trying to create a process dump. This alarm indicates an abnormal condition where RiOS cannot collect the core file after three retries. It can occur when the /var directory is reaching capacity, or under other conditions. When the alarm is raised, the directory is blacklisted. By default, this alarm is enabled. |
Proxy File Service | Enables an alarm and sends an email notification when the system detects a PFS operation or configuration error: • Proxy File Service Configuration - Indicates that a configuration attempt has failed. If the system detects a configuration failure, attempt the configuration again. • Proxy File Service Operation - Indicates that a synchronization operation has failed. If the system detects an operation failure, attempt the operation again. By default, this alarm is enabled. |
Riverbed Host Tools Version | Enables an alarm and sends an email notification when the Riverbed Hardware Snapshot Provider (RHSP) is incompatible with the Windows server version. RHSP provides snapshot capabilities by exposing the Edge through iSCSI to the Windows Server as a snapshot provider. RHSP is compatible with 64-bit editions of Microsoft Windows Server 2008 R2 or later and can be downloaded from the Riverbed Support site at https://support.riverbed.com. Note: We strongly recommend that you upgrade to the latest version of the RHSP tool (available through the Unified Installer for Riverbed Plugins) before upgrading the SteelFusion Edge software. For details, see the SteelFusion Design Guide. By default, this alarm is enabled. |
Secure Transport | Enables an alarm and sends an email notification if a peer appliance encounters a problem with the controller connection. The controller is a SteelHead that typically resides in the data center and manages the control channel and operations required for secure transport between peers. The control channel uses SSL to secure the connection between the peer appliance and the SteelHead controller. • Connection with Controller Lost - Indicates that the peer appliance is no longer connected to the SteelHead controller because: • The connectivity between the peer appliance and the SteelHead controller is lost. • The SSL for the connection is not configured correctly. • Registration with Controller Unsuccessful - Indicates that the peer appliance is not registered with the SteelHead controller, and the controller does not recognize it as a member of the secure transport group. By default, this alarm is enabled. |
Secure Vault | Enables an alarm and sends an email notification if the system encounters a problem with the secure vault: • Secure Vault Locked - Indicates that the secure vault is locked. To optimize SSL connections or to use RiOS data store encryption, the secure vault must be unlocked. Go to Administration > Security: Secure Vault and unlock the secure vault. • Secure Vault New Password Recommended - Indicates that the secure vault requires a new, nondefault password. Reenter the password. • Secure Vault Not Initialized - Indicates that an error has occurred while initializing the secure vault. When the vault is locked, SSL traffic is not optimized and you cannot encrypt the RiOS data store. For details, see Unlocking the secure vault. By default, this alarm is enabled. |
Server Backup | Enables an alarm and sends a notification if the system encounters one of these issues with server-based backups: • Failed connection to the server - Indicates that the connection between the Edge and the ESXi server or vCenter is down, the server is not running, or there are incorrect credentials for the ESXi or vCenter server login. • Backup failure on the Edge - Indicates that a backup has failed on the Edge. The alarm displays a message with the affected server. • NFS export is shared among multiple ESXi servers - Indicates that a server is sharing an export with other servers. • Server with a backup policy does not have an export - Indicates that a server with an associated backup policy does not have any VMs or exports to protect. |
Snapshot | Enables an alarm if a snapshot fails to be committed to the SAN, or a snapshot fails to complete due to Windows timing out. By default, this alarm is enabled. |
Software Compatibility | Enables an alarm and sends an email notification if the system encounters a problem with software compatibility: • Peer Mismatch - Needs Attention - Indicates that the appliance has encountered another appliance that is running an incompatible version of system software. Refer to the CLI, Management Console, or the SNMP peer table to determine which appliance is causing the conflict. Connections with that peer will not be optimized; connections with other peers running compatible RiOS versions are unaffected. • Software Version Mismatch - Degraded - Indicates that the appliance is running an incompatible version of system software. By default, this alarm is enabled. |
SSL | Enables an alarm if an error is detected in your SSL configuration. For details about checking your settings, see Configuring SSL main settings. • Non-443 SSL Servers - Indicates that during a RiOS upgrade (for example, from 8.5 to 9.0), the system has detected a preexisting SSL server certificate configuration on a port other than the default SSL port 443. SSL traffic might not be optimized. To restore SSL optimization, you can add an in-path rule to the client-side appliance to intercept the connection and optimize the SSL traffic on the nondefault SSL server port. After adding an in-path rule, you must clear this alarm manually by entering this command: stats alarm non_443_ssl_servers_detected_on_upgrade clear • SSL Certificates Error (SSL CAs) - Indicates that an SSL peering certificate has failed to reenroll automatically within the Simple Certificate Enrollment Protocol (SCEP) polling interval. • SSL Certificates Error (SSL Peering CAs) - Indicates that an SSL peering certificate has failed to reenroll automatically within the Simple Certificate Enrollment Protocol (SCEP) polling interval. • SSL Certificates Expiring - Indicates that an SSL certificate is about to expire. • SSL Certificates SCEP - Indicates that an SSL certificate has failed to reenroll automatically within the SCEP polling interval. By default, this alarm is enabled. |
SteelFusion Core | Enables an alarm if the system encounters any of these issues with the SteelFusion Core: • The Edge device has connected to a Core that does not recognize the Edge device. • The Edge does not have an active connection with the Core. • The data channel between Core and the Edge is down. • The connection between the Core and the Edge has stalled. By default, this alarm is enabled. |
SteelFusion Protocol Service | Enables an alarm and sends an email notification if an NFS protocol error is preventing a volume from being mounted from the Edge to the clients (for example, ESXi). |
Storage Volume Status | Enables an alarm and sends an email notification if the connection to the volume has failed or if any of these issues occurs: • Backend connectivity problems. • No read/write permissions. • The space threshold has been reached. • The export filesystem is corrupted or invalid. • The export metadata is corrupted or invalid. • The Edge Virtual IP is not configured. • The export filesystem resize has failed. • The export could not be mounted to the client. • Connectivity issues between Edge and Core. By default, this alarm is enabled. |
System Detail Report | Enables an alarm if a system component has encountered a problem. By default, this alarm is disabled. |
Temperature | • Critical Temperature - Enables an alarm and sends an email notification if the CPU temperature exceeds the rising threshold. When the CPU returns to the reset threshold, the critical alarm is cleared. The default value for the rising threshold temperature is 70°C; the default reset threshold temperature is 67°C. • Warning Temperature - Enables an alarm and sends an email notification if the CPU temperature approaches the rising threshold. When the CPU returns to the reset threshold, the warning alarm clears. • Rising Threshold - Specifies the rising threshold. The alarm activates when the temperature exceeds the rising threshold. The default value is 70°C. • Reset Threshold - Specifies the reset threshold. The alarm clears when the temperature falls below the reset threshold. The default value is 67°C. After the alarm triggers, it cannot trigger again until the temperature falls below the reset threshold and then exceeds the rising threshold again. By default, this alarm is enabled. |
Uncommitted Edge Data | Enables an alarm when a large amount of data in the blockstore needs to be committed to SteelFusion Core, meaning that the difference between the contents of the blockstore and the SteelFusion Core-side NFS export is significant. This alarm checks how much uncommitted data is in the Edge cache as a percentage of the total cache size. The alarm triggers when the appliance writes a large amount of data very quickly but the WAN link cannot return the data to the SteelFusion Core fast enough to keep the uncommitted data percentage below 5 percent. As long as data is being committed, the cache will flush eventually. The threshold is 5 percent, which for a 4 TiB (1260-4) system is approximately 200 GiB. To change the threshold, use this command: [failover-peer] edge id <id> blockstore uncommitted [trigger-pct <percentage>] [repeat-pct <percentage>] [repeat-interval <minutes>] For example: Core3(config) # edge id Edge2 blockstore uncommitted trigger-pct 50 repeat-pct 25 repeat-interval 5 For details on the CLI command, see the SteelFusion Command-Line Interface Reference Manual. To check that data is being committed, go to Storage > Reports: Blockstore Metrics on the Edge. By default, this alarm is enabled. |
Virtualization | • Hypervisor - Enables an alarm when a problem occurs with the hypervisor. • License - Enables an alarm when the hypervisor license has expired. • Operation - Enables an alarm when the hypervisor operation is degraded and is in lockdown mode. • Virtual Services Platform - Enables an alarm when a communication issue occurs between VSP and the hypervisor. • Connection - Enables an alarm when the hypervisor is not communicating for any of these issues: VSP is disconnected from the hypervisor; the hypervisor password is invalid; VSP was unable to gather some hardware information; VSP is disconnected. • Installation - Enables an alarm when VSP is not installed properly and is powered off for any of these issues: a hypervisor upgrade has failed; a configuration set from the installer has failed to be applied to the hypervisor; VSP could not gather enough information to set up an interface; the hypervisor is not installed. |
Web Proxy | • Web Proxy Service - Configuration - Enables an alarm when an error occurs with the web proxy configuration. • Web Proxy Service - Service Status - Enables an alarm when an error occurs with the web proxy service. By default, this alarm is enabled. |
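
Two behaviors described in the table lend themselves to a short illustration: the rising/reset threshold pair of the Temperature alarm, and the percentage threshold of the Uncommitted Edge Data alarm. The Python sketch below is only a model of the documented behavior, not the appliance's implementation. The names `ThresholdAlarm` and `uncommitted_trigger_gib` are hypothetical, and the sketch assumes the alarm activates strictly above the rising threshold and clears strictly below the reset threshold.

```python
from dataclasses import dataclass


@dataclass
class ThresholdAlarm:
    """Hypothetical model of a rising/reset threshold alarm (not RiOS code)."""
    rising: float = 70.0  # default rising threshold from the table, in degrees C
    reset: float = 67.0   # default reset threshold from the table, in degrees C
    active: bool = False

    def update(self, temp_c: float) -> bool:
        """Feed one temperature reading and return the alarm state."""
        if not self.active and temp_c > self.rising:
            # Activate only when the reading exceeds the rising threshold.
            self.active = True
        elif self.active and temp_c < self.reset:
            # Clear only after the reading falls below the reset threshold.
            self.active = False
        return self.active


def uncommitted_trigger_gib(cache_tib: float, trigger_pct: float) -> float:
    """Size in GiB of uncommitted data that corresponds to trigger_pct of the cache."""
    return cache_tib * 1024 * trigger_pct / 100


alarm = ThresholdAlarm()
states = [alarm.update(t) for t in (66, 71, 69, 68, 66.5, 69, 71)]
print(states)  # [False, True, True, True, False, False, True]

# 5 percent of a 4 TiB cache, which the table rounds to 200 GiB.
print(uncommitted_trigger_gib(4, 5))  # 204.8
```

The readings of 69 and 68 after the alarm activates do not clear it, and the reading of 69 after it clears does not reactivate it. The gap between the two thresholds keeps the alarm from toggling repeatedly when the temperature hovers near 70°C.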