For quite a while I had issues with a faulty service module in a remote Cisco router. Sometimes the card would lock up and stop responding, and the only way to revive it was to reboot it with the command service-module wlan-ap 0 reset.
Normally one would simply replace the module, but that wasn’t so simple here: the card itself can’t be replaced (and the device is also EOL), and the device is in a remote location without any technical staff on-site. Since rebooting the service module resolved the issue for the moment, I looked for a way to automate that and documented it in this post.
Fortunately, it was quite easy to detect that the service module had stopped functioning, because it has its own IP address. That made it possible to define an IP SLA that pings the service module’s IP address. As I want a failure to be detected quickly, I set the frequency to 30 seconds. This is set up using the following configuration:
ip sla 1
 icmp-echo 192.168.0.2
 frequency 30
Next, instruct IOS to track this IP SLA for reachability with:
track 1 ip sla 1 reachability
Now start the SLA monitoring with:
ip sla schedule 1 life forever start-time now
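If a single lost ping should not be enough to trigger a reset, the tracking object can be damped. The following is only a minimal sketch and not part of the configuration described in this post: the delay down setting and its value of 60 seconds are assumptions that should be tuned to the situation. With a probe every 30 seconds it means the module has to stay unreachable for roughly two probes before the track goes down, at the cost of a slower reaction.

```
! Optional damping (assumption, not in the original setup):
! only report "down" after 60 seconds of continuous unreachability
track 1 ip sla 1 reachability
 delay down 60
```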
With this in place, the next step is to attach an Event Manager applet that responds when the SLA tracking detects that the service module is down (i.e. no longer reachable). This can be achieved with the following configuration:
event manager applet TRACK_WLAN-AP_DOWN
 event track 1 state down ratelimit 300
 action 1.0 syslog priority alerts msg "WLAN-IP is unreachable, attempting service module reset"
 action 2.0 cli command "enable"
 action 3.0 cli command "service-module wlan-ap 0 reset" pattern "confirm"
 action 3.1 cli command "y"
Line-by-line explanation of the applet:
- define the applet with the name TRACK_WLAN-AP_DOWN
- trigger the applet when the state of tracker 1 is down, rate-limited to once every 5 minutes
- log a message to the syslog so that it is visible that the applet responded to the event
- enter privileged mode
- send the service-module reset command and wait for the text confirm to come back (the command prompts for a confirmation)
- respond to the confirmation prompt with y
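A possible companion, shown here only as a sketch: a second applet that logs when the module becomes reachable again, so the syslog shows whether the reset actually helped. The applet name TRACK_WLAN-AP_UP and the message text are made up for this example and are not part of the configuration above.

```
! Hypothetical companion applet: log the recovery of the tracked object
event manager applet TRACK_WLAN-AP_UP
 event track 1 state up
 action 1.0 syslog priority notifications msg "WLAN-AP is reachable again"
```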
Please note that if the event manager was not used or configured before, you also need to set the user ID for event manager sessions (replace admin with an administrative user on the Cisco) with:
event manager session cli username "admin"
That’s all! With this configuration the Cisco pings the service module’s IP address every 30 seconds and monitors its status (the same approach can be used to monitor any IP address). When it goes down, the TRACK_WLAN-AP_DOWN applet is triggered.
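To illustrate the remark about monitoring any IP address, here is a sketch of the same three steps for a second target. The operation and track number 2 and the address 192.0.2.10 (taken from the documentation address range) are placeholders chosen for this example; an applet for it would then use event track 2.

```
! Hypothetical second monitor; 192.0.2.10 is a placeholder address
ip sla 2
 icmp-echo 192.0.2.10
 frequency 30
track 2 ip sla 2 reachability
ip sla schedule 2 life forever start-time now
```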
The current status of the monitoring can be checked with the commands show ip sla summary and show track, which produce output like this:
router#show ip sla summary
IPSLAs Latest Operation Summary
Codes: * active, ^ inactive, ~ pending
ID           Type        Destination       Stats       Return      Last
                                           (ms)        Code        Run
-----------------------------------------------------------------------
*1           icmp-echo   192.168.0.2       RTT=1       OK          26 seconds ago
router#show track
Track 1
  IP SLA 1 reachability
  Reachability is Up
    3 changes, last change 02:51:02
  Latest operation return code: OK
  Latest RTT (millisecs) 1
  Tracked by:
    EEM applet TRACK_WLAN-AP_DOWN
As you can see, the output of the show track command confirms that the EEM applet is tracking the status.
Besides checking the syslog, it is also possible to see when the event was triggered by using the command:
show event manager history events
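A few more show commands can help when the setup does not behave as expected. This is just a suggested checklist, assuming IP SLA operation 1 and the applet from this post; the output differs per IOS version, so none is shown here.

```
! Confirm how the probe is configured (target, frequency)
show ip sla configuration 1
! Show success and failure counters of the probe
show ip sla statistics 1
! Confirm that the applet is registered with the event manager
show event manager policy registered
```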
So, this is a simple workaround that at least gets rid of the calls at inconvenient moments to reboot the service module, though it is no real solution. So far the workaround has turned out to be pretty reliable, and it allows me to defer replacing the router until someone is on-site to do it.