Just finishing off a few things at work this week. We’ve got a few sites around the place where we have HA internet powered by two Juniper SRX100′s. The Two SRX100′s operate in a Chassis Cluster and peer with our ISP using BGP across both active/passive devices.
This script is a little Nagios check script that I wrote to hook into our in-house Nagios monitoring platform. It makes sure the chassis cluster has not failed over operating in a degraded state, and makes sure that there are two BGP peers connected.
NOTE: I was aiming for simplicity in this setup, if you’ve got a bigger environment or require instant notifications you might wish to set up snmp traps to get instant notifications.
# Bash script to check the status of a SRX cluster.
# Works by SSHing into cluster to check "show chassis cluster status" command and SNMP walking to make sure BGP peers
# are both in a connected state
clusterStatus=`ssh nagios@$clusterAddress -i $privateKey "show chassis cluster status"`
declare -i primaryCount
declare -i secondaryCount
declare -i failoverCount
declare -i activeBgpPeers
activeBgpPeers=`snmpwalk -Os -c public -v 1 $clusterAddress .22.214.171.124.126.96.36.199.1.2 | grep "INTEGER: 6" | wc -l`
primaryCount=`echo "$clusterStatus" | grep primary | wc -l`
secondaryCount=`echo "$clusterStatus" | grep secondary | wc -l`
failoverCount=`echo "$clusterStatus" | grep "Failover count: 0" | wc -l`
if [ $primaryCount -ne 2 ]
echo "No two primary redundancy groups"
if [ $secondaryCount -ne 2 ]
echo "No two secondary redundancy groups"
if [ $failoverCount -ne 2 ]
echo "SRX has fallen over on a redundancy group"
if [ $activeBgpPeers -ne 2 ]
echo "NOT 2 Active BGP Peers"
echo "OK, 2 peers. OK: Chassis Cluster status OK"