Scott O'Brien

Ramblings and resources of my online life

My experiences of managing a Cisco switch with Puppet

2012-12-08 12:19:41 +0000

One recent pet gripe of mine has been having to add a new VLAN into our datacenter for our vSphere platform.  I don’t trust my DC switches to Puppet just yet, so this is a proof of concept post about how we could use Puppet to centrally manage this configuration and push it out across our DC.

Before

We’ve got a pretty basic topology in our DC: a VSS, with the other switches being little more than layer 2 for the most part.  The dot1q trunks back to the VSS carry all VLANs from our end-of-row switches.  When we add a new VLAN in the DC to trunk to the ESX machines, we add the VLAN on all the DC switches (we’re not running VTP), then add the VLAN to the trunk on each port patched to the ESX hosts.  (We’re not using any link aggregation on the ports connected to the same ESX host; the ESX hosts have their own load balancing method.  If you know any arguments for or against doing it like this, please comment and let me know.)
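To give a feel for how repetitive the manual process is, here is a rough Python sketch that prints the IOS commands needed to add one VLAN on a switch and allow it on each ESX-facing trunk port.  The switch names and ports are the ones from the lab further down; the VLAN number, the VLAN name and the `vlan_rollout` helper are made up for illustration.

```python
def vlan_rollout(vlan_id, vlan_name, esx_ports):
    """Return the IOS config lines that create a VLAN and allow it on each ESX trunk port."""
    lines = ["vlan %d" % vlan_id, " name %s" % vlan_name]
    for port in esx_ports:
        lines.append("interface %s" % port)
        lines.append(" switchport trunk allowed vlan add %d" % vlan_id)
    return lines


# Hypothetical inventory: which ports on each switch are patched to ESX hosts.
switches = {
    "dc_sw1": ["Ethernet0/2", "Ethernet0/3"],
    "dc_sw2": ["Ethernet1/0", "Ethernet1/1", "Ethernet1/2", "Ethernet1/3"],
}

if __name__ == "__main__":
    for switch, ports in sorted(switches.items()):
        print("! %s" % switch)
        # VLAN 10 / "esx-new" are example values only.
        print("\n".join(vlan_rollout(10, "esx-new", ports)))
```

Every new VLAN means pasting a block like this into every switch by hand, which is exactly the chore I would like to hand over to a tool.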

Setting up the Puppet lab

Network device management was introduced in Puppet 2.7.  It is more or less an expect script that manages interfaces and VLANs on IOS devices.  For this lab, we will be using Cisco IOU with the following topology

Setting up the Devices

Ideally, you would have a few puppet nodes that each manage a few devices to spread out the load.  For the purposes of this exercise, I created a single VM running CentOS 6 with both puppet-server and puppet installed.  For this machine to manage the switches, I added the following to the device.conf file:

[dc_sw1]
  type cisco
  url telnet://puppet:[email protected]/
[dc_sw2]
  type cisco
  url telnet://puppet:[email protected]/
[dc_sw3]
  type cisco
  url telnet://puppet:[email protected]/
[dc_sw4]
  type cisco
  url telnet://puppet:[email protected]/
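With more than a handful of switches I would not want to type these stanzas by hand.  A minimal sketch of generating them in Python, following the same three-line layout as above; the `device_stanza` helper, the password and the 192.0.2.x addresses are placeholders I made up, not values from my lab.

```python
def device_stanza(name, user, password, host):
    """Render one device.conf section in the same layout as the entries above."""
    return "[%s]\n  type cisco\n  url telnet://%s:%s@%s/\n" % (name, user, password, host)


# Hypothetical inventory: device name -> management address (documentation range).
inventory = {
    "dc_sw1": "192.0.2.11",
    "dc_sw2": "192.0.2.12",
}

if __name__ == "__main__":
    for name, host in sorted(inventory.items()):
        # "puppet" / "secret" are example credentials only.
        print(device_stanza(name, "puppet", "secret", host), end="")
```

Redirect the output into device.conf and every switch in the inventory gets a consistent entry.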

Signing the devices

To update the devices, you have to run puppet device.  The first time you run it, a certificate will be created that needs to be signed on the puppet master.

info: starting applying configuration to dc_sw4 at telnet://puppet:[email protected]/
info: Creating a new SSL key for dc_sw4
warning: peer certificate won't be verified in this SSL session
info: Caching certificate for ca
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
info: Creating a new SSL certificate request for dc_sw4
info: Certificate Request fingerprint (md5): E8:A6:35:9D:BF:CE:3D:BC:E0:E4:C2:5B:00:CE:9F:DB
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session

So we’ll need to sign our devices on the puppet master:

notice: Signed certificate request for dc_sw4
notice: Removing file Puppet::SSL::CertificateRequest dc_sw4 at '/var/lib/puppet/ssl/ca/requests/dc_sw4.pem'

Setting up the switches for Puppet

As the device configuration file above shows, Puppet logs into the switch with a username and password, so we need to create a local user for it on each switch (remember, it acts much like an expect script) and open up the vty lines:

line vty 0 4
 privilege level 15
 password cisco
 login local
 transport input all
!

The Configuration

So, without further ado, we can easily abstract the behaviour of these ports using Puppet syntax 🙂


define esxport($port) {

        interface {
                "${port}":
                        mode => trunk,
                        duplex => full,
                        description => "ESX Host",
                        allowed_trunk_vlans => "3,4,5,8,9"
        }

}

node "dc_sw1" {

        esxport { 'e0/2': port => 'Ethernet0/2' }
        esxport { 'e0/3': port => 'Ethernet0/3' }

}

node "dc_sw2" {

        esxport { 'e1/0': port => 'Ethernet1/0' }
        esxport { 'e1/1': port => 'Ethernet1/1' }
        esxport { 'e1/2': port => 'Ethernet1/2' }
        esxport { 'e1/3': port => 'Ethernet1/3' }

}

The Ugly

A lot of the states don’t yet seem to be supported by this module.  Even the default trunk mode of dynamic desirable causes issues when Puppet pulls device information, so you’ll have to manually specify “switchport trunk encapsulation dot1q” and “switchport mode access” on the ports before setting Puppet free on the devices.

Results

info: starting applying configuration to dc_sw4 at telnet://puppet:[email protected]/
info: Caching catalog for dc_sw4
info: Applying configuration version '1355007108'
notice: Finished catalog run in 0.20 seconds
info: starting applying configuration to dc_sw3 at telnet://puppet:[email protected]/
info: Caching catalog for dc_sw3
info: Applying configuration version '1355007108'
notice: Finished catalog run in 0.21 seconds
info: starting applying configuration to dc_sw2 at telnet://puppet:[email protected]/
info: Caching catalog for dc_sw2
info: Applying configuration version '1355007108'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/1]/Interface[Ethernet1/1]/description: defined 'description' as 'ESX Host'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/1]/Interface[Ethernet1/1]/duplex: duplex changed 'auto' to 'full'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/1]/Interface[Ethernet1/1]/mode: mode changed 'access' to 'trunk'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/1]/Interface[Ethernet1/1]/allowed_trunk_vlans: defined 'allowed_trunk_vlans' as '3,4,5,8,9'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/3]/Interface[Ethernet1/3]/description: defined 'description' as 'ESX Host'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/3]/Interface[Ethernet1/3]/duplex: duplex changed 'auto' to 'full'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/3]/Interface[Ethernet1/3]/mode: mode changed 'access' to 'trunk'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/3]/Interface[Ethernet1/3]/allowed_trunk_vlans: defined 'allowed_trunk_vlans' as '3,4,5,8,9'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/2]/Interface[Ethernet1/2]/description: defined 'description' as 'ESX Host'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/2]/Interface[Ethernet1/2]/duplex: duplex changed 'auto' to 'full'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/2]/Interface[Ethernet1/2]/mode: mode changed 'access' to 'trunk'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/2]/Interface[Ethernet1/2]/allowed_trunk_vlans: defined 'allowed_trunk_vlans' as '3,4,5,8,9'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/0]/Interface[Ethernet1/0]/description: defined 'description' as 'ESX Host'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/0]/Interface[Ethernet1/0]/duplex: duplex changed 'auto' to 'full'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/0]/Interface[Ethernet1/0]/mode: mode changed 'access' to 'trunk'
notice: /Stage[main]//Node[dc_sw2]/Esxport[e1/0]/Interface[Ethernet1/0]/allowed_trunk_vlans: defined 'allowed_trunk_vlans' as '3,4,5,8,9'
notice: Finished catalog run in 14.22 seconds

my $0.02

I must say, I’m very disappointed in this module so far.  It shows great promise and makes a once tedious task relatively effortless to manage.  However, given the time I had to invest to find out what is and is not supported, I think it’s far too early to invest in such a solution.  The idea of setting something like an expect script loose on my kit also worries me.  It’s much better to have an API, or a promise that the input/output the expect script relies on won’t change in a future release, than to have it do something unexpected (pun intended).

I guess if we were using an OS like Junos we could have created apply-groups to abstract the configuration in much the same manner, at least down to the switch level.  Still, this is a very interesting new take on managing these things.

EDIT:

I’ve been thinking about this a lot since I posted it.  I think I was too harsh on the tool.  It seems even Cisco’s own tools work by SSHing into the box to make their changes.  While not ideal, for the old IOS devices still around it seems to be the accepted thing to do.  It’s exciting times ahead in this space though, I can feel it!

Address Book For the Blind

2012-11-09 00:32:51 +0000

This article is about using the Asterisk PBX and Google’s voice recognition API (built for voice search in Chrome) to build an address book that technology-inept people (like my grandmother) can use to place cheap telephone calls over VoIP.

This tool is built for my grandmother, a lady whose macular degeneration makes her legally blind.  She doesn’t want to invest a great deal of money in this solution or face much of a learning curve; it took long enough to get her using the two-button audio book solution on the iPod.

The basic idea is to purchase a direct inward dialing (DID) number and program it into her speed dial.  This connects to an Asterisk virtual machine that launches the voice recognition to listen for who she wants to dial, looks up the number in her address book, then connects the call, all at a rate of about $0.11 for a national call (actually saving her money!).

I’m using the Speech Recognition for Asterisk module to convert the spoken recording to a string.  The problem with using this library for an address book is that it can return different words that are pronounced very similarly to the names in the book; for instance, trying to find the name ‘Elmer’ returned ‘Alma’.  With a bit of Python scripting to fuzzy match by sound, we can compare what the speech recognition engine returned with what we actually have in the address book.  Note that this method only works for a small set of distinct names in an address book.  That suits my grandmother fine, and I plan to script the server so I can always make that magic button (the DID speed dial) redirect the call to anywhere I like.

>>> dmetaphone = fuzzy.DMetaphone(3)
>>> dmetaphone('Alma')
['ALM', None]
>>> dmetaphone('Elmer')
['ALM', None]
>>> dmetaphone('Toss')
['TS', None]
>>> dmetaphone('Scott')
['SKT', None]
>>> dmetaphone('Troy')
['TR', None]
>>> dmetaphone('Sue')
['S', None]
>>> dmetaphone('Janet')
['JNT', 'ANT']
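If you want to play with the idea without installing the fuzzy module, here is a small pure-Python sketch of the same lookup pattern.  To be clear, `crude_phonetic_key` is a toy stand-in that I made up for illustration and is NOT Double Metaphone (it just normalises a leading vowel, drops the other vowels and truncates), and the names and numbers are placeholders.  What it does show is how names that sound alike collapse onto the same key, so the lookup becomes a plain dictionary hit.

```python
VOWELS = set("AEIOUY")


def crude_phonetic_key(word, max_len=3):
    """Toy phonetic code (not Double Metaphone): leading vowel -> 'A', other vowels dropped."""
    letters = [c for c in word.upper() if c.isalpha()]
    key = []
    for i, c in enumerate(letters):
        if c in VOWELS:
            if i != 0:
                continue  # vowels after the first letter carry little information
            c = "A"       # any leading vowel is treated the same
        if key and key[-1] == c:
            continue      # collapse repeated consonants, e.g. the 'tt' in 'Scott'
        key.append(c)
    return "".join(key)[:max_len]


def build_index(address_book, key_func=crude_phonetic_key):
    """Map phonetic key -> list of (name, number) entries sharing that key."""
    index = {}
    for name, number in address_book.items():
        index.setdefault(key_func(name), []).append((name, number))
    return index


def lookup(index, heard, key_func=crude_phonetic_key):
    """Return the address book entries whose key matches what was heard."""
    return index.get(key_func(heard), [])


if __name__ == "__main__":
    # Placeholder names and numbers, not a real address book.
    index = build_index({"elmer": "5550001", "susan": "5550002"})
    print(lookup(index, "Alma"))
```

Returning a list rather than a single entry makes collisions visible: if two names in the book share a key, the caller can ask again instead of silently dialing the wrong person.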

So, to execute the plan, I’m using an account on MyNetFone.  The extensions.conf has a script to answer the call on my purchased DID, prompt for who she wants to talk to, run the answer through speech recognition, then look up the number in her address book with a Python script and call it.  The extensions.conf that makes this magic happen looks like this:

; Phone call is made to the service, welcome user and get their speech
exten = 0920xxxx,1,Answer()
exten = 0920xxxx,n,Wait(2)
exten = 0920xxxx,n,agi(googletts.agi, "Hello Miss Duckworth", en)
exten = 0920xxxx,n,agi(googletts.agi, "I am here to attempt to make your social life easier.", en)
exten = 0920xxxx,n(intro),agi(googletts.agi, "Who would you like to talk to?", en)
exten = 0920xxxx,n,agi(speech-recog.agi, en-US)

; Look up number in address book, write debug information then go to the required section
exten = 0920xxxx,n,agi(address-book.agi,${utterance})
exten = 0920xxxx,n,Verbose(1,The text you just said is: ${utterance})
exten = 0920xxxx,n,Verbose(1,The probability to be right is: ${confidence})
exten = 0920xxxx,n,Verbose(1, ${foundname})
exten = 0920xxxx,n,Verbose(1, ${todial})
exten = 0920xxxx,n,GotoIf($["${foundname}" = "1"]?success:fail)

; Name not found in the address book
exten = 0920xxxx,n(fail),agi(googletts.agi,"Sorry, I could not find the person ${utterance} in the address book", en)
exten = 0920xxxx,n,agi(googletts.agi, "Please feel free to try again",en)
exten = 0920xxxx,n,Goto(intro)

; Connect the user through the MyNetFone service.
exten = 0920xxxx,n(success),agi(googletts.agi, "Please hold while I connect you to", en)
exten = 0920xxxx,n,agi(googletts.agi, "${utterance}", en)
exten = 0920xxxx,n,Dial(SIP/MyNetFone/${todial}, 30)
exten = 0920xxxx,n,Hangup()

And the Python script, address-book.agi:


#!/usr/bin/env python
import sys
import fuzzy

addressBook = {
   'scott': '0415xxxxxx',
   'susan': '98xxxxxx',
   'support': '181'
}

dmetaphone = fuzzy.DMetaphone(3)

env = {}

# Read the AGI environment that Asterisk sends on stdin (ends at a blank line)
while 1:
   line = sys.stdin.readline().strip()

   if line == '':
      break
   # Split on the first colon only, values may contain colons themselves
   key,data = line.split(':', 1)
   if key[:4] != 'agi_':
      #skip input that doesn't begin with agi_
      sys.stderr.write("Did not work!\n");
      sys.stderr.flush()
      continue
   key = key.strip()
   data = data.strip()
   if key != '':
      env[key] = data

# Compare the phonetic code of what was heard with each name in the book
spokenWord = dmetaphone(env['agi_arg_1'])

for name in addressBook:
   if dmetaphone(name) == spokenWord:
      print('SET VARIABLE todial "%s"' % addressBook[name])
      print('SET VARIABLE foundname "1"')
      sys.exit(0)
print('SET VARIABLE foundname "0"')
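The fiddly part of any AGI script is the handshake: Asterisk writes a block of `agi_name: value` lines to the script’s stdin, ends it with a blank line, and then waits for commands such as `SET VARIABLE` on stdout.  Here is a small, testable sketch of just that part, written for Python 3 and fed from an in-memory stream so it runs without Asterisk.  The function names `parse_agi_env` and `set_variable` are my own for this example.

```python
import io


def parse_agi_env(stream):
    """Read 'agi_key: value' lines from stream until a blank line and return them as a dict."""
    env = {}
    for raw in stream:
        line = raw.strip()
        if line == "":
            break  # a blank line ends the AGI environment block
        # partition() splits on the first colon only, so values may contain colons
        key, sep, value = line.partition(":")
        if not sep or not key.startswith("agi_"):
            continue  # ignore anything that is not an AGI variable
        env[key.strip()] = value.strip()
    return env


def set_variable(name, value):
    """Format an AGI SET VARIABLE command for a channel variable."""
    return 'SET VARIABLE %s "%s"' % (name, value)


if __name__ == "__main__":
    # Hand-written sample of what Asterisk would send, for illustration only.
    sample = io.StringIO(
        "agi_request: address-book.agi\n"
        "agi_arg_1: elmer\n"
        "\n"
    )
    env = parse_agi_env(sample)
    print(set_variable("heard", env["agi_arg_1"]))
```

Keeping the parsing in its own function means the address book logic can be tested from a shell without placing a phone call every time.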

ZFS and Apple Time Machine, a perfect team

2012-07-31 12:09:51 +0000

So lately I’ve been thinking about my backup strategy on my Mac.  From previous posts you might know I’ve built my OpenIndiana ZFS file server.  Well, I just created a volume and decided to put 300GB to good use as a Time Machine target for my Mac.  There is a brilliant guide on how to do it here and I suggest you all take a look (thanks for the awesome guide, Marco).

Monitoring SRX Chassis Cluster

2012-07-09 04:44:35 +0000

Just finishing off a few things at work this week.  We’ve got a few sites around the place where we have HA internet powered by two Juniper SRX100s.  The two SRX100s operate in a chassis cluster and peer with our ISP using BGP across both active/passive devices.

Below is a little Nagios check script that I wrote to hook into our in-house Nagios monitoring platform.  It makes sure the chassis cluster has not failed over and is not operating in a degraded state, and that there are two BGP peers connected.

NOTE:  I was aiming for simplicity in this setup.  If you’ve got a bigger environment or require instant notifications, you might wish to set up SNMP traps instead.


#!/bin/bash
# Bash script to check the status of an SRX cluster.
#  Works by SSHing into the cluster to run "show chassis cluster status" and SNMP walking
#  bgpPeerState to make sure both BGP peers are established (state 6)

STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

clusterAddress=$1
privateKey=$2
clusterStatus=`ssh [email protected]$clusterAddress -i $privateKey "show chassis cluster status"`

declare -i primaryCount
declare -i secondaryCount
declare -i failoverCount
declare -i activeBgpPeers

activeBgpPeers=`snmpwalk -Os -c public -v 1 $clusterAddress .1.3.6.1.2.1.15.3.1.2 | grep "INTEGER: 6" | wc -l`
primaryCount=`echo "$clusterStatus" | grep primary | wc -l`
secondaryCount=`echo "$clusterStatus" | grep secondary | wc -l`
failoverCount=`echo "$clusterStatus" | grep "Failover count: 0" | wc -l`

if [ $primaryCount -ne 2 ]
then
        echo "No two primary redundancy groups"
        echo "$clusterStatus"
        exit $STATE_CRITICAL
fi

if [ $secondaryCount -ne 2 ]
then
        echo "No two secondary redundancy groups"
        echo "$clusterStatus"
        exit $STATE_CRITICAL
fi

if [ $failoverCount -ne 2 ]
then
        echo "SRX has fallen over on a redundancy group"
        echo "$clusterStatus"
        exit $STATE_WARNING
fi

if [ $activeBgpPeers -ne 2 ]
then
        echo "NOT 2 Active BGP Peers"
        exit $STATE_CRITICAL
fi

echo "OK, 2 peers.  OK: Chassis Cluster status OK"
echo "$clusterStatus"
exit $STATE_OK
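The decision logic in the script is easy to get subtly wrong, so it can help to pull it out into a pure function that can be tested without touching a real cluster.  A minimal Python sketch of the same checks in the same order follows.  The `evaluate` and `count_lines` names are mine, and `HEALTHY_SAMPLE` is a hand-written approximation of the command output rather than a capture from a real device.

```python
STATE_OK, STATE_WARNING, STATE_CRITICAL = 0, 1, 2

# Hand-written approximation of "show chassis cluster status" for a healthy
# two-node cluster with two redundancy groups (illustrative, not a real capture).
HEALTHY_SAMPLE = (
    "Redundancy group: 0 , Failover count: 0\n"
    "    node0   100   primary     no   no\n"
    "    node1   1     secondary   no   no\n"
    "Redundancy group: 1 , Failover count: 0\n"
    "    node0   100   primary     no   no\n"
    "    node1   1     secondary   no   no\n"
)


def count_lines(text, needle):
    """Count lines containing needle, like `grep needle | wc -l`."""
    return sum(1 for line in text.splitlines() if needle in line)


def evaluate(cluster_status, active_bgp_peers):
    """Return (nagios_exit_code, message) using the same checks as the bash script."""
    if count_lines(cluster_status, "primary") != 2:
        return STATE_CRITICAL, "No two primary redundancy groups"
    if count_lines(cluster_status, "secondary") != 2:
        return STATE_CRITICAL, "No two secondary redundancy groups"
    if count_lines(cluster_status, "Failover count: 0") != 2:
        return STATE_WARNING, "SRX has failed over on a redundancy group"
    if active_bgp_peers != 2:
        return STATE_CRITICAL, "NOT 2 Active BGP Peers"
    return STATE_OK, "OK, 2 peers.  OK: Chassis Cluster status OK"


if __name__ == "__main__":
    code, message = evaluate(HEALTHY_SAMPLE, 2)
    print(message)
    raise SystemExit(code)
```

One thing this makes obvious is that a single failover in the cluster’s history keeps the check in WARNING until the failover counters are cleared, which is what I wanted for our sites but may be too noisy for yours.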