High Availability WordPress LAMP Stack.

April 13th, 2012

Introduction

In one of my recent tasks at work, I was asked to eliminate single points of failure in the software and hardware stack without spending a fortune on hardware or software licenses. While working on high availability (HA), I realized that many small companies might have a similar need. With more pressing tasks, limited man hours and no single post that covers all the issues and solutions in one place, many companies and organisations leave their single points of failure in place and live with the hope that nothing fails any time soon.

I’ve wanted to write this blog post for a while. After you’ve finished reading it, you should know enough to eliminate the single points of failure in hosting a WordPress website. While I’ve chosen WordPress as the demonstration for this post, the concepts will work with any Apache/MySQL LAMP stack software. I’ll run you through it in two parts. The first part covers setting up the physical hosts and topology (using Juniper EX2200 switches, SRX100 border firewalls and free ESXi hypervisors for the software stack). The second part covers setting up the software stack to deliver our LAMP stack in a highly redundant fashion. What I won’t be doing, however, is providing complete configuration examples. Instead, please consider this post an overview that points you to more specific information to help you set this up in your own environment.

This blog post is split into two parts. This first post in the series talks about setting up the network infrastructure; the second talks about setting up the software stack.

Part 1 (physical topology)
- Network overview
- Setting up dual switches
- Setting up ESXi network cards
- Setting up SRX border firewalls

Part 2 (software stack)
- Nginx load balancing proxy
- Web servers
- NFS file server
- MySQL cluster

Part 1, Physical Topology

If you’re reading this, I’m guessing your current network topology looks something like mine used to. You have a single internet connection, a single router/firewall, a single switch and a bunch of hosts hanging off that switch. In the event of a system failure, your system administrator (me in this case) has to hop in a cab and rush to the server room to fix the problem.

The goal of this tutorial is to help your administrators sleep at night. We will eliminate every single point of failure so that, in the event of a system outage or failure, the system can recover by itself with at most a minute of unscheduled downtime.

Physical Switches

We’ll be replacing the single switch (in my case, an old unmanaged gigabit switch) with a pair of managed switches. Because our bandwidth requirements at this site weren’t terribly demanding (a simple database server, a few web and mail servers and the like), a single gigabit Ethernet link to each host was all that we required. If you’re in the same boat I was, I can suggest a pair of Juniper EX2200s. If, however, you’re going to be pumping more bandwidth-intensive applications through your network (and thus require more than a gigabit Ethernet connection to each host), or you need more than a single VLAN with inter-VLAN routing (all of which is outside the scope of this tutorial), I can strongly recommend you start looking at the EX4200 model switches set up in a virtual chassis. These can do all your highly available layer 3 IP routing and support multi-gigabit Ethernet to your hosts by spanning an aggregated Ethernet link across both physical switches.

Active/Passive Switch Design?

Ok, so this heading is a lie, but let me explain. In my switch design with the EX2200s, I’m using an 802.3ad aggregated Ethernet link (LACP, what Cisco calls EtherChannel) between my two switches. I’ve opted to use 4 physical ports on my 24 port switches (ge-0/0/20 to ge-0/0/23) to give me a 4Gbit/s backbone between them. With gigabit Ethernet right through the network this obviously isn’t much bandwidth, so the idea is to keep as little data travelling over that link as possible (hopefully network broadcasts only!).

First, the following configuration sets up the aggregated Ethernet link between the switches:

chassis {
    aggregated-devices {
        ethernet {
            device-count 1;
        }
    }
}

interfaces {
    ge-0/0/20 {
        ether-options {
            802.3ad ae0;
        }
    }
    ge-0/0/21 {
        ether-options {
            802.3ad ae0;
        }
    }
    ge-0/0/22 {
        ether-options {
            802.3ad ae0;
        }
    }
    ge-0/0/23 {
        ether-options {
            802.3ad ae0;
        }
    }
    ae0 {
        aggregated-ether-options {
            lacp {
                passive;
            }
        }
        unit 0 {
            family ethernet-switching {
                port-mode trunk;
                vlan {
                    members all;
                }
            }
        }
    }
}

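The one thing to watch out for is that at least one of the two switches must have its LACP mode set to active. If both ends are passive, neither side starts the negotiation and the aggregated link never comes up.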
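As a sketch of that one difference, the second switch could carry the same ae0 definition with only the LACP mode changed. This assumes the member ports and the unit 0 trunk settings on that switch are identical to the extract above:

```
interfaces {
    ae0 {
        aggregated-ether-options {
            lacp {
                /* this end initiates LACP; the other switch can stay passive */
                active;
            }
        }
    }
}
```

Setting both switches to active also works; the combination to avoid is passive on both ends.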

Setting up ESXi Network

In my environment, I’m using the free version of ESXi 4.1. I understand that you may wish to connect Linux hosts directly instead. If so, I recommend you look at creating a bonding adaptor. Without the more expensive EX4200s, though, we’re best sticking with an active-backup setup, which is bonding mode=1.

For the ESXi configuration, it’s pretty simple. We connect two NICs to the switches: a primary NIC to the primary switch, and a secondary NIC to the secondary switch.

Configuring the vSwitch (virtual switch) is pretty simple too. Once this setup is done, virtual machines on these ESXi hosts won’t have to do anything special; they’ll just take advantage of our HA network topology.

First, ensure that we have two NICs set up on our vSwitch.

Next, look at the properties I’ve set under NIC teaming. This basic configuration ensures that while the link on vmnic0 (connected to our active switch) is available, the vSwitch will use it. When that link becomes unavailable, it will fail over to vmnic1.

Setting up border firewalls

The last step in our highly available infrastructure is our border firewalls. Please bear with me on this section; it is the most complicated and introduces a few technologies. If there is something I don’t explain completely, please feel free to leave a comment below and I’ll try to explain it better. I expect you’ll have to jump back and forth between Wikipedia and this article to fully understand what it is we’re doing. Explaining the concepts behind BGP and autonomous systems is beyond the scope of this article.

In my role, I replaced an active/backup GNU-based firewall solution (the backup procedure being to run down to the data center as fast as possible and swap the cables over) with two Juniper SRX100s in an active/passive configuration under a chassis cluster (see the SRX100 high availability deployment guide at http://kb.juniper.net/InfoCenter/index?page=content&id=KB15669). We had to get a second internet connection installed, then make sure that any one link (or a firewall itself) can die and still have the system recover by itself. Two major things have to happen to make this gateway highly available. Firstly, our internal network hosts will be using one of these firewalls as their default gateway. If the link to the primary switch, or the primary firewall itself, should die, we still want the network hosts to be able to reach their gateway. We will achieve this with “redundant Ethernet” (reth) on the cluster. If you’ve come from the Cisco networking world, picture it as something like VRRP for the inside hosts. If the main link fails, the MAC and IP address will float over to the other physical port.

Let me give a bit more detail on our new example network topology so you can get a better idea of how these settings work.

For the internal gateway address to fail over to the other switch should your link off Firewall1 die, you’ll want to make the following configuration:

chassis {
    cluster {
        reth-count 2;
        redundancy-group 0 {
            node 0 priority 100;
            node 1 priority 1;
        }
        redundancy-group 1 {
            node 0 priority 254;
            node 1 priority 1;
            preempt;
            interface-monitor {
                fe-0/0/1 weight 255;
                fe-1/0/1 weight 255;
            }
        }
    }
}
interfaces {
    fe-0/0/1 {
        fastether-options {
            redundant-parent reth1;
        }
    }
    fe-1/0/1 {
        fastether-options {
            redundant-parent reth1;
        }
    }
}

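You can see that redundancy-group 1 is monitoring the local interfaces going back to the switches. Each carries a weight of 255, so the loss of a single monitored link is enough to fail the group over to the other node.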
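The extract above binds the two physical ports to reth1 but does not show the reth1 interface itself. A minimal sketch of what that could look like follows. The 192.168.1.1/24 gateway address is a hypothetical placeholder rather than a value from this network, so substitute your own internal gateway address:

```
interfaces {
    reth1 {
        redundant-ether-options {
            /* ties reth1 to the redundancy group monitored above */
            redundancy-group 1;
        }
        unit 0 {
            family inet {
                /* hypothetical internal gateway address */
                address 192.168.1.1/24;
            }
        }
    }
}
```

The internal hosts point their default gateway at the reth1 address, which follows whichever node is currently primary for redundancy-group 1.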

For the dual WAN links, I won’t go into much detail, but you’ll want to ask your ISP for a second internet connection. Some providers offer a cheap link that they only charge you for once you start sending data over it (sometimes called a shadow link). This is perfect, as you can send all your traffic through your primary internet connection and, if it fails, move your traffic to the secondary. If you wanted complete redundancy, you could apply for provider-independent address space (it has to be at least a /24, the old class C size, to be advertised in the global BGP tables) and your own ASN. This will let you use two different internet service providers.

In my case, I’m creating a redundant connection using the same ISP, so I’ve asked for a private ASN to be allocated (see http://en.wikipedia.org/wiki/Autonomous_System_(Internet) ).

For a small network such as ours (especially using the small base level SRXs) you’ll want to ask your provider to advertise only the default route to you. In turn, you’ll advertise your network’s address space on both connections. In the event that a link dies, the BGP peer on the other end no longer receives updates from you and will no longer attempt to route traffic over it.

The following configuration extract shows how we’d configure our SRX firewalls to peer with our ISP’s routers. One thing to note is that our primary link (the one on the left) has a lower metric-out than the shadow link, meaning a lower MED attribute is sent to our ISP and thus inbound traffic will, by preference, use the main connection. The preference values under each neighbor determine which connection we prefer for outbound traffic: Junos prefers the route with the lower preference value, so the default route learned over the primary link (170) wins over the one learned over the shadow link (180).

routing-options {
    autonomous-system 64512;
}

protocols {
    bgp {
        group ISP {
            metric-out 50;
            local-address 1.1.1.2;
            import ISP-in;
            export ISP-out;
            neighbor 1.1.1.1 {
                preference 170;
                peer-as 123;
            }
        }
        group ISP-shadow {
            metric-out 100;
            local-address 1.1.1.6;
            import ISP-in;
            export ISP-out;
            neighbor 1.1.1.5 {
                preference 180;
                peer-as 123;
            }
        }
    }
}
policy-options {
    policy-statement ISP-in {
        term default-in {
            from {
                route-filter 0.0.0.0/0 exact;
            }
            then accept;
        }
        term block {
            then reject;
        }
    }
    policy-statement ISP-out {
        term tnziPublic {
            from {
                protocol direct;
                route-filter 2.2.2.0/26 exact;
            }
            then accept;
        }
    }
}
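The BGP groups above reference the local addresses 1.1.1.2 and 1.1.1.6 without showing where they are configured. A minimal sketch follows, assuming each WAN link terminates on an ordinary (non-reth) interface, one per node, and that the ISP hands out /30 point-to-point subnets. The interface names fe-0/0/0 and fe-1/0/0, the /30 mask and the choice of node for each link are all assumptions, not values from the original configuration:

```
interfaces {
    /* primary link on node 0; interface name and mask are assumptions */
    fe-0/0/0 {
        unit 0 {
            family inet {
                address 1.1.1.2/30;
            }
        }
    }
    /* shadow link on node 1; interface name and mask are assumptions */
    fe-1/0/0 {
        unit 0 {
            family inet {
                address 1.1.1.6/30;
            }
        }
    }
}
```

Keeping the ISP-facing links off the reth interface means each link can fail independently, and BGP, rather than the cluster, decides which path carries the traffic.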

27 responses

  1. George comments:

    Hi Scott, thanks for the great how-to..

    I had one question in case you could help. I have 2x EX2200-48 switches connected via SFP LC and I want to use them in a VMware vCloud lab I’m setting up. The vCloud architecture will have 10 VLANs. The ESXi hosts and the storage will be connecting to the VLANs via 4 vSwitches.

    Do you think it’s a good idea to use LACP in this setup, to increase the bandwidth and achieve better redundancy? Or maybe it’s not possible with the base license of my EX2200?

    This is the list of VLAN’s that are required:
    VLAN 01 – The load balancer external network, solely for the external virtual IPs of the load balancers (load balancers are setup in a HA pair)
    VLAN 02 – The DMZ, is where the vCIM and vCloud Director cells reside, they are the publicly accessible API endpoint.
    VLAN 03 – The external connection of the entire environment to the Internet.
    VLAN 04 – The primary private management network, virtual machines on this subnet use the central firewall as a default gateway
    VLAN 05 – The main management VLAN for the ESXi hosts in the vCIM environment. All the ESXi hosts, both management and resource, are presumed to have a primary vmkernel interface on this interface.
    VLANs 06 & 07 – Used for Fault Tolerance and VMware vMotion™ respectively. Unrouted, These VLANs are only for the ESXi hosts to pass vMotion and Fault Tolerance traffic.
    VLAN 08 – The storage network. VLAN 08 is private, RFC1918 address space, and is unrouted and has no default gateway, as only the ESXi hosts should need to communicate with the storage equipment.
    VLAN 09 – The tenant external network
    VLAN 10 – Used for vCloud Director network isolation-backed (VCD-NI) encapsulated traffic.

  2. scottyob comments:

    Hi George,

    LACP across the EX2200s isn’t possible. For instance, if you wanted to put 2x GE into Switch1 and 2x GE into Switch2 for LACP across the 4, you’d need the EX4200s. I believe what you can do, however, is LACP in two pairs, the first as primary and the second as backup. This should give you twice as much bandwidth on the ESX host as a single gigabit Ethernet link.

    Hope it helps :)

  3. George comments:

    Thanks for writing back Scott :-) I will give it a go!

  4. george comments:

    Hi Scott! now with JunOS 12.2R1 the EX2200 can be setup in a Virtual Chassis via the SFP fiber uplink ports! do you think it’s worth it to do LACP across the EX2200-VC?

  6. Marcus comments:

    Hi Scott, I have a similar config.

    2x 3200s configured with VRRP (LACP trunk between both switches). Each host has an IP hash 4 port trunk back to the Junipers.

    Each of my hosts currently connects into only one switch (4 port AE) but I would like to have each host connect to both switches (for redundancy). When I move the host onto both switches (2 port AE on each switch) I suddenly see duplicate packets and packet loss when pinging any of my VMs on either of the hosts, from either of the switches…

    I’m using IP hash, Link status only, Notify Switches and No Failback.

    Any ideas would be welcome :)

  7. scottyob comments:

    Hey Marcus!

    I’ve got an inkling you’ve still got a 4 port AE set up on your hosts going back to 2 separate AEs on the switches? If this is the case, the hashing algorithm used might cause the MAC addresses of the hosts/VMs to flap from port channel to port channel on both switches. I’ve never done it before, but to test this theory you might want to check this: http://www.juniper.net/techpubs/en_US/junos10.4/topics/reference/command-summary/show-ethernet-switching-mac-learning-log-bridging-ex-series.html

    I’m thinking the way to fix this would be to set up two AE groups on your hosts, then make one active and one standby (my setup, which keeps all active traffic on the one switch except for broadcasts, meaning the inter-switch link should not be terribly busy). Another neat solution would be to set the EX3200s up in a “virtual chassis”, meaning they act as one logical switch. This way you would have a single 4x AE that spans both switches.

    Good luck and let me know how you go!

    - Scotty O

  13. mike comments:

    How did you define the p2p’s to the ISPs with a single reth? Where were the local addresses under the BGP defined? Secondary on the reth, or loopbacks?

  14. scottyob comments:

    Good point. I have missed that by the look of it. In this case, with BGP between us and our transit, we just used normal units under both SRXs (no reth between us and the ISP). Reth is purely between the servers and firewall, and no loopbacks are required here (unless you wanted to specify the router-id for BGP, but that’s not 100% necessary). The MED and Local-Pref influence traffic routed in and out to be on the ‘active’ ethernet link between us and our ISP.
