Facebook (Meta) Production Network Engineer
July 2015 - June 2024 (9 years), Menlo Park, CA, USA
Easily the pinnacle of my career so far. Spent 9 years building and exercising my skills in Network Engineering, Software Development and Data Analytics, working across multiple teams to help build and scale Meta’s Network Infrastructure.
- Built alarms and remediation to scale the automated management across Facebook’s Datacenter, Backbone and Edge Networks.
- Built the Drain and audit frameworks to ensure the safe removal and insertion of network devices in all device roles across the production network.
- Built data pipelines to show issues with the convergence timing of our MPLS/RSVP network after fiber cut events.
- Ran incident response for a number of issues across both network and software tools.
- Deployed and managed multiple Terragraph instances to help operationalize and harden the product.
- Partnered with edge and backbone teams to build tooling and workflows that made their turnup and migration processes more consistent and reliable.
- Worked with partner Software Development Teams developing the workflow orchestration systems to help productionize their service: added monitoring, found bottlenecks, and integrated with wider Facebook tooling to increase the service’s responsiveness and reliability.
- Built tooling to integrate different teams’ databases to detect physical cable faults and create the appropriate follow-up actions.
- Designed and implemented RFC 4884 and RFC 5837 support on the FBOSS network OS to retain IPv4 traceroute functionality across newer IPv6-only deployments.
- Built tooling to parse verbose FBOSS switch ASIC state logs, extracting millisecond-granular data on resource usage, convergence timing, and routing micro-loops. This reduced time to triage hard-to-root-cause incidents, aided qualification efforts, and helped identify bottlenecks to inform future network design roadmaps.
- Helped the team take control of its on-call burden, building an action plan with partner teams to bring alarming down to acceptable levels.
- Ran multiple datacenter-related incident investigations and mitigations.
Independent Contractor
October 2014 - June 2015 (9 months), Sydney, NSW, Australia
Contracted primarily for two companies: Cinenet Systems (since acquired by Superloop) and Rise.ph, a Philippine ISP that was starting up at the time.
- Deployed new services across a passive DWDM network spanning Sydney and Melbourne.
- Troubleshot MPLS (VLL and MPLS) issues with existing customers.
- Built and provisioned backend systems (Chef, RADIUS and BIND) as the infrastructure to support initial deployments.
- Designed BGP communities and policies to influence traffic through the network and its peers.
- Managed initial ASN and IP allocations through APNIC.
University of Wollongong Network Engineer
June 2012 - September 2014 (2 years 4 months), Wollongong, NSW, Australia
Worked primarily as a Network Engineer and Software Developer, keeping the university campus and datacenter networks operating smoothly and improving processes in the organization through the development of software. Work here involved:
- Managing, designing and implementing upgrades to the multi-campus MPLS VPN core.
- Deploying open source tools such as NetDisco and Rancid with custom scripting to improve change management processes.
- Implementing a new quota and proxy “Free Internet” deployment, with BGP community-based shaping rules on network appliances to meet the business’s financial requirements, along with traffic attribution tools that let the business understand usage and costs down to a per-subscriber breakdown.
- Designing the network migration and lab changes needed to move to newer Palo Alto-based firewalls that allow inter-VRF routing.
University of Wollongong Academic Tutor
Assisted students with lab tasks and provided in-class assistance and demonstrations to help teach course material and develop students’ programming skills. Marked exams and provided feedback. Subjects tutored were:
- Procedural Programming (March 2009 - June 2010)
- Interacting Systems (March-June 2010)
- Systems Administration (March-June 2013)
UTBox Systems Engineer
January 2011 - June 2012 (1 year 6 months), Sydney, NSW, Australia
Upgraded the infrastructure so that everything from the network stack to the database and web servers was highly available and managed through configuration management, ensuring that a failure would not result in a loss of revenue. Day-to-day operations also included supporting clients and fixing bugs or developing new features in the product codebase.