‘Control Tower’ Your Network
A while back I promised to start sharing some thoughts about tech ops. As the development side has many schools of thought (Agile, Scrum, CMM, etc.), I thought that tech ops deserved its own philosophies.
So here’s my first:
“Thou shalt not touch the application servers, especially developers” – Everything should be centralized to controller services, i.e. you should control tower your network.
You can imagine the chaos if thousands of airlines and airport operators behaved like hosts on the internet, constantly broadcasting “here I am”, their location and ‘trying to land now’ to all their peers. Large, complex systems are often torn between centralized and distributed control structures. Sometimes, as in the world of airports, a centralized structure is the obvious choice.
Similarly in the world of tech ops for a high-availability system, I’d recommend a centralized control paradigm – there should be no need to ever touch an application server. All of the functions for which you would consider logging onto a server should be centralized elsewhere into your own ‘control tower’. For instance, all of the below should be centralized:
1. Use a centralized authentication server and never set up users on an individual server – use LDAP
2. Have a centralized logging server – use syslog-ng plus a viewer (Splunk, Chainsaw, etc.)
3. Have only a single point of config mgmt – use Puppet, Chef, your own scripts that check SVN, or something else, but have one place to change configuration on all the servers. Don’t log on to a server to change your connection pool size.
4. Deploy new builds either from your config mgmt system or using a centralized deployment system; don’t log into each node to deploy.
5. Monitoring should send metrics back to a single central monitoring server with alerts, graphs and all the data goodness you need.
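To make point 2 concrete, here is a minimal sketch of an application shipping its log lines to a central syslog server instead of a local file. It uses Python's standard logging module. The function name build_central_logger and the host name in the usage comment are hypothetical, and I'm assuming the central server (syslog-ng or similar) accepts plain UDP syslog on the given port.

```python
import logging
import logging.handlers


def build_central_logger(name, host, port=514):
    """Return a logger that forwards records to a central syslog server over UDP.

    `host` and `port` are assumptions about your own setup; 514 is the
    conventional syslog UDP port.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Only attach the handler once, so repeated calls do not duplicate log lines.
    if not logger.handlers:
        handler = logging.handlers.SysLogHandler(address=(host, port))
        handler.setFormatter(
            logging.Formatter("%(name)s: %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


# Usage (hypothetical host name):
#   log = build_central_logger("app01", "logs.example.internal")
#   log.info("connection pool size changed to %d", 50)
```

With something like this in place nobody needs a shell on the app server to read its logs; they read them on the logging box.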
When you first start designing your network, you must include this in your layout. Invest the time and effort to do this up front; it will save you a lot of time down the road (trust me, I learnt this the hard way).
For security purposes, run your LDAP servers and your centralized logging server on separate machines. If someone hacks your LDAP server or an application server, they can try to alter logs to cover their tracks; at least make them go one step further and find your centralized logging system.
In terms of centralized configuration management your choices are basically around levels of automation.
On the more manual side, an interesting solution I have seen is to point apt-get to your own bridged distro server. Run apt-get upgrade or apt-get install to pull new packages onto the app servers (from a centralized script, of course!).
The more common automated solutions I have seen are Puppet or SVN. Both provide version management: if changes have been made, files are updated accordingly.
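The "update only if something changed" behaviour is the heart of these tools, and it is easy to sketch. The snippet below is not how Puppet or SVN work internally; it is a minimal illustration of the idea, with a hypothetical function name sync_file, assuming the desired file content has already been fetched from your central source.

```python
import hashlib
from pathlib import Path


def sync_file(desired: str, target: Path) -> bool:
    """Make `target` contain `desired`; return True only if the file was changed."""
    new_bytes = desired.encode("utf-8")
    if target.exists():
        current_hash = hashlib.sha256(target.read_bytes()).hexdigest()
        desired_hash = hashlib.sha256(new_bytes).hexdigest()
        if current_hash == desired_hash:
            return False  # Already up to date, so touch nothing.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first, then swap it in, so readers never
    # see a half-written config file.
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_bytes(new_bytes)
    tmp.replace(target)
    return True
```

Running it twice with the same content changes the file once and then does nothing, which is what lets you run the same sync on every server on a schedule.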
A control tower paradigm will provide several benefits:
-Far more security and control, as you’ll be able to lock most users out of your production servers.
-Fewer maintenance-related outages and higher uptime. Sometimes you are your own worst enemy when it comes to uptime.
-Maintenance windows will involve less manual work, leaving less room for errors & more time for planned work.
-Traceability: finding problems & troubleshooting will be significantly faster, as you’ll be able to scan through a single point of configuration and a single log pool. Rummaging through 40 servers to find one incorrect setting is a nightmare.
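As a small illustration of that last point: once every server's settings live in one place, finding the odd one out is a few lines of code rather than 40 SSH sessions. This is only a sketch; the function name find_outliers and the shape of the data (a dict of host name to settings dict) are assumptions of mine, and it treats the most common value across the fleet as the expected one.

```python
from collections import Counter


def find_outliers(settings_by_host, key):
    """Return (expected_value, {host: value}) for hosts that differ on `key`.

    The "expected" value is simply the most common one across all hosts.
    """
    counts = Counter(cfg.get(key) for cfg in settings_by_host.values())
    expected, _ = counts.most_common(1)[0]
    outliers = {
        host: cfg.get(key)
        for host, cfg in settings_by_host.items()
        if cfg.get(key) != expected
    }
    return expected, outliers


# Usage with made-up data:
#   fleet = {"app01": {"pool.size": "50"}, "app02": {"pool.size": "5"}, ...}
#   expected, odd_ones = find_outliers(fleet, "pool.size")
```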
The last great point of implementing the control tower is that developers can buzz the control tower Maverick-style to piss off the ops folks (i.e. break all the rules & deploy stuff without permission!).

“What were you developers doing out there on the network?”
“We were communicating with the ops team… you know… sign language… and we did our deployment while being upside down at mach 5… yeeeeeehaw”









