If you've followed any of my work over the past couple of years, you know I've become a huge advocate of cloud computing and DevOps. Not because I've been caught up in any sort of hype machine, but because I've spent the past five years or so working with them hands-on and helping organizations as they transition to cloud.
The deeper I delved into cloud, the more I started to adopt DevOps practices. Pretty quickly on your journey to cloud you realize you need to rely on automation and continuous delivery simply to manage your environments. The clincher for me was the time I presented security automation techniques I had developed for lab work at the Black Hat conference, and afterwards attendees from big orgs that were making heavy use of cloud came up to tell me they had been working on the exact same issues.
For all of us it was actually more of an ops problem than a security one, but one with some pretty powerful emergent security benefits (in that case, to inject security policy automatically into servers).
Okay, enough talking about things in generic terms. Here is a specific example of how development, operations, and security can overlap and result in some hefty security improvements that wipe out multiple categories of issues.
One key concept in DevOps is that of *immutable* infrastructure. This can be as simple as virtual servers (instances), or as complex as entire infrastructure stacks. Here is how it works and why we use it.
A simple example is when you use autoscaling in a cloud provider. You have a standard image of a server, and when you need more capacity the cloud service starts new instances behind a load balancer. When you don't need that much capacity anymore (based on preset rules) the cloud service shuts down instances. This is exactly how *elasticity* in the cloud works.
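To make those "preset rules" a little more concrete, here is a minimal sketch of the kind of decision an autoscaling service makes on each evaluation. Everything in it is an assumption for illustration: the `ScalingRule` fields, the CPU thresholds, and the one-instance-at-a-time step are hypothetical and not any particular provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingRule:
    """Hypothetical preset rule; real providers expose richer policies."""
    min_instances: int
    max_instances: int
    scale_out_above: float  # average CPU % above which we add an instance
    scale_in_below: float   # average CPU % below which we remove an instance


def desired_capacity(current: int, avg_cpu: float, rule: ScalingRule) -> int:
    """Return how many instances should sit behind the load balancer."""
    if avg_cpu > rule.scale_out_above:
        target = current + 1   # more demand: start another copy of the image
    elif avg_cpu < rule.scale_in_below:
        target = current - 1   # less demand: shut one instance down
    else:
        target = current       # within the band: leave the group alone
    # Never go outside the configured floor and ceiling.
    return max(rule.min_instances, min(rule.max_instances, target))


# Illustrative values only.
rule = ScalingRule(min_instances=2, max_instances=6,
                   scale_out_above=70.0, scale_in_below=30.0)
```

The point to notice is that the service only ever adds or removes whole instances built from the standard image. Nothing in the rule cares which specific instance goes away.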
Developers and operations further leverage this for both server and code updates. A nifty feature is that you can build new versions of your servers, run all sorts of tests, and, when you are happy with the results, swap them into the autoscale groups. Your updated servers replace the running servers, and you have just pushed a rolling update live into production. If something breaks, you can always change the rule to drop back to the old version. For those of you who are container-curious, yes, you can do the same thing with Docker and other container technologies.
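A rolling update of this kind can be sketched as a toy simulation. This is not how any specific cloud service implements it; the `rolling_update` function, the `batch_size` parameter, and the image names `v1` and `v2` are all made up for illustration, and a group is modeled as nothing more than a list of image versions.

```python
from __future__ import annotations


def rolling_update(instances: list[str], new_image: str,
                   batch_size: int = 1) -> list[list[str]]:
    """Replace instances batch by batch and return every intermediate state.

    Each element of `instances` is the image version an instance was built
    from. Instances are never modified in place; a slot is simply refilled
    with a fresh instance built from `new_image`.
    """
    state = list(instances)
    history = [list(state)]          # state before the update starts
    for start in range(0, len(state), batch_size):
        end = min(start + batch_size, len(state))
        for i in range(start, end):
            state[i] = new_image     # terminate old instance, launch new one
        history.append(list(state))  # record the group after this batch
    return history


# Roll forward from v1 to v2, one instance at a time.
steps = rolling_update(["v1", "v1", "v1"], "v2")

# Dropping back to the old version is the same operation with the old image.
rollback = rolling_update(steps[-1], "v1")
```

Rollback being "just another rollout" is the design choice worth noticing: there is no special undo path, only a change to which image the group points at.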
Now for this to work, you have to follow some rules. One of the most important is that you never log into a running instance to make changes. Why? Because those changes won't ever apply to any other running instance. Plus, the one you "fixed" might get blown away at any moment depending on the current demands on your application stack. All changes happen at the start of the pipeline, run through automated testing, and only then are signed off to go into production.
This leads to some very interesting security implications. Patching a running server (and hoping the patch sticks) goes away. You just roll out new ones. Remote logins? A thing of the past. Most organizations I work with in these kinds of deployments completely disable SSH or any other remote access, since that's a surefire way to let an admin drop in a fix that never gets applied to the rest of the environment. (Some organizations allow highly restricted remote access only for troubleshooting, under very limited conditions.)
And since no one is ever allowed to change a running server, host/file integrity monitoring becomes a rock-solid security control, and auditors might even sign off on using it instead of antivirus. On the off chance you *do* detect a change, you can simply blow away the instance and a new, pristine one will replace it.
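The core idea behind file integrity monitoring is simple enough to sketch in a few lines. The following is a minimal illustration, not a replacement for a real monitoring product: the function names `snapshot`, `detect_changes`, and `instance_is_tainted` are hypothetical, and it assumes you record a baseline of SHA-256 digests when the image is built and compare against it later.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under `root` (by relative path) to its SHA-256 digest."""
    digests: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def detect_changes(baseline: dict[str, str],
                   current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and list added, removed, and modified files."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "modified": sorted(p for p in shared if baseline[p] != current[p]),
    }


def instance_is_tainted(changes: dict[str, list[str]]) -> bool:
    """On an immutable server, any difference at all is a red flag."""
    return any(changes.values())
```

On a conventional server this check drowns in noise from legitimate patches and admin fixes. On an immutable one, a non-empty result from `detect_changes` is the signal to terminate the instance and let the autoscale group launch a clean replacement. In practice you would scope the snapshot to paths that should never change and exclude logs and temp directories.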
No live patching. No remote logins. No antivirus needed (maybe). Any change, at all, to a running server easily detectable and indicative of an attack.
These are called *immutable servers* because you never ever change a running instance. You simply replace them. When you think about that in the context of security, I've only scratched the surface of the advantages. When you learn the same principles work for entire application and infrastructure stacks? Well, the world gets quite interesting.