It’s hard to ignore that machine learning has taken over much of the discourse in today’s technology circles. It wasn’t that long ago that you needed IBM’s Deep Blue to beat a human at chess. Now there are few games that a computer has not bested humans at. Whether it’s Google’s AlphaGo and Go, Elon Musk’s OpenAI and DOTA, or Libratus built by a pair of researchers to beat the best at no-limit Texas hold’em, machine learning is invading every technology space at a frenetic pace.
Dropping prices and smaller form factors are adding to the machine learning excitement. With most cloud providers offering some type of machine learning platform to developers, it shouldn’t surprise anyone that machine learning is now where the cloud was a few years ago. Every industry seems to want a piece of it, the security industry included.
There are already many machine learning projects being built to enhance security, and as modern malware becomes increasingly sophisticated, it seems the advancements in this space can’t come quickly enough. With close to one million new samples being added to VirusTotal (one of the largest open datastores for malicious software) every 24 hours, gone are the days when you could blacklist your way to security. There are neither enough hours in the day nor enough researchers available to reverse engineer every piece of malware, and at these rates blacklisting can leave you a day late and a dollar short.
New approaches using machine learning are needed simply to keep up with the volume. We are still in the early days of applying those approaches to security, but what has already become clear is that a one-size-fits-all approach remains elusive. For machine learning to truly advance endpoint security to the next level, it needs crucial improvements in its ability to adapt not only to threats, but also to the organizations it’s protecting.
To effectively explain why, I am going to fall back on an age-old security analogy that occasionally gets eye rolls, but bear with me: Security is like an onion. The most effective approach is a layered one. We most often hear that analogy applied to the need for numerous defenses to address and serve as fail-safes for numerous threats (defense in depth). But it is important to realize that an organization is also itself an onion, consisting of multiple layers and complex relationships both personal and technological. It is because of these complex relationships that organizations require deep learning systems capable of understanding the unique set of behaviors that should and can exist in any one organization.
On the surface, it seems easy to articulate that while there are many similarities between organizations, the systems that comprise them can be unique. Identifying which behaviors are acceptable for one organization but represent malicious activity for another, however, is not always obvious. As you peel back the layers, each level of network programs, processes, and activity presents a different set of behaviors that needs to be understood, and context is always key. The activity and applications running on one server are different from those on any other (unless they are exact clones). Different servers on different network segments are inherently distinctive. The applications run by someone in business operations are different from those run by someone in application development. The platforms themselves are different, whether they are Windows, Linux, macOS, iOS, or Android, not to mention SaaS, PaaS, and so on.
All of this complexity leads to the inevitable discovery that what represents normal behavior in one environment can be abnormal behavior in another. Machine learning cannot work effectively without understanding and taking these differences into account. More work is needed, then, to advance the machine learning used in endpoint protection beyond simply classifying individual binaries as good or bad. An understanding of both operational context and observed behaviors is going to be required to stay ahead of adversaries.
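To make the point about context concrete, here is a minimal sketch in Python. It is not machine learning and not any product’s method. It is a plain frequency count, and every name in it (`ContextBaseline`, `min_share`, the context labels, the process names) is a hypothetical chosen for illustration. It shows only that the same observation can be routine in one environment and unusual in another once a baseline is kept per context.

```python
from collections import Counter, defaultdict


class ContextBaseline:
    """Counts observed processes per context (hypothetical, illustration only)."""

    def __init__(self, min_share=0.01):
        # Assumed threshold: a process seen in less than this share of a
        # context's observations is treated as unusual for that context.
        self.min_share = min_share
        self.counts = defaultdict(Counter)

    def observe(self, context, process):
        """Record one observation of a process in a given context."""
        self.counts[context][process] += 1

    def is_unusual(self, context, process):
        """Return True if the process is rare or unseen in this context."""
        seen = self.counts[context]
        total = sum(seen.values())
        if total == 0:
            return True  # no baseline for this context yet
        return seen[process] / total < self.min_share


baseline = ContextBaseline(min_share=0.05)
for _ in range(50):
    baseline.observe("app-development", "gcc")
    baseline.observe("app-development", "git")
    baseline.observe("business-operations", "excel")
    baseline.observe("business-operations", "outlook")

# The same process, judged against two different baselines.
dev_flag = baseline.is_unusual("app-development", "gcc")
ops_flag = baseline.is_unusual("business-operations", "gcc")
print(dev_flag, ops_flag)
```

A compiler is part of the baseline for the development context and absent from the business operations one, so only the second check flags it. A real system would need far richer features than process names, but the per-context structure is the part that matters here.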
I, for one, am excited by the level of innovation happening within machine learning. While there is plenty of evolution to come, the early success stories are encouraging.