To sustain the fight against a decentralized global enemy in cyberspace, the modern Security Operations Center (SOC) must engage in a change management experiment to become more agile and reimagine the tools and processes at its disposal.
The status quo of constantly battling to stay ahead of adversaries, remain relevant to business unit peers, show a return on security investments, and ensure that hard-to-come-by staff is not overworked to the point of burnout is untenable.
Today’s SOC finds itself in the same position the Pentagon was in on the morning of September 11, 2001: understaffed, lacking specific skills, slow-moving, and hampered by industrial processes that are incapable of keeping pace with new adversaries waging a new type of warfare.
In 2001, our military not only lacked the organizational structures necessary to fight the War on Terrorism, but its defenders were also hampered by processes and tools designed for the wars of the past. It would take several years of intense change management to attain the proper force structure with the right tools and workflows.
Samuel Liles is the Vice President of Security at Ultimate Kronos Group (UKG) and, from 2015 to 2018, served as the Department of Homeland Security’s acting director of the Office of Intelligence and Analysis Cyber Division. In a recent interview, Liles said traditional SOC structures and workflows must be reconsidered.
“Pick up any textbook on how to build a SOC and look at how they structure the workflow. It's an industrial process with an input and an output,” Liles said. “I don't look at SOC efficiency and effectiveness just from the aspect of inputs and outputs in an industrial process. I look at it as, how is the SOC enhancing security? If the outputs are not aligned to increasing the company's security, then you're not being effective. There are some things like phishing that will continue to be a constant churn and burn challenge. But for things we can prevent, the result has to be that those things do not happen again.”
As a former intelligence officer in the United States Marine Corps, I understand the importance of being able to move, shoot, and communicate seamlessly in chaotic environments. All elite fighting organizations understand how to do these three things under tremendous stress and with less than ideal staffing, equipment, and resources.
The military relies on the OODA loop (Observe, Orient, Decide, Act), a four-step approach to decision-making that emphasizes collecting available information, putting it in context, and quickly making the most appropriate decision while remaining prepared to change course as more data becomes available.
The SOC must adopt its own version of this operating model to manage alerts, deal with security tool sprawl, and manage telemetry.
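To make the idea concrete, here is a minimal sketch of what an OODA-style loop over security events might look like. Everything in it is an assumption made for illustration: the asset table, the event fields, the two-signal thresholds, and the action names are hypothetical and do not describe any particular product or military doctrine.

```python
from dataclasses import dataclass, field

# Hypothetical asset context; a real SOC would pull this from an inventory system.
ASSET_CONTEXT = {"db-01": "critical", "kiosk-7": "low"}


@dataclass
class Assessment:
    host: str
    criticality: str
    signals: list = field(default_factory=list)
    decision: str = "monitor"


def orient(assessments, event):
    """Orient: put a raw event in context (which host, how important it is)."""
    host = event["host"]
    if host not in assessments:
        assessments[host] = Assessment(host, ASSET_CONTEXT.get(host, "unknown"))
    assessments[host].signals.append(event["signal"])
    return assessments[host]


def decide(assessment):
    """Decide: choose the best action given what is known so far."""
    if len(assessment.signals) >= 2:
        if assessment.criticality == "critical":
            return "isolate"
        return "investigate"
    return "monitor"


def ooda(events):
    """Run one pass of the loop per event, revising decisions as data arrives."""
    assessments = {}
    actions = []
    for event in events:                         # Observe
        assessment = orient(assessments, event)  # Orient
        new_decision = decide(assessment)        # Decide
        if new_decision != assessment.decision:  # Act only when the decision changes
            assessment.decision = new_decision
            actions.append((assessment.host, new_decision))
    return actions


events = [
    {"host": "db-01", "signal": "odd_login"},
    {"host": "kiosk-7", "signal": "port_scan"},
    {"host": "db-01", "signal": "new_admin_account"},
]
actions = ooda(events)
print(actions)
```

The point of the sketch is the shape of the loop, not the rules inside it: each new observation is placed in context before any decision is made, and an earlier decision is revisited when more data shows up.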
“The false positive rate drives analysts into the dust,” Liles said. “So what you end up with is tiers of analysts working to close tickets as fast as possible. We call them SOC analysts, but they're not actually doing any analysis. They're looking for patterns they've been trained to see, which means that if a new pattern comes in, they may or may not hit on it.”
It’s simple: Alert fatigue causes analysts to miss things. “I've talked to multiple teams across different industries, and they all do the same thing, which is they've tuned out the actual attack,” Liles said. “That means you're always chasing an adversary with your alerts. You're always following; you're never leading.”
Recent surveys show that the rapid expansion of the attack surface has led most organizations to increase the number of security tools they run. While the average midsize to large enterprise has dozens of security tools, most of those tools lack integration, require specialized knowledge to operate, and often go unused.
But there’s another major problem: Lack of context. Most organizations leverage a Security Information and Event Management (SIEM) platform to access the data coming in from their security tools. But what they get, said Liles, is “entropy in the system.”
“Let's say you're running GCP (Google Cloud Platform), and you have Security Command Center installed. Well, Security Command Center is a context-available SIEM that lets you know all the relevant details about your cloud environment. So if you take that and then just feed the alerts over to something like Splunk, another SIEM, you have SIEM loss,” he said. “I don’t know what else to call it. It's entropy in the system. So now you end up with analysts rotating through a series of dashboards, constantly looking for the next ticket, which often is another version of entropy.”
When analysts are forced to switch contexts (moving between different tool dashboards and screens to access data), the lost effort adds up. “You can amass an amount of lost effort in a large SOC equal to multiple full-time employees just in context switching,” Liles said. “So if you need more people in your SOC, do less context switching.”
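A quick back-of-the-envelope calculation shows how that can happen. The figures below (40 analysts, 30 switches per shift, 1.5 minutes lost per switch, 8-hour shifts) are hypothetical inputs chosen only to illustrate the arithmetic; they are not measurements from Liles or from any SOC.

```python
def context_switch_fte(analysts, switches_per_shift, minutes_lost_per_switch,
                       shift_hours=8.0):
    """Estimate effort lost to context switching, in full-time equivalents.

    All inputs are assumptions supplied by the caller, not measured values.
    """
    lost_hours = analysts * switches_per_shift * minutes_lost_per_switch / 60
    return lost_hours / shift_hours


# Hypothetical large SOC: 40 analysts, 30 dashboard switches each per shift,
# 1.5 minutes of refocusing time lost on every switch.
fte_lost = context_switch_fte(analysts=40, switches_per_shift=30,
                              minutes_lost_per_switch=1.5)
print(f"{fte_lost:.2f} full-time equivalents lost per shift")
```

Under those assumed numbers, 1,800 minutes (30 hours) disappear every shift, which is 3.75 analysts' worth of work. Plugging in your own team's figures gives a rough sense of what tool consolidation could win back.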
For far too long, too many people have failed to recognize that cybersecurity is a data problem, Liles said. “People have been trying to make it an inventory problem. It's not an inventory problem. It's a data problem. People have tried to make it a systems problem. It's not a systems problem, it's a data problem. We're trying to make it a cloud problem. It's not a cloud problem. It's a data problem, and understanding where your data flows and how it moves is how you secure it,” he said.
And if it's a data problem, that means collecting telemetry — all of the available telemetry. The problem is that not every solution provider has the capability to collect and process all the telemetry that’s coming from an organization’s infrastructure.
Those vendors commonly resort to "data filtering," eliminating telemetry before sending the collected data to the cloud for analysis. The discarded data could have contributed to a timely detection, so any analysis that excludes it yields only an incomplete snapshot of an organization’s security posture.
But for many organizations, getting a handle on their security telemetry is challenging. “A lot of organizations that I talk to are not prepared to have that conversation,” Liles said. “They're just not. People throw away data, like external firewall logs, all the time. They just throw it away.”
Moving from an alert-centric security model to an operation-centric model significantly improves a SOC team's operational effectiveness and efficiency. Small teams can do the work of larger teams, less experienced teams become effective more quickly, and your SOC’s ability to mitigate risk improves substantially.
Cybereason uses artificial intelligence and machine learning to build a comprehensive picture of the attack story using all available telemetry, with no filtering. When Cybereason detects malicious activity and presents that detection to an analyst, it’s a high-fidelity alert.
Analysts are brought into the triage, investigation, and response workflows only when a verified alert is active in an environment. This saves substantial investigation time by avoiding repeat issues and fielding only true positive detections that warrant human eyes on a screen.
Cybereason’s primary differentiator is the ability to consolidate alerts into a single malicious operation, which Cybereason calls a MalOp™. Whereas other vendors alert dozens of times for a single intrusion, the Cybereason MalOp Detection Engine stitches together the separate components of an attack, including all users, devices, identities, and network connections, into a comprehensive, contextualized attack story.
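The general idea of consolidating many alerts into one operation can be illustrated with a small sketch. To be clear, this is not how the Cybereason MalOp Detection Engine works internally; it is a generic, simplified approach (grouping alerts that share any user, host, or IP address, using a union-find structure), and the alert format, field names, and sample data are all hypothetical.

```python
from collections import defaultdict


def consolidate(alerts):
    """Group alerts that share at least one entity, directly or transitively.

    Generic illustration only: real correlation engines also weigh time,
    behavior, and causality, none of which is modeled here.
    """
    parent = list(range(len(alerts)))

    def find(i):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}  # entity -> index of the first alert that mentioned it
    for idx, alert in enumerate(alerts):
        for entity in alert["entities"]:
            if entity in first_seen:
                union(idx, first_seen[entity])
            else:
                first_seen[entity] = idx

    groups = defaultdict(list)
    for idx, alert in enumerate(alerts):
        groups[find(idx)].append(alert["id"])
    return sorted(groups.values())


# Hypothetical alerts; the IP address comes from a documentation-only range.
alerts = [
    {"id": "A1", "entities": {"user:alice", "host:ws-12"}},
    {"id": "A2", "entities": {"host:ws-12", "ip:203.0.113.9"}},
    {"id": "A3", "entities": {"ip:203.0.113.9", "host:srv-3"}},
    {"id": "A4", "entities": {"user:bob", "host:ws-40"}},
]
operations = consolidate(alerts)
print(operations)
```

In this toy data set, four alerts collapse into two groups: A1, A2, and A3 are linked through a shared workstation and a shared IP address, while A4 stands alone. An analyst would review two stories instead of four tickets.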
Because the Cybereason Defense Platform understands the full attack story, it can orchestrate and automate response to all impacted endpoints and users through tailored response playbooks without the need for an outside SOAR solution.
An operation-centric approach frees up analyst bandwidth. Teams can spend those extra cycles on projects that were previously out of reach, such as threat hunting, and finally get ahead of the curve.