SoC Modernization: Where are you on the Evolutionary Journey? And how do you compare to your peers?

Many organizations today will tell you they have a next-generation Security Operations Centre (SoC). In fact, you can find a myriad of thought leadership pieces exploring how businesses are evolving their security operations, with many looking towards AI as the answer.

One key question, however, is whether we should continue to optimize (build out the stack further), or whether we actually need to evolve and change some of the fundamental principles of the next-generation SoC. A paper from Deloitte/Google offers some interesting insights on this question.

Others talk about SoC 2.0, yet if you look at ISACA's model, we should be at the NOC/SOC 6.0 stage or higher, with key focal areas including big data lakes and the introduction of Large Language Models (LLMs), and with many organizations now choosing to run hybrid-managed SoCs.

To really move your SoC forward, you should consider these key aspects:

    1. Look back and make sure your goals are still right for your business.

    2. Validate where you are today against both current and future goals.

    3. Define a measurable, time-bound path to where you need to be, with clear milestones along the way.

Earlier this year, Cybereason commissioned some unique research into security operations to truly understand where businesses are today. We surveyed over 1,200 companies (each with 500+ employees), spanning SoC practitioners through to management, across the USA, UK, France, Germany, UAE and Saudi Arabia. The results offer an insightful benchmark against your peers, though please do note that there were variances across both countries and industries.

The outcome every business surely aims for is effective operational cyber resilience: all incidents are either prevented, or identified and resolved in a timescale and manner that ensures minimal impact to core business processes. So how far along that journey are we today?

Outcomes

Let's start with the positives: the average MTTR (Mean Time To Resolution) is 2-4 hours, and in the majority of instances, organizations hit this Service Level Objective (SLO). Most respondents have 24x7x365 coverage, either through their own capabilities or by leveraging 3rd party services. To counterbalance this, however, many organizations are still far away from processing all of their alerts on the same day. In fact, we found the majority of our respondents process somewhere between 50% and 80% of their alerts daily. Likewise, the quality of what's being triaged varies greatly: 10% of respondents see a whopping 90% of alerts identified as false positives, while the average spans 20-40%.

The demands on every SoC will always continue to grow, so if we are looking to modernize, it's imperative that these latter two metrics (did we process everything to the SLOs, and is the quality of what we process good enough) become areas of focus.

In my experience, we focus too much on outcome-based metrics such as MTTD and MTTR, and not enough on the journey or process metrics: for instance, what an acceptable rate of false positives is and how it can be reduced. If we can get the right controls around the journey of incident identification, triage, and response, then the outcome metrics should follow more organically. So let's look at the journey, which starts with the data, then the processes applied against it, and finally the tools that should be used to achieve these.
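To make these process metrics concrete, here is a minimal sketch in Python of how a false-positive rate and a same-day processing rate could be computed alongside MTTR. The alert records and field names are purely hypothetical, for illustration only.

```python
from datetime import datetime, timedelta

# Hypothetical alert records; the field names are illustrative, not from any real SIEM.
alerts = [
    {"raised": datetime(2024, 5, 1, 9, 0), "resolved": datetime(2024, 5, 1, 11, 30), "false_positive": False},
    {"raised": datetime(2024, 5, 1, 10, 0), "resolved": datetime(2024, 5, 2, 10, 0), "false_positive": True},
    {"raised": datetime(2024, 5, 1, 12, 0), "resolved": None, "false_positive": False},  # still open
]

resolved = [a for a in alerts if a["resolved"] is not None]

# Outcome metric: Mean Time To Resolution, averaged over resolved alerts.
mttr_hours = sum((a["resolved"] - a["raised"]).total_seconds() for a in resolved) / len(resolved) / 3600

# Process metrics: how much of the total queue is cleared same-day, and triage quality.
same_day_rate = sum(a["resolved"] - a["raised"] <= timedelta(days=1) for a in resolved) / len(alerts)
false_positive_rate = sum(a["false_positive"] for a in resolved) / len(resolved)

print(f"MTTR: {mttr_hours:.1f} hours")
print(f"Same-day processing rate: {same_day_rate:.0%}")
print(f"False-positive rate: {false_positive_rate:.0%}")
```

Tracked over time, the process metrics show whether the pipeline itself is improving, rather than just whether individual incidents closed quickly.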

Not all incidents are the same: some are relatively simple and others very complex. As such, when looking to improve, it's worth considering where in the business you are looking to improve. For example, one of the key challenges in our industry is the skills shortage. By using automation to accelerate the simpler incidents, you can focus human effort on the more complex and time-consuming threats (see the sketch below). For some businesses, the answer is in fact simple: outsource the aspects you can't scale, whether that's the volume or the complexity of the incidents.
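As a simple sketch of that split, the routing logic below (Python, with entirely hypothetical alert types and confidence fields) sends well-understood, high-confidence alerts to an automated playbook and escalates everything else to a human analyst:

```python
# Hypothetical routing rule: simple, well-understood alerts go to automation,
# complex or low-confidence ones are escalated to human analysts.
AUTOMATABLE_TYPES = {"commodity_malware", "known_phishing", "policy_violation"}

def route_alert(alert: dict) -> str:
    """Decide whether an alert is handled by a playbook or a human."""
    if alert["type"] in AUTOMATABLE_TYPES and alert.get("confidence", 0.0) >= 0.9:
        return "automation_playbook"
    return "human_analyst_queue"

print(route_alert({"type": "known_phishing", "confidence": 0.95}))   # automation_playbook
print(route_alert({"type": "lateral_movement", "confidence": 0.60}))  # human_analyst_queue
```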

Data

On average, security events are kept for between 1-6 months, and surprisingly only 1% of respondents were keeping them for more than a year. Compare this to our most recent ransomware business impact research, where we saw that over half of all incidents were not discovered for 3-12 months: there is clearly a data gap. If you haven't retained the data long enough, how can you go back and understand what has occurred? With the growing regulatory pressures, you would expect to see more companies keep their security data longer. Yet 96% were looking to cut what gets ingested into their SIEM (traditionally the data aggregation point) in order to reduce cyber security costs.

At the same time, the data gathered is causing strong concerns, as over 75% of organizations believe their lack of data impacts their ability to do their job. As well as the retention issue, they also flagged problems of data fragmentation. The majority have data split between 2 or more data lakes, and 67% only keep the alerts, not the raw data, which creates real challenges both when validating potential false positives and when attempting richer threat hunting. The alert is simply the first evidence breadcrumb; the raw data contains the richer evidence.

Processes

Building out the evidence trail is time-consuming for SoC analysts, with our research showing it can take up to 75% of their time. Typically they will be looking at threat intelligence sources for clues as to what other evidence breadcrumbs should be in their captured data. Verifying this has traditionally been done via the SIEM. We found that it typically takes 2 to 3 SIEM queries to triage an alert, and on average each query takes 2-10 minutes to deliver results. In some instances, this can be much slower if businesses are not using cloud compute for scale and speed, or if the data sets are much larger. Our research also found that writing a query from scratch can take anywhere from 1 hour to a whopping 24 hours, depending on its complexity and the skills of the analyst. So, with every alert being triaged, the SoC analyst will be keeping their fingers crossed that there is a query already written that they can just adapt.
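As a sketch of that adapt-rather-than-write workflow, the fragment below keeps a small library of pre-written triage query templates and fills in per-alert values. The template store, query syntax and field names are all hypothetical, shown here only to illustrate the idea:

```python
# A hypothetical library of pre-written triage query templates, keyed by
# alert type. The SPL-like query syntax is illustrative only.
QUERY_TEMPLATES = {
    "suspicious_login": (
        "index=auth user={user} src_ip={src_ip} "
        "earliest=-24h | stats count by action"
    ),
    "malware_detection": (
        "index=endpoint host={host} file_hash={file_hash} "
        "earliest=-7d | stats count by process_name"
    ),
}

def build_triage_query(alert_type: str, **fields: str) -> str:
    """Adapt a stored template to this alert, or flag that one must be written."""
    template = QUERY_TEMPLATES.get(alert_type)
    if template is None:
        raise KeyError(f"No template for {alert_type!r}: a new query must be written from scratch")
    return template.format(**fields)

print(build_triage_query("suspicious_login", user="jdoe", src_ip="203.0.113.7"))
```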

Learning how to write complex queries is a real skill that takes years of experience, and today on average 16-30% of headcount in SoCs is unfilled, leaving organizations looking for smarter ways of joining threat artifacts together to see the whole malware operation. AI allows for much smarter data analytics, but it requires not only that you have the data, but also that the data has context and is structured in a way that is easily machine readable, so AI can understand how to process it, for example by tagging it against the MITRE ATT&CK framework.
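As an illustration of that structuring, here is a minimal sketch of an alert enriched with MITRE ATT&CK context as machine-readable JSON. The field names and alert content are hypothetical; TA0001 (Initial Access) and T1566 (Phishing) are real ATT&CK identifiers:

```python
import json

# A raw alert enriched with MITRE ATT&CK context so downstream AI/ML
# pipelines can reason over it. Field names are illustrative only.
alert = {
    "id": "alert-0042",
    "source": "email-gateway",
    "description": "User clicked a link in a suspected phishing email",
    "mitre_attack": {
        "tactic": {"id": "TA0001", "name": "Initial Access"},
        "technique": {"id": "T1566", "name": "Phishing"},
    },
    "observables": [{"type": "url", "value": "hxxp://example[.]test/login"}],
}

print(json.dumps(alert, indent=2))
```

Once every alert carries consistent tags like these, queries and models can correlate across sources by tactic and technique rather than by each tool's native fields.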

The research also flagged that today, too much of the data is fragmented across different data lakes and is typically in multiple native data formats, depending on which source it came from, which capabilities generated the alerts, and which capabilities are actually used to process them!

Capabilities

Today organizations are using a myriad of disparate security products, a list that will only get longer as threats continue to get more complex. The SoC's task of joining the breadcrumbs from all of these point solutions into a malicious-operation hypothesis is increasingly complex. It requires more sources of evidence and intel, more SIEM queries to aggregate and understand what's occurring, and more tools such as SOAR to attempt to automate the growing number of steps required to complete the detection and recovery process. Yet as the complexity grows, the time allowed to detect and respond is typically becoming ever shorter as businesses become more digitally dependent.

The more tools we add, the more data there is, so it's no wonder so many organizations are continually looking to optimize and modernize. Typically, organizations are using 5-6 different key tools such as SOAR, TIM, EDR and SIEM to work through the process of translating the data breadcrumbs into something more. All too often these are from different vendors using different data constructs, making the processes more convoluted as further translations are required at each step. Take threat intelligence as an example: on average, each organization is using 3 different threat intel feeds, each requiring its own processes to enable the analyst to build out potential insights into what other evidence breadcrumbs they need to gather. It's no surprise that this leads to console hopping, with analysts moving back and forth between 3 or more different key consoles to triage any incident.
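To illustrate why each feed needs its own processing, here is a minimal sketch of normalizing two differently-shaped threat intel records into one common indicator structure. Both feed formats and all field names are hypothetical:

```python
from typing import Any

# Two hypothetical feeds describing the same indicator in different native formats.
feed_a_record = {"ioc": "198.51.100.9", "kind": "ipv4", "confidence": 85}
feed_b_record = {"indicator": {"value": "198.51.100.9", "type": "ip"}, "score": 0.9}

def normalize_feed_a(rec: dict[str, Any]) -> dict[str, Any]:
    """Translate feed A's flat layout into the common indicator shape."""
    return {"value": rec["ioc"], "type": rec["kind"], "confidence": rec["confidence"] / 100}

def normalize_feed_b(rec: dict[str, Any]) -> dict[str, Any]:
    """Translate feed B's nested layout into the common indicator shape."""
    return {"value": rec["indicator"]["value"], "type": "ipv4", "confidence": rec["score"]}

# One translator per feed; everything downstream then works against a
# single consistent structure instead of per-vendor constructs.
indicators = [normalize_feed_a(feed_a_record), normalize_feed_b(feed_b_record)]
print(indicators)
```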

Summary

It would be easy to suggest we still have a way to go, and in reality, we will always be modernizing SoC capabilities and processes as both IT and the threatscape continue to evolve. 

What is increasingly evident, however, is that the SoC has to become the cybersecurity data science center of excellence. Just as we are seeing across the broader IT industry, the key is the ability to manipulate extremely large data sets at pace, using machine learning and generative AI capabilities to reach outcomes quickly.

To achieve this, businesses need to look not only at what else to add to their SoC to become more capable and efficient, but also at what should be taken away, and at whether the foundations their SoC is built upon, including its key technologies and processes, are still fit for purpose both today and in the future.

About the Author

Greg Day

Greg Day is a Vice President and Global Field CISO for Cybereason in EMEA. Prior to joining Cybereason, Greg held CSO and CTO positions with Palo Alto Networks, FireEye and Symantec. A respected thought leader and long-time advocate for stronger, more proactive cybersecurity, Greg has helped many law enforcement agencies improve detection of cybercriminal behavior. In addition, he previously taught malware forensics to agencies around the world and has worked in advisory capacities for the Council of Europe on cybercrime and the UK National Crime Agency. He currently serves on the Europol cyber security industry advisory board.