Fifteen years ago, the cybersecurity industry was purely reactive. Today, it is imperative to stay one step ahead of threat actors by predicting their next move.
Big data and data science are common terms in cybersecurity. Big data refers to large, often complex collections of semistructured and unstructured data used in machine learning. Data has intrinsic value, but it is not useful until that value is discovered through analysis. And the reality is that not all big data or data science is the same.
History of the cybersecurity landscape
Many people don’t realise that big data and data science go hand in hand, especially in cybersecurity. The cybersecurity landscape 15 years ago was considerably different from what it is now. Back then, new malware strains were uncommon, and their volume was low and manageable. Even 10 years ago, when more sophisticated actors began to emerge, only a tiny percentage of them were advanced persistent threats (APTs): nation-states and governments looking to gather intelligence. Only a few governments, such as China, Russia, North Korea and later Iran, were harnessing this capability.
But today, our world has changed massively and the threat landscape has matured significantly. For example, Iran has advanced its cyber capabilities and its use of ransomware to blend disruptive operations with authentic eCrime activity. Russia and China have become even more dominant in weaponising vulnerabilities at scale to facilitate initial access. Other countries, from Turkey and Vietnam to India, are learning and following. Every country now understands that its cybersecurity posture needs to include a capability for intelligence collection.
What has the current threat landscape taught us?
The threat landscape is becoming more blurred by the day. Research shows that 62% of attacks are malware-free. In other words, attackers are using living-off-the-land or fileless techniques, relying on legitimate tools and credentials already present in the environment so that they look like an administrator or a normal user.
Ukraine, for example, has for years been bombarded by sophisticated cyberattacks from Russia, such as DriveSlayer, a destructive wiper malware targeting government organisations in Ukraine. This activity resembles that of the threat actor known as VOODOO BEAR, which is linked to the Main Directorate of the General Staff of the Armed Forces of the Russian Federation, better known as the GRU. Unfortunately, it is not just Russian state-sponsored actors targeting Ukraine, but also extremely sophisticated and capable adversaries from the eCrime underworld.
WIZARD SPIDER, the group associated with the Conti and Ryuk ransomware, has also declared its support for the Russian Federation and is actively warning that it will target organisations, governments and any other groups that act against Russia through sanctions or other measures. We continue to see efforts by Russia-nexus adversaries against network infrastructure in Western countries. This activity may indicate preparations for intelligence collection or, worse, for disruptive or destructive operations.
The importance of proactivity in cybersecurity
Fifteen years ago, the cybersecurity industry was purely reactive. Today, it is imperative to stay one step ahead of threat actors by predicting their next move. The most effective cybersecurity solutions predict adversary behaviour by combining two elements: data science and machine learning (ML), or AI more broadly. But it is important to note that AI is useless without the right data points.
One of the most significant challenges in cybersecurity is telling normal behaviour apart from adversary behaviour. In some technologies, false positives or even false negatives are acceptable. In cybersecurity, false positives lead to alert fatigue and, worse yet, false negatives lead to major breaches that can cost organisations a fortune.
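To make the two kinds of error concrete, here is a minimal Python sketch of how false positives and false negatives might be counted for a detection system. The function name `detection_metrics` and the sample labels are hypothetical, chosen only for illustration; they do not describe any particular product.

```python
def detection_metrics(actual, predicted):
    """Compare ground truth with detector verdicts (True = malicious)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = [(bool(a), bool(p)) for a, p in zip(actual, predicted)]
    tp = sum(1 for a, p in pairs if a and p)          # attacks that were flagged
    fp = sum(1 for a, p in pairs if not a and p)      # benign activity that was flagged
    fn = sum(1 for a, p in pairs if a and not p)      # attacks that were missed
    tn = sum(1 for a, p in pairs if not a and not p)  # benign activity left alone

    return {
        "false_positives": fp,
        "false_negatives": fn,
        # Share of benign events that raised an alert (drives alert fatigue).
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        # Share of real attacks that went undetected (drives breaches).
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Made-up sample: three real attacks among eight events.
actual = [True, True, False, False, False, False, True, False]
predicted = [True, False, True, False, False, False, True, False]

metrics = detection_metrics(actual, predicted)
print(metrics)
```

In this made-up sample, one benign event is flagged and one real attack is missed. Tracking the two rates separately matters because lowering one often raises the other.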
So, the threat is real. Where does this leave us?
Security data sits in many places: on endpoints and servers, in network traffic, in the cloud, in containers, and in SaaS or PaaS platforms. It also sits in Active Directory or cloud directory services. Security data, or telemetry, is everywhere. To identify attacks accurately, data from all of these sources is required.
This is where extended detection and response (XDR) comes into play. XDR solutions harness data from across the environment, add context to incidents, correlate data sets that would otherwise remain isolated and surface incidents that might have been missed when each source was viewed in isolation.
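The idea of correlating otherwise isolated data sets can be illustrated with a small Python sketch. It is a deliberately simplified toy, not how any real XDR product works: the `Event` record, the `correlate` function, the five-minute window and the sample events are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    source: str       # telemetry source, e.g. "endpoint", "network", "identity"
    host: str         # asset the event relates to
    timestamp: int    # seconds since an arbitrary start
    description: str


# Hypothetical events; each one looks unremarkable on its own.
EVENTS = [
    Event("endpoint", "host-a", 100, "script interpreter launched by office document"),
    Event("network", "host-a", 130, "connection to rare external domain"),
    Event("identity", "host-a", 160, "new admin logon from unusual location"),
    Event("endpoint", "host-b", 400, "scheduled backup job started"),
    Event("network", "host-c", 900, "connection to rare external domain"),
]


def correlate(events, window=300, min_sources=2):
    """Group events per host and keep time-window clusters seen by several sources."""
    by_host = defaultdict(list)
    for event in events:
        by_host[event.host].append(event)

    incidents = []
    for host, host_events in by_host.items():
        host_events.sort(key=lambda e: e.timestamp)
        clusters = []
        for event in host_events:
            # Extend the current cluster while the event falls inside the window
            # that started with the cluster's first event; otherwise start a new one.
            if clusters and event.timestamp - clusters[-1][0].timestamp <= window:
                clusters[-1].append(event)
            else:
                clusters.append([event])
        for cluster in clusters:
            sources = {e.source for e in cluster}
            if len(sources) >= min_sources:
                incidents.append({
                    "host": host,
                    "sources": sorted(sources),
                    "events": [e.description for e in cluster],
                })
    return incidents


incidents = correlate(EVENTS)
for incident in incidents:
    print(incident["host"], incident["sources"])
```

Here only `host-a` is reported, because three different sources observed activity on it within the same window. The events on `host-b` and `host-c` stay below the threshold when viewed alone.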
The role of big data and data science
The only way to minimise false positives and false negatives is to train the AI on a vast amount of data. The most effective cybersecurity solutions on the market use a single graph data store that collects over 1 trillion events every day.
Today, data is collected not only from endpoints but also from the cloud, threat intelligence feeds and third-party sources. This data is then used to identify threat actors and to train the hundreds of machine learning models that predict attacks and detect new, previously unknown ones.
The final piece of the puzzle, and what separates good cybersecurity solutions from great ones, is the human element. While AI excels at well-defined problems and at sifting through huge amounts of data, it rarely has those ‘a-ha’ moments of creative invention that open up new possibilities. This is why specialised threat hunting teams are needed to detect hidden attacks and new techniques that the automated process may have missed. Cybersecurity experts can then continuously tune, feed, adjust and verify the model, making sure it gets stronger with every event and further minimising false positives and false negatives.
Asking the right questions
Luckily, cybersecurity solutions have progressed significantly in the last 15 years, and the technology now incorporates the benefits of both big data and data science. But organisations that want to ensure their enterprise is protected still need to ask the right questions. Some vendors may use buzzwords like AI or big data, so it is crucial to ask what those terms actually mean in the product and whether it will effectively protect your organisation.