SIEM in a Nutshell
SIEM has come a long way since it first came on the scene, about twenty years ago.
It began as a log management tool focused on simple collection and storage to meet compliance, and these use cases are still very relevant as 2020 draws nearer.
Initially, SIEM faced challenges to do with capacity and scale, but big data lake solutions such as Hadoop and Elastic solved the increasing requirements for millions and sometimes billions of event ingestions per day.
Supporting big data and a high volume of events introduced a new challenge to security operations centers. Most SIEM solutions are overwhelmed by the quantity of data points and require a long and expensive “tuning” process before they deliver security value from all the incoming events.
Machine learning and User and Entity Behavior Analytics (UEBA) are trying to reduce the number of rules and move SIEM in the right direction. However, the reality is that security teams still spend too much time creating rules and complicated search patterns in order to find security breaches before the kill-chain is complete.
The Challenges of SIEM Today
Working with security teams and data sources over the years, I see three main challenges not sufficiently addressed by current SIEM solutions:
1) No standard SIEM terminology – vendors describe similar threats coming from security devices differently. Not only does field normalization differ, but the risk level and threat type are usually arbitrary definitions that depend on the SIEM vendor. Moving to the MITRE ATT&CK framework, or any other agreed terminology, in many cases becomes another task that needs to be managed and maintained by the security analyst and the SIEM administrator.
2) Overly noisy sources – while some security vendor alerts have high fidelity, most security events end up being noisy. In many cases, security devices send additional operational signals that have nothing to do with security. In other cases an alert may manifest itself in repeatable ways, which makes it hard to distinguish between a true and a false positive. Since some security event types are noisier than others, filtering and aggregation rules are not one size fits all.
3) No single event tells the story – it is well known that one event coming from a security device is not necessarily equal to an actual security incident. But which types of SIEM events are? And when does an event become an alert? In many cases the answer is more an art than a science. Repeated alerts within a specific period of time are a good example of what security analysts consider an “interesting” condition. However, the story of actual risk could be a combination of multiple phishing attempts on an account followed by an identifiable piece of malware detected on an endpoint associated with the same account.
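The "repeated alerts within a specific period of time" condition from item 3 can be sketched as a sliding-window count per entity and alert type. This is only an illustration: the event tuple layout, the 10-minute window and the threshold of three are assumptions made for the example, not values taken from any SIEM product.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

def repeated_alerts(events, window=timedelta(minutes=10), threshold=3):
    """Return (entity, signature) pairs seen at least `threshold` times within `window`.

    `events` is an iterable of (timestamp, entity, signature) tuples (assumed layout).
    """
    recent = defaultdict(deque)  # (entity, signature) -> timestamps still inside the window
    flagged = set()
    for ts, entity, signature in sorted(events):
        timestamps = recent[(entity, signature)]
        timestamps.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while ts - timestamps[0] > window:
            timestamps.popleft()
        if len(timestamps) >= threshold:
            flagged.add((entity, signature))
    return flagged

t0 = datetime(2019, 11, 1, 9, 0)
events = [
    (t0, "alice", "phishing"),
    (t0 + timedelta(minutes=2), "alice", "phishing"),
    (t0 + timedelta(minutes=5), "alice", "phishing"),
    (t0 + timedelta(minutes=1), "bob", "phishing"),
    (t0 + timedelta(hours=3), "bob", "phishing"),
]
print(repeated_alerts(events))
```

In this sample only the account "alice" crosses the threshold; the two alerts for "bob" are three hours apart and stay below it. A count like this marks a condition as interesting, but it does not by itself tell the multi-step story described above.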
The Main Data Source Types for SIEM
In order to solve the aforementioned challenges without forcing security vendors to change their event logs, we need to identify the main data source types for SIEM. Security and IT operations use cases might be different, but some of the data sources used in today’s organizations can serve both IT monitoring and threat detection.
- Cloud-based solutions – SaaS and cloud-based infrastructure have become an integrated part of the organization’s network. In order to close the security gaps associated with activity in the cloud, we need to monitor logs from cloud-based solutions such as AWS, Office365 and G-Suite. Some cloud providers offer an API connection for retrieving events into the SIEM, and others support syslog data streaming to the SIEM.
- Threat intelligence feeds – known indicators of compromise (IOCs) and cyber threat intelligence (CTI) feeds have become an essential source for SIEM to support early detection of known bad actors, websites, command-and-control IP addresses and malicious file hashes.
- Vulnerability assessment results – since SIEM has become an investigation and threat hunting tool, asset management systems and vulnerability scanning results are considered valuable metadata. Enriching entities with such metadata helps when searching for events associated with vulnerable assets or specific user groups.
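As a minimal sketch of how the last two source types feed into events, the snippet below tags an event with an IOC match and with vulnerability metadata for the host. The field names (`host`, `dest_ip`), the sample IP address (from a documentation-only range), the host name and the finding label are all hypothetical and stand in for whatever a real feed or scanner would provide.

```python
# Hypothetical threat intelligence feed: known command-and-control IP addresses.
IOC_IPS = {"203.0.113.7"}

# Hypothetical vulnerability scan results keyed by asset name.
ASSET_VULNS = {"srv-db-01": ["EXAMPLE-VULN-1"]}

def enrich(event, ioc_ips=IOC_IPS, asset_vulns=ASSET_VULNS):
    """Return a copy of the event with IOC and vulnerability metadata attached."""
    enriched = dict(event)
    enriched["ioc_match"] = event.get("dest_ip") in ioc_ips
    enriched["asset_vulns"] = asset_vulns.get(event.get("host"), [])
    return enriched

event = {"host": "srv-db-01", "dest_ip": "203.0.113.7", "action": "outbound-connection"}
print(enrich(event))
```

With enrichment like this in place, a search for "events on vulnerable assets that touched a known bad IP" becomes a simple filter on two fields.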
At the end of the day, SIEM will turn into the single source of “truth” when organizations evaluate their security posture, existing security gaps and potential risks. SOC analysts and the incident response team will depend on the future SIEM when defining their response and mitigation procedures.
How Can SIEM Meet Today's Challenges?
In order for SIEM solutions to effectively confront today’s challenges, they should provide security classification, trends and anomaly detection as well as cause and effect analysis. Let’s discuss each of these in detail:
1) Classification – the first step in any data processing is normalization and aggregation. The second step should be classification. In order to speak the same language, threat fields should include pre-defined categories. The MITRE ATT&CK framework and the cyber kill-chain are two examples of containers for such classification. Regardless of the model, classification is an essential step before correlation.
2) Anomaly detection – once the threat is classified, we can look at classification trends over time. Specific threat signals coming from multiple sources will present a higher risk regardless of noisy logs and false positives. Trends and seasonality matter as well: a classified threat seen during the weekend might carry a higher weight than similar alerts during weekdays.
3) Cause and effect analysis – cause and effect adds the subject matter expert role into the SIEM analysis. Network scanning will not lead to intrusion without a successful compromise. If the SIEM can search for patterns that link the steps of an intrusion, it can highlight where we should look first. Cause and effect analysis can identify a real fire and increase the risk assigned to the entities involved in sequential tactics and techniques.
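The weekend versus weekday point in item 2 can be illustrated with a simple baseline comparison: keep a separate history of daily counts per threat category and day type, and flag a day whose count sits well above that baseline. This is a sketch under stated assumptions, not how any particular SIEM does it; the category name, the sample counts, the factor `k` and the floor on the standard deviation are all made up for the example.

```python
from statistics import mean, stdev

def is_anomalous(history, today, k=3.0, min_sigma=1.0):
    """Flag today's count if it exceeds the baseline mean by k standard deviations.

    `min_sigma` is an assumed floor so that a near-constant history does not
    turn every tiny increase into an anomaly.
    """
    if len(history) < 2:
        return False  # not enough data to form a baseline
    sigma = max(stdev(history), min_sigma)
    return today > mean(history) + k * sigma

# Hypothetical daily counts of one classified threat category, split by day type.
baselines = {
    ("credential-access", "weekday"): [4, 5, 6, 5, 4, 6, 5],
    ("credential-access", "weekend"): [0, 1, 0, 1, 0, 0, 1],
}

count = 5
print(is_anomalous(baselines[("credential-access", "weekday")], count))
print(is_anomalous(baselines[("credential-access", "weekend")], count))
```

The same count of five is ordinary against the weekday baseline but stands out against the weekend one, which is the seasonality effect described above.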
Detection based on the above methods will result in an easier to manage and more reliable SIEM. Natural language processing (NLP) can help with the classification of unstructured data sources. Vendor event signatures can be mapped to specific categories and basic rules can look at common risks across multiple sources.
As an example, clear text protocols such as telnet, FTP, POP3 and others could easily be filtered into one type of risk.
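A basic rule of this kind might look like the sketch below. The event field names (`protocol`, `dest_port`) and the category label `cleartext-protocol` are assumptions for illustration; the port numbers are the standard well-known ports for these services.

```python
# Well-known ports for the clear text protocols named above.
CLEARTEXT_PORTS = {21: "ftp", 23: "telnet", 110: "pop3"}
CLEARTEXT_PROTOCOLS = set(CLEARTEXT_PORTS.values())

def classify(event):
    """Map an event to a single risk category, whichever vendor reported it."""
    protocol = str(event.get("protocol", "")).lower()
    if protocol in CLEARTEXT_PROTOCOLS or event.get("dest_port") in CLEARTEXT_PORTS:
        return "cleartext-protocol"
    return "unclassified"

# Two events from different (hypothetical) sources end up in the same category.
print(classify({"vendor": "firewall-a", "protocol": "TELNET"}))
print(classify({"vendor": "ids-b", "dest_port": 21}))
print(classify({"vendor": "proxy-c", "protocol": "https", "dest_port": 443}))
```

Because the rule works on the category rather than on vendor-specific signatures, one rule can look at the same risk across multiple sources.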
Cause and effect analysis is a little trickier, since it requires a better understanding of the organization's current threat landscape and of the attack vectors currently used in the wild.
As an example, a phishing attempt followed by lateral movement from the same host should generate a high-level security alert, since the cause and effect signals indicate a potentially successful breach.
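The phishing-then-lateral-movement case can be sketched as a sequence rule over already classified events. The category names, the event fields and the 24-hour window below are assumptions chosen for the example; a real deployment would tune them to its own classification model.

```python
from datetime import datetime, timedelta

def correlate(events, cause="phishing", effect="lateral-movement",
              window=timedelta(hours=24)):
    """Raise a high-severity alert when `effect` follows `cause` on the same host."""
    last_cause = {}  # host -> time of the most recent cause event
    alerts = []
    for ev in sorted(events, key=lambda e: e["time"]):
        host = ev["host"]
        if ev["category"] == cause:
            last_cause[host] = ev["time"]
        elif ev["category"] == effect and host in last_cause:
            if ev["time"] - last_cause[host] <= window:
                alerts.append({
                    "host": host,
                    "severity": "high",
                    "cause_time": last_cause[host],
                    "effect_time": ev["time"],
                })
    return alerts

t0 = datetime(2019, 11, 1, 9, 0)
events = [
    {"time": t0, "host": "ws-17", "category": "phishing"},
    {"time": t0 + timedelta(hours=2), "host": "ws-17", "category": "lateral-movement"},
    {"time": t0 + timedelta(hours=1), "host": "ws-22", "category": "lateral-movement"},
]
for alert in correlate(events):
    print(alert["host"], alert["severity"])
```

Only the host "ws-17" produces an alert here, because it is the only one where the effect follows the cause. The lateral movement on "ws-22" has no preceding phishing event, so on its own it stays a lower-priority signal.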
Some vendors are moving in this direction, and some technologies allow such rules to be implemented. For example, Sigma is a generic signature format for SIEM systems. Sigma is for log files what Snort is for network traffic and YARA is for files. You can write your SIEM searches in Sigma to avoid vendor lock-in.
At the end of the day, when detecting an incident, the security operations team is interested in the entities involved: the endpoint, the user account, or both. With reliable detection of the entity involved in a breach, SOC and IR teams can quickly mitigate the risk by blocking it, isolating it or changing account permissions.
However, to identify the relevant entity, the SIEM must first separate the noise and uncorrelated events from the classified threats and correlated events.
In summary, SIEM may need a face-lift, but it isn't going anywhere. In a world where both the breaches and fines get bigger every year, SIEM systems should work better and faster in order to enable workflows and reduce security analysts’ detection and investigation time. Threat classification, trend analysis and automatic correlation based on subject matter expert knowledge will be the enabling technology of effective modern security operations centers.