Using Machine Learning to Reduce the Alert Fatigue

Having worked as an information security practitioner in a Security Operations Center (SOC), I know firsthand that false positives are a common source of frustration in the profession. As production environments grow to keep pace with developing business needs, the number of alerts generated by security tooling can be expected to increase over time (Alleyne, 2024). To combat alert fatigue, Nik Alleyne offered practical ways for security practitioners to reduce alert frequency in his webinar, Using Machine Learning to Reduce the Alert Fatigue. While some information security professionals believe that attention should go first to true-positive and false-positive alerts, Alleyne argued that identifying true-negative and false-negative alerts provides greater value to security practitioners. In particular, Alleyne asserted that an equilibrium exists between true-negative and false-negative alerting.

Reference Slide 1

Applying Logistic Regression to SIEM Data

Note. The labels "TN", "FP", "FN", and "TP" are used to designate true-negative, false-positive, false-negative, and true-positive.
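
As a minimal sketch of how these four labels are tallied, the Python below compares analyst dispositions with model predictions. The function name, the 0/1 encoding (1 = malicious, 0 = benign), and the sample data are assumptions made for illustration and are not taken from the webinar.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Tally TN, FP, FN, and TP for binary labels (1 = malicious, 0 = benign)."""
    names = {(0, 0): "TN", (0, 1): "FP", (1, 0): "FN", (1, 1): "TP"}
    counts = Counter({"TN": 0, "FP": 0, "FN": 0, "TP": 0})
    for actual, predicted in zip(y_true, y_pred):
        counts[names[(actual, predicted)]] += 1
    return dict(counts)

# Hypothetical data: analyst dispositions (truth) versus model predictions
y_true = [0, 0, 0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TN': 4, 'FP': 1, 'FN': 1, 'TP': 2}
```

In this toy sample, the single FN is the missed detection, and the four TN entries are the kind of recurring benign activity that tuning would aim to remove from view.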

As true-negative alerts are correctly identified, documented, and removed through appropriate tuning of the security information and event management (SIEM) platform, two practical consequences become immediately apparent. First, alert fatigue decreases, as recurring true-negative alerts can safely be removed from view so that security practitioners remain focused on active concerns in the environment (Alleyne, 2024). Second, concern over false-negative alerts decreases in turn, as the equilibrium between true-negative and false-negative alerting can be used to quantify the unknown (Alleyne, 2024). In this light, false-positive alerts can be reframed as a blessing in disguise for information security professionals: awareness is gained at the cost of the time, resources, and patience spent resolving and remediating nonexistent problems (Alleyne, 2024). In most business environments, the practice of information security provides inherent value by ensuring the reliability and predictability of daily operational activities (McCabe & Witte, 2021, p. 28). Under this framework, my professional opinion of false positives has changed from a burden to a measure of confidence, as resolving security alerts helps ensure that production keeps operating.

Reference Slide 2

Optimizing the Model Built from SIEM Data

To give security practitioners a practical way to reduce alert fatigue, Alleyne recommended that they first review the security architecture of their environments. While a SIEM solution should still be used to generate alerts for monitored environments, a security orchestration, automation, and response (SOAR) solution should be overlaid to reduce administrative overhead for security analysts (Alleyne, 2024). Once these two architectural elements are in place, security practitioners can collect, analyze, and clean data from security tools and case management solutions to generate predictive models (Alleyne, 2024). To generate effective machine learning models, Alleyne recommended and explored several data science practices, such as label consolidation, text tokenization, and data vectorization, to create, aggregate, and average multiple predictive models together. If a security practitioner finds the distribution of data to be imbalanced, Alleyne recommended applying the Synthetic Minority Oversampling Technique (SMOTE) to balance it, as imbalanced data skews a model's results and simply duplicating or discarding records in search of balance would not reflect best practice. Once multiple predictive models have been created, aggregated, and averaged together, quantitative measures such as precision, recall, and the F1 score can be used as metrics of efficacy (Alleyne, 2024). The predictive model succeeds when its precision and recall values are drawn from balanced data sources and pose an acceptable risk to the enterprise (Alleyne, 2024).
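
The practices named above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not Alleyne's code: the alert titles, labels, and every function name are hypothetical. `smote_like` is a simplified stand-in for SMOTE that interpolates between a minority sample and one of its nearest minority neighbours, and the TP/FP/FN counts passed to the metrics function are made-up numbers.

```python
import random
from collections import Counter

def tokenize(text):
    """Lowercase an alert title and split it on whitespace."""
    return text.lower().split()

def vectorize(texts):
    """Turn each text into a bag-of-words count vector over a shared vocabulary."""
    vocab = sorted({tok for text in texts for tok in tokenize(text)})
    vectors = []
    for text in texts:
        counts = Counter(tokenize(text))
        vectors.append([float(counts[word]) for word in vocab])
    return vocab, vectors

def smote_like(minority, n_new, k=2, seed=0):
    """Simplified SMOTE: interpolate toward one of the k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical alert titles with analyst dispositions (1 = true positive, 0 = benign)
titles = [
    "failed login from new country",
    "failed login burst on vpn",
    "scheduled backup job started",
    "scheduled patch job started",
    "scheduled backup job finished",
    "dns lookup for update server",
]
labels = [1, 1, 0, 0, 0, 0]

vocab, vectors = vectorize(titles)
minority = [v for v, y in zip(vectors, labels) if y == 1]
synthetic = smote_like(minority, n_new=labels.count(0) - len(minority))
balanced_labels = labels + [1] * len(synthetic)

# Hypothetical evaluation counts, only to show how the metrics are computed
precision, recall, f1 = precision_recall_f1(tp=2, fp=1, fn=1)
print(Counter(balanced_labels), round(precision, 3), round(recall, 3), round(f1, 3))
```

Unlike duplicating minority rows, each synthetic vector lies somewhere between two real minority samples, which is the idea behind SMOTE. In practice a maintained library implementation would likely be preferable to this sketch.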

Reference Slide 3

The Flow of Information from SIEM to SOAR

Resources cited:

Alleyne, N. (2024, November 4). Using Machine Learning to Reduce the Alert Fatigue [Webinar]. SANS. https://www.sans.org/webcasts/using-machine-learning-reduce-alert-fatigue-nov-2024/

McCabe, E., & Witte, G. (2021). CISM Review Manual (16th ed.). ISACA.
