Session

(Un-)Natural Language Processing: Defensive AI in Practice

Language models are currently near the peak of the hype curve. Their application to cybersecurity has been a topic of academic research for a while. In this talk we present the results of our efforts to put one of the many proposed architectures into production. We explain how and where AI can fit into security systems and describe the approach we took. We also discuss the problems we faced and why there is a large gap between what researchers publish and what is feasible and useful in practice.


AI Summary

Disclaimer: This session information was generated with the help of AI. The information has been reviewed and refined by the Swiss Cyber Storm team and the speaker before publishing.
Emanuel Seemann presents research on using AI for defense, specifically through unnatural language processing to improve intrusion prevention systems and web application firewalls. The talk covers the development and testing of AI models that can automatically adapt to new attacks by analyzing abnormal patterns in web traffic. Seemann discusses the challenges of training these models with quality data and the trade-offs between model performance and operational efficiency.

Key facts

  • The project began as an anomaly detection initiative but evolved into a classification problem because of the practical challenges of running anomaly detection in real-time systems (the two framings are contrasted in the sketch after this list).
  • The effectiveness of the AI models was significantly improved by using real customer data instead of the synthetic or limited data sets typically available to researchers.
  • The trade-off between model performance and operational efficiency is a critical consideration in the deployment of AI in cybersecurity.
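
The first key fact can be made concrete with a small example. The sketch below is not the speaker's pipeline: it assumes scikit-learn, and the character-n-gram featurization of raw request lines is an illustrative choice. It only contrasts the two framings, unsupervised anomaly detection (no labels, flag whatever looks unusual) versus supervised classification (labels required, but thresholdable decisions that are easier to run inline in an IPS or WAF).

    # A hypothetical sketch, not the pipeline presented in the talk.
    # Assumes scikit-learn; the character-n-gram featurization of raw
    # request lines is a common but illustrative choice.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.ensemble import IsolationForest
    from sklearn.linear_model import LogisticRegression

    requests = [
        "GET /index.html HTTP/1.1",
        "GET /images/logo.png HTTP/1.1",
        "GET /search?q=shoes HTTP/1.1",
        "GET /search?q=' OR 1=1-- HTTP/1.1",        # SQL injection attempt
        "GET /cgi-bin/../../etc/passwd HTTP/1.1",   # path traversal attempt
    ]
    labels = [0, 0, 0, 1, 1]  # 0 = benign, 1 = malicious (only needed for classification)

    # Character n-grams turn raw request lines into numeric features.
    features = TfidfVectorizer(analyzer="char", ngram_range=(2, 4)).fit_transform(requests)

    # Framing 1: anomaly detection -- no labels, flag whatever scores as unusual.
    anomaly_model = IsolationForest(random_state=0).fit(features)
    print("anomaly scores:", anomaly_model.decision_function(features))

    # Framing 2: classification -- needs labeled data, but yields thresholdable
    # decisions that are easier to run inline in an IPS/WAF.
    classifier = LogisticRegression().fit(features, labels)
    print("attack probabilities:", classifier.predict_proba(features)[:, 1])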

Ideas

  • Using AI to automatically adapt to new attacks by identifying abnormal patterns in web traffic represents a significant advancement in defensive cybersecurity.
  • The quality of data used for training AI models greatly impacts their effectiveness. Access to real, diverse customer data can improve model performance significantly.
  • There is a trade-off between the performance of AI models in cybersecurity (in terms of accuracy and speed) and the operational efficiency of the systems they protect.

Keywords

  • Defensive AI
  • Intrusion Prevention
  • Natural Language Processing
  • Model Training
  • Operational Efficiency

Quotes

  • “if we had a monkey that would read all your logs it could detect attacks by simply reporting anything that looks abnormal”
  • “the researchers have pretty good ideas... they just have access to really bad data”
  • “the feasibility of running big ml models is still an open question”

Recommendations

  • Organizations should consider providing quality data to researchers to improve the effectiveness of defensive AI technologies.
  • Security teams should evaluate the trade-offs between AI model performance and operational efficiency to find an optimal balance for their specific needs (a minimal example of such a measurement follows this list).
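
To make the second recommendation tangible, here is a minimal, hypothetical measurement loop. It assumes scikit-learn models and synthetic data as a stand-in for real traffic features; the point is only the shape of the evaluation: detection quality on one axis and per-request inference latency, the figure that matters for an inline IPS or WAF, on the other.

    # A hypothetical sketch with synthetic data standing in for real traffic
    # features; the model choices are illustrative, not the ones from the talk.
    import time
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    # Synthetic stand-in: 2000 "requests" described by 50 numeric features.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 50))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
    X_train, X_test, y_train, y_test = X[:1500], X[1500:], y[:1500], y[1500:]

    def evaluate(model, n_runs=500):
        """Fit, then report accuracy and mean single-request latency in milliseconds."""
        model.fit(X_train, y_train)
        acc = accuracy_score(y_test, model.predict(X_test))
        start = time.perf_counter()
        for i in range(n_runs):
            # One request at a time, as an inline IPS/WAF would see them.
            model.predict(X_test[i % len(X_test)].reshape(1, -1))
        latency_ms = (time.perf_counter() - start) / n_runs * 1000
        return acc, latency_ms

    for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                        ("random forest", RandomForestClassifier(n_estimators=200, random_state=0))]:
        acc, latency = evaluate(model)
        print(f"{name}: accuracy={acc:.3f}, per-request latency={latency:.2f} ms")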

About the speaker

Emanuel Seemann

Security Researcher, CrowdSec
Emanuel Seemann is a security researcher working for CrowdSec, a French cybersecurity startup. He holds a degree in mathematics and has been hacking around in software starting from a young age. In his current role he is tasked with bringing AI into defensive security, by providing both new methods of intrusion detection and improving alert enrichment to help security personnel in their tasks.