RHC Editorial Team: 15 September 2025 19:52
CrowdStrike today introduced, in collaboration with Meta, a new benchmark suite – CyberSOCEval – to evaluate the performance of artificial intelligence systems in real security operations. Built on Meta's CyberSecEval framework and CrowdStrike's expertise in threat intelligence and AI data for cybersecurity, this open-source benchmark suite helps establish a new framework for testing, selecting, and leveraging large language models (LLMs) in the Security Operations Center (SOC).
Cyber defenders face a huge challenge due to the influx of security alerts and ever-evolving threats. To outperform adversaries, organizations must adopt the latest artificial intelligence technologies. Many security teams are still in the early stages of their AI journey, particularly in using LLMs to automate tasks and increase efficiency in security operations. Without clear benchmarks, it is difficult to know which systems, use cases, and performance standards offer a true AI advantage against real-world attacks.
Meta and CrowdStrike address this challenge by introducing CyberSOCEval, a suite of benchmarks that helps define the effectiveness of AI for cyber defense. Built on Meta's open-source CyberSecEval framework and CrowdStrike's frontline threat intelligence, CyberSOCEval evaluates LLMs in critical security workflows such as incident response, malware analysis, and threat intelligence analysis.
By testing the capability of AI systems against a combination of real-world attack techniques and expert-designed security reasoning scenarios based on observed adversary tactics, organizations can validate performance under pressure and demonstrate operational readiness. With these benchmarks, security teams can pinpoint where AI delivers the most value, while model developers gain a North Star for improving capabilities that increase ROI and SOC effectiveness.
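To make the evaluation loop concrete, the sketch below illustrates, purely as an example, how a team might score an LLM on a multiple-choice incident-response question of the kind such benchmarks contain. This is not the actual CyberSOCEval API: every name here (BenchmarkItem, query_model, the question format) is hypothetical, and the model call is stubbed so the snippet runs on its own.

```python
# Illustrative sketch only: NOT the CyberSOCEval API.
# Shows the general shape of a multiple-choice benchmark harness,
# assuming each item pairs a SOC scenario with one correct option.
from dataclasses import dataclass

@dataclass
class BenchmarkItem:
    scenario: str            # incident description shown to the model
    choices: dict[str, str]  # option letter -> answer text
    answer: str              # letter of the correct option

def query_model(prompt: str) -> str:
    # Hypothetical model call; a real harness would invoke the LLM
    # under test here. Stubbed so the example is self-contained.
    return "B"

def score(items: list[BenchmarkItem]) -> float:
    """Fraction of items where the model picks the correct option."""
    correct = 0
    for item in items:
        options = "\n".join(f"{k}) {v}" for k, v in sorted(item.choices.items()))
        prompt = (
            f"{item.scenario}\n\n{options}\n\n"
            "Respond with the single letter of the best answer."
        )
        reply = query_model(prompt).strip().upper()
        correct += reply.startswith(item.answer)
    return correct / len(items)

if __name__ == "__main__":
    demo = [BenchmarkItem(
        scenario=("An alert shows encoded PowerShell spawned by a Word "
                  "process. Which MITRE ATT&CK technique is most likely?"),
        choices={"A": "T1003 OS Credential Dumping",
                 "B": "T1059.001 PowerShell",
                 "C": "T1486 Data Encrypted for Impact"},
        answer="B",
    )]
    print(f"accuracy: {score(demo):.2f}")
```

A real harness would draw its items from the published benchmark datasets rather than an inline list; the point here is only the shape of the evaluation loop that produces a comparable accuracy score across models.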
“At Meta, we are committed to promoting and maximizing the benefits of open source AI, especially as large language models become powerful tools for organizations of all sizes,” said Vincent Gonguet, Product Director, GenAI at Meta Superintelligence Labs. “Our collaboration with CrowdStrike introduces a new suite of open-source benchmarks to evaluate the capabilities of LLMs in real-world security scenarios. With these benchmarks in place and open to continuous improvement by the security and AI community, we can work more quickly as an industry to unlock the potential of AI in protecting against advanced attacks, including AI-based threats.”
The CyberSOCEval open-source benchmark suite is now available to the AI and security community, which can use it to evaluate model capabilities. The benchmarks are accessible through Meta's CyberSecEval framework.