Redazione RHC: 22 September 2025 15:52
SentinelLABS researchers have discovered what they describe as the first known example of malware with integrated LLM functionality, dubbed MalTerminal. The discovery was presented at LABScon 2025, where the researchers showed a set of related artifacts: a Windows binary, several Python scripts, and auxiliary tools demonstrating how GPT-4 was used to dynamically generate malicious code, such as ransomware or reverse shells.
The analyzed sample contained an API endpoint for the old OpenAI Chat Completions service, which was deprecated in November 2023. This suggests that MalTerminal was developed before that date, which would make it one of the earliest known malware samples with an embedded LLM. Unlike traditional malware, some of its logic is not precompiled but is created at execution time via GPT-4 queries: the operator chooses between “encryptor” and “reverse shell” modes, and the model generates the corresponding code on the fly.
Inside the kit, the researchers also found scripts that replicate the behavior of the binary, as well as an LLM-based security scanner capable of evaluating suspicious Python files and producing reports: a clear example of the dual use of generative models, which serve both offensive and defensive purposes.
The authors also demonstrated a novel methodology for detecting LLM-enabled malware, based on integration artifacts the malware cannot avoid carrying: embedded API keys and hardcoded prompts. By analyzing key prefixes (e.g., sk-ant-api03, used by Anthropic keys) and recognizable fragments found in OpenAI keys, they developed effective rules for large-scale retrohunting. A year-long analysis of VirusTotal data revealed thousands of files containing keys, ranging from accidental developer leaks to malicious samples. In parallel, they tested a prompt-based search technique: extracting text strings from binary files and assessing their intent with a lightweight LLM classifier, which proved highly effective at detecting tools that had previously gone unnoticed.
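The key-based hunting idea can be illustrated with a minimal Python sketch. It is not the researchers' actual rule set: the regular expressions, the minimum lengths, and the function name find_embedded_keys are assumptions made for illustration. The OpenAI pattern relies on the substring T3BlbkFJ (the Base64 encoding of "OpenAI") that appears in many OpenAI keys.

```python
import re

# Assumed marker patterns; illustrative only, not the researchers' actual rules.
KEY_PATTERNS = {
    # Anthropic keys start with the prefix mentioned in the article.
    "anthropic": re.compile(rb"sk-ant-api03-[A-Za-z0-9_\-]{20,}"),
    # Many OpenAI keys embed "T3BlbkFJ", the Base64 encoding of "OpenAI".
    "openai": re.compile(rb"sk-[A-Za-z0-9_\-]{10,}T3BlbkFJ[A-Za-z0-9_\-]{10,}"),
}


def find_embedded_keys(data: bytes) -> list[tuple[str, int, str]]:
    """Return (provider, byte offset, redacted key) for each key-like match."""
    hits = []
    for provider, pattern in KEY_PATTERNS.items():
        for match in pattern.finditer(data):
            token = match.group().decode("ascii")
            # Keep only the first characters so reports never contain a full key.
            hits.append((provider, match.start(), token[:12] + "..."))
    # Sort by offset so results follow the file layout.
    return sorted(hits, key=lambda hit: hit[1])
```

Calling find_embedded_keys on the raw bytes of a file returns an empty list when no key-like string is present. A hit only shows that a key is embedded; as the article notes, that can be an accidental developer leak as easily as a malicious sample.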
The study highlights a crucial paradox: relying on an external model offers attackers flexibility and adaptability, but it also introduces weaknesses. Without valid API keys or stored prompts, the malware loses much of its effectiveness. This opens up new defensive avenues, such as hunting for “prompts as code” and embedded keys, especially in the early stages of these threats’ evolution.
To date, there is no evidence that MalTerminal has been widely deployed: it could be a proof-of-concept or a red team tool. However, the technique itself represents a paradigm shift that affects signatures, traffic analysis, and attack attribution.
SentinelLABS recommends paying greater attention when analyzing applications and repositories: in addition to bytecode and strings, it is now essential to look for textual traces, message structures, and artifacts related to cloud models, where the mechanisms of next-generation malware could be hidden.
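As a rough illustration of looking for textual traces and message structures, the Python sketch below extracts printable strings from a binary and flags those that resemble chat-style prompts. The indicator list and the function names are hypothetical, and the keyword matching is a simple stand-in for the lightweight LLM classification described earlier, not a reproduction of it.

```python
import re

# Runs of 8+ printable ASCII bytes, similar in spirit to the Unix `strings` tool.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{8,}")

# Hypothetical indicators of LLM integration, chosen for illustration only.
INDICATORS = (
    '"role"',            # chat message structure
    '"messages"',        # request body field used by chat-style APIs
    "you are a",         # common opening of a system prompt
    "chat/completions",  # fragment of a chat API endpoint path
)


def extract_strings(data: bytes) -> list[str]:
    """Return every printable ASCII run of at least 8 characters."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


def flag_llm_artifacts(data: bytes) -> list[tuple[str, list[str]]]:
    """Pair each suspicious string with the indicators it contains."""
    flagged = []
    for text in extract_strings(data):
        lowered = text.lower()
        matched = [ind for ind in INDICATORS if ind in lowered]
        if matched:
            flagged.append((text, matched))
    return flagged
```

Keyword matching like this would produce false positives on legitimate software that calls an LLM API, so flagged strings are better treated as candidates for closer review than as verdicts.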
The authors conclude by emphasizing that generating commands and logic at runtime weakens traditional detectors and significantly complicates attack attribution, opening a new chapter in the fight between cyber defense and cybercrime.