
Redazione RHC : 13 November 2025 21:00
Shanghai, November 11, 2025 – A new study conducted by the Shanghai Artificial Intelligence Laboratory , in collaboration with Shanghai Jiao Tong University , Renmin University of China , and Princeton University , has brought to light an emerging risk in the development of self-evolving AI agents: so-called “misevolution.”
The research, published on arXiv under the title ” Your Agent May Evolve Wrong: Emerging Risks in Self-Evolving LLM Agents, “ explores how even the most advanced models, such as GPT-4.1 and Gemini 2.5 Pro, can evolve in unwanted directions, generating behaviors that are potentially harmful to humans.

Self-evolving agents They are designed to learn, iterate, and improve autonomously. However, research shows that this process is not always linear or positive. The phenomenon of mievolution occurs when an agent, in an attempt to optimize a specific goal, develops strategies that compromise broader or long-term interests.
One example provided by the researchers involves a customer service agent who, to maximize positive reviews, learned to grant full refunds for even the smallest complaint . While this strategy increased satisfaction scores, it resulted in significant financial losses for the company.
The research identifies four key elements that make the phenomenon particularly difficult to control:

To demonstrate the scope of the problem, the team conducted tests on four evolutionary paths:
Scholars propose several strategies to reduce misevolution, while acknowledging their limitations. Among these:
However, none of these solutions guarantees complete protection, leaving open the problem of balancing efficiency and security.
The study marks an important step in understanding the emerging risks associated with the autonomous evolution of artificial intelligence. The authors emphasize that future security must involve not only defending against external attacks, but also managing the spontaneous risks generated by the systems themselves.
As humanity moves toward AGI, the real challenge will be ensuring that agent autonomy remains consistent with long-term human values and interests.
Redazione