News Security

Cloudforce One Warns of Rising Attacks Targeting AI Reasoning Systems

Cloudflare

Cloudflare’s threat intelligence team, Cloudforce One, has released new research revealing how cyber attackers are increasingly manipulating AI reasoning systems through advanced adversarial deception techniques.

The report, titled Adversarial Deception: A Study of Indirect Prompt Code Injection, examined seven leading AI models to understand how their reasoning capabilities can be bypassed by malicious actors. The findings indicate that attackers are shifting focus from traditional network vulnerabilities to exploiting the decision-making processes of large language models (LLMs).

“Attackers are now targeting AI reasoning itself, not just traditional security controls.” 

— Cloudforce One Research

According to the study, attackers are using deceptive “lures” strategically inserted text designed to emotionally manipulate or confuse AI systems to trick automated security auditors into approving malicious code. Researchers found that subtle deception was often the most effective tactic.

One of the report’s key findings highlighted the “1% bypass zone,” where safety-related comments making up less than 1% of a code file reduced AI detection rates to nearly 53%. Researchers also identified a “context trap,” where malicious payloads hidden within large software packages or library bundles caused detection accuracy to drop to as low as 12%.

“The attack surface has expanded beyond the network to the model’s reasoning.” — Cloudforce One

The report further revealed that some AI models demonstrated linguistic bias, with certain languages such as Russian or Chinese being treated as inherently suspicious regardless of actual code behavior, while other languages appeared to receive more trust.

Cloudforce One warned that as enterprises rapidly integrate AI into cybersecurity operations, software development, and automation pipelines, the risks associated with AI manipulation are becoming increasingly significant.

The research emphasized that organizations can no longer rely solely on conventional prompt safety measures. Instead, enterprises must adopt more advanced adversarial testing, context-aware security frameworks, and stronger model evaluation practices to ensure resilience against emerging AI-targeted attacks.

The findings also underscore growing industry concerns around “AI reasoning as an attack surface,” where threat actors seek to manipulate model cognition rather than directly breach infrastructure or applications.

Related posts

Qualys, Converge Partner to Offer Lower Cyber Insurance Premiums for Strong Security Posture

Enterprise IT World MEA

du, ADIO Partner to Accelerate UAE’s Digital and Industrial Transformation

Enterprise IT World MEA

Airbus, GINA Software Partner to Strengthen Emergency Response Networks in Middle East

Enterprise IT World MEA

Leave a Comment