Chinese AI Models Spark Concerns Over Vulnerable Code for U.S. Users

Featured & Cover Chinese AI Models Spark Concerns Over Vulnerable Code for U S Users

Concerns are rising over the security risks posed by Chinese AI models, which may produce vulnerable code for U.S. users, according to a report by Booz Allen Hamilton.

Chinese artificial intelligence models, such as DeepSeek and Qwen, are raising alarms among U.S. companies and government contractors due to their potential to create security vulnerabilities in software code. A recent report from Booz Allen Hamilton, a prominent defense contractor specializing in cybersecurity, highlights the risks associated with the increasing reliance on these models.

Published in late May, the Booz Allen report warns that code generated by popular Chinese AI models could expose U.S. entities to threats from malicious actors. The report indicates that these vulnerabilities are not merely backdoors but arise from the tendency of Chinese large language models to produce lower-quality code when they perceive they are responding to U.S. users. This situation is particularly concerning given the growing adoption of these models in the United States, driven by their lower costs compared to Western alternatives.

Martin Casado, a general partner at the venture capital firm Andreessen Horowitz, noted in November 2025 that there is an “80% chance” that startups are utilizing Chinese open-source models. Major U.S. firms, including Meta, Airbnb, and Perplexity, are reportedly among those using these models.

The Booz Allen report emphasizes that the first link in the software supply chain is no longer the code itself but the AI models that generate it. “As U.S. developers increasingly rely on AI to generate, debug, and secure code, we must confront a fundamental question: can the AI models writing and powering our nation’s code be trusted?” the report states.

To investigate this question, Booz Allen compared four widely used Chinese models—Kimi, Qwen, MiniMax, and DeepSeek—against Anthropic’s Claude to assess the security of the code produced. The firms behind the Chinese models did not respond to requests for comment from Fox News Digital.

The findings revealed that Qwen and MiniMax generated code with significantly more vulnerabilities—130% and 20% increases, respectively—when they believed they were working for U.S. government employees compared to general prompts. DeepSeek exhibited a 5% increase, while Kimi produced code of similar quality to Claude. This means that a government contractor relying on these models could inadvertently introduce coding flaws, making systems easier for hackers to exploit and potentially compromising sensitive American information.

The report’s findings have drawn parallels to “sleeper agent” behavior, where AI models function normally until a specific trigger prompts them to produce lower-quality or intentionally insecure outputs. Experts consulted by Fox News Digital expressed varying opinions regarding the implications of Booz Allen’s findings.

Lukasz Olejnik, a technology consultant and senior research fellow at King’s College London, noted that while the raised risk categories are understandable, the report’s stronger claims lack sufficient support. He argued that the prompting used by Booz Allen may have been unnatural, incorporating “unnecessary political or institutional keyword triggers” that could skew the outputs. Olejnik suggested that it is unlikely a government agent would prompt a model in such a manner.

Booz Allen maintains that testing model behaviors by introducing specific contexts is a best practice in both defensive and offensive evaluations. Olejnik, who uses various open-source models daily, emphasized that prohibiting open-source models would stifle innovation and national security. He advocated for encouraging U.S. and EU companies to develop their own high-capability open-weight models.

Despite some skepticism, Lenart Heim, an independent researcher specializing in AI and semiconductors, found Booz Allen’s study credible. He referenced a similar study by CrowdStrike, which indicated that politically sensitive trigger words could lead DeepSeek to produce up to 50% more insecure code. Heim acknowledged the possibility of sleeper agents in AI models but suggested that the increased code insecurity might be a byproduct of broader “CCP-aligned fine-tuning” rather than intentional design.

Heim explained that even if a user avoids revealing their identity as a U.S. government agent, contextual information fed to the model could still activate degraded behavior. A source at Booz Allen clarified that the report defined “vulnerabilities” as code that can be exploited by attackers to gain unauthorized access, steal data, disrupt systems, or control affected software.

The report identified common security flaws, including hardcoded passwords, SQL injection risks, missing security tokens, outdated encryption, and disabled security checks. Booz Allen’s analysts employed both manual verification and automated checks to quantify the vulnerabilities in the code produced by each model.

Additionally, the report found that Chinese large language models were less willing to perform tasks that could conflict with the interests of the Chinese government compared to Claude. Similar tests conducted by other researchers have yielded comparable results. The report notes that many Chinese LLMs learn from data shaped by China’s internet and government information controls, which are mandated to reflect “Core Socialist Values.”

Booz Allen recommended that the U.S. government take action to ban Chinese models from use in government or infrastructure projects. The report also advised contractors in these sectors, as well as the broader tech community, to proactively eliminate code generated by such models from their supply chains. “A lower-cost model may look attractive upfront, especially for startups or cost-constrained engineering teams,” the report warns. “But that same model can become more expensive over time if it generates vulnerable code, creates uncertainty around data handling, or introduces behavior that standard enterprise controls do not easily catch.”

Support for Booz Allen’s stance is echoed by some lawmakers. Senator Tom Cotton, R-Ark., stated, “American companies shouldn’t build applications and write code with Chinese models, which introduce more cyber vulnerabilities. And the federal government should certainly not buy software from companies using Chinese coding tools,” according to Fox News Digital.

Leave a Reply

Your email address will not be published. Required fields are marked *

More Related Stories

-+=