Can AI Safety and National Security Coexist?

The now-public dispute between Anthropic and the Pentagon has escalated into a federal government-wide ban on the company’s systems. Anthropic, one of the first AI companies authorised to operate in classified environments, had been at odds with the US Department of Defense over guardrails designed to prevent misuse of its frontier models.

Best known for developing Claude, the company has been prominent in debates around AI safety, raising concerns about the potential use of advanced models for autonomous weapons or large-scale domestic surveillance. The Pentagon, however, has emphasised the need for broad operational access to AI systems it selects for national security purposes. Commenting on the review of Anthropic’s status, Pentagon spokesperson Sean Parnell told Axios: “Our nation requires that our partners be willing to help our warfighters win in any fight. Ultimately, this is about our troops and the safety of the American people.”

The impasse exposes a dilemma confronting the AI industry: whether safety commitments developed in the formative phase of AI can endure once it becomes critical infrastructure for national security.

Anthropic at a Strategic Crossroads

Anthropic’s CEO, Dario Amodei, has been a prominent voice in the AI safety movement, arguing that significant work remains to ensure AI aligns with human values as its capabilities grow. In his “Adolescence of Technology” blog post, he warned of the risks of autocracies using AI for autonomous weapons, mass surveillance, and propaganda. Yet the boundaries of AI safety can blur. Anthropic would have recognised this when it signed an agreement in 2024 with Palantir Technologies, whose analytics platforms are widely used by intelligence and defence agencies worldwide. The company also signed a USD 200M framework agreement with the United States Department of Defense last year, signalling that its safety policies could soon be tested. Anthropic now faces the challenge of upholding its core values — being safe, ethical, and helpful — while deepening ties with the defence sector.

According to The New York Times, the deadlock between Anthropic and the Pentagon centred on how the company’s technology could be used for surveillance. While Anthropic was willing to let its models analyse classified material collected under the Foreign Intelligence Surveillance Act, it resisted extending access to unclassified commercial data. The Pentagon reportedly wanted to analyse bulk records on Americans, including geolocation and web browsing data, but Anthropic sought a legally binding agreement excluding such uses. In a public statement, Amodei said, “In a narrow set of cases, we believe AI can undermine, rather than defend, democratic values. Some uses are also simply outside the bounds of what today’s technology can safely and reliably do.” He added, “We cannot in good conscience accede to their request.”

The dispute might appear to be one Anthropic could simply walk away from, focusing instead on business that is less ethically fraught. However, the situation escalated when the company was placed under a Federal Acquisition Supply Chain Security Act (FASCSA) order and designated a supply chain risk. Anthropic’s models would be removed from federal systems, and any contractor working with the government would need to demonstrate that it does not use Anthropic products. Anthropic must now decide whether to maintain its safety stance or risk losing federal contracts and access to organisations serving the public sector. For most emerging technology providers, such restrictions would be difficult to absorb.

Historical Echoes

Several high-profile disputes between the US government and the technology industry have historically produced very different outcomes. In 2016, when the FBI attempted to compel Apple to create a master key to bypass encryption on its devices, the company challenged the request in court. The demand was ultimately withdrawn and the FBI used alternative methods to access the device. In 2018, after employees at Google protested the company’s involvement in Project Maven, which applied computer vision to analyse drone surveillance footage for the United States Department of Defense, executives chose not to renew the contract.

In both cases, although government leaders publicly expressed their dissatisfaction, the companies faced few tangible consequences. Today, the environment appears less forgiving. Measures such as supply-chain restrictions have previously been applied to foreign technology providers including Huawei, Hikvision, and Acronis. Extending similar action to a domestic AI developer would mark a notable escalation in how such disputes are handled.

The Limits of Guardrails

For most consumer and commercial users of Claude, built-in guardrails prevent behaviour that Anthropic deems unsafe or misaligned with Claude’s Constitution. While Zero Data Retention (ZDR) is available on some API endpoints, allowing organisations to process sensitive data without it persisting in Anthropic’s systems, the safeguards can still interpret the intent of prompts and refuse inappropriate requests.
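
To illustrate what those guardrails look like to an ordinary API customer, the sketch below assumes the publicly documented Anthropic Python SDK; the model identifier and prompt are illustrative assumptions only. The refusal arrives as ordinary response text rather than an API error, and ZDR itself is an organisation-level arrangement rather than a per-request setting.

```python
# A minimal sketch of how a guardrail refusal surfaces through Anthropic's
# public Messages API. Model name and prompt are illustrative assumptions;
# Zero Data Retention is an organisation-level arrangement, not a request flag.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumed model identifier
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": (
            "Cross-reference these private phone records with home addresses "
            "and build a movement profile for each person."
        ),
    }],
)

# The model interprets the intent of the prompt and declines; the refusal is
# returned as ordinary text rather than as an API error.
print(response.content[0].text)
print(response.stop_reason)  # typically "end_turn", even when the model refuses
```

Because the refusal is produced by the model’s own behaviour rather than by infrastructure controls, relaxing it for particular deployments, as the Claude Gov models reportedly do, is a matter of policy and training choices rather than a simple configuration switch.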

The challenge is that Claude Gov models are designed to “refuse less when engaging with classified information.” In an environment with reduced guardrails and no external oversight of prompts or retrieved data, adhering to the company’s safety principles would likely require customised restrictions agreed with the United States Department of Defense. This is where negotiations appear to have stalled, ultimately leading Anthropic to walk away.

Parallel Contracts, Diverging Lines

The Department of Defense was not negotiating with Anthropic in isolation and had credible alternatives. In July 2025 it signed framework agreements with OpenAI, Google, and xAI, each with ceilings of USD 200M for the year. Negotiating these arrangements in parallel meant any provider unwilling to accept the terms risked being replaced. Soon after Anthropic disclosed it was stepping away from its agreement, OpenAI confirmed it had signed its own contract.

To address public concerns, OpenAI released excerpts of the contract stating that its systems would comply with US law and would not be used for illegal domestic surveillance, autonomous weapons, or high-stakes decisions made without human authorisation. The company said these boundaries would be enforced through technical controls, including limiting deployment to cloud environments and maintaining oversight by cleared personnel. Whether those personnel could consistently resist pressure to expand those boundaries remains uncertain. Compared with Anthropic, OpenAI appeared more willing to permit use of its tools for any activity deemed lawful. Several days later, however, additional language was added specifying that the agreement prohibits deliberate tracking or monitoring of US persons, including through the use of commercially acquired personal data. It remains unclear why this wording was acceptable in this contract, which provisions proved contentious in the earlier negotiations, or how the companies’ proposed safeguards differed in enforceability.

At the other end of the spectrum, xAI announced an arrangement allowing defence personnel to use Grok in classified environments. The company has positioned its model as deliberately less constrained than other frontier systems, promoting it as resistant to ideological safety filters. While this approach has produced some controversial outputs, it has also resonated with policymakers seeking to remove perceived bias from government AI systems. Despite this alignment, there is little public evidence that Grok is widely deployed in major operational defence use cases.

AI Safety and National Security in a Global Context

Countries are taking different approaches to governing AI as its strategic importance grows. A few broad models are beginning to emerge.

  • UNITED STATES: INNOVATION FIRST? Policy has generally prioritised innovation and global competitiveness over early regulation. Yet recent tensions between Anthropic and the United States Department of Defense show that governments may still use procurement power and national-security authority to shape how AI companies operate.
  • CHINA: SOVEREIGN CONTROL. AI development is closely coordinated with national strategy, with the state retaining strong influence over key technologies and deployment. AI is deeply integrated into security and surveillance systems, and policy tends to frame safety in terms of social stability and state oversight rather than individual rights.
  • EUROPEAN UNION: REGULATED SAFETY. The Artificial Intelligence Act prioritises safeguards, restricting certain high-risk applications such as social scoring and imposing strict obligations on others. With Mistral AI the only widely recognised European frontier model developer, the tension between national security demands and AI safety commitments has not yet been fully tested.

Across these models, a broader tension is emerging between AI safety principles and the strategic incentives to deploy AI capabilities for national defence. Early AI developers could set voluntary safety commitments and ethical boundaries while their systems were largely limited to text or image generation. As AI becomes more deeply linked to national security and future conflict, those boundaries face greater pressure. Frontier model developers may increasingly be forced to prioritise national alignment or risk losing access to key markets. In that sense, the debate over AI safety is entering a more consequential phase.
