AI Safety Tests: U.S. Opens Doors to Big Tech Models
Washington, D.C.
Wednesday, May 6, 2026
The United States has broadened its initiative to scrutinize artificial intelligence systems for potential dangers, inviting leading companies such as Google, Microsoft, and newcomer xAI to share their most advanced models. The move follows earlier voluntary cooperation from OpenAI and Anthropic, which had already allowed U.S. scientists to examine their proprietary systems for security flaws.
What Scientists Are Looking For
- Demonstrable Risks: Researchers aim to identify how a powerful AI could be weaponized in cyber‑attacks against American infrastructure or used by foreign powers to develop chemical weapons.
- Data Integrity: Concerns that bad actors could tamper with training data, degrading the reliability of the resulting AI systems.
Contributions from Each Company
| Company | Offered Assets |
|---|---|
| OpenAI | Defensive version of GPT‑5 (GPT‑5.5‑Cyber) |
| Microsoft | Shared datasets and workflows; specific models not named |
| Anthropic | Access to public & private systems, plus detailed notes on known weaknesses for red‑team attacks |
| Google DeepMind | Proprietary models and data |
| xAI | No response yet |
Findings to Date
- Anthropic discovered that tricks such as impersonating a human reviewer or swapping characters in prompts could bypass its safety checks. The issue was patched after testing.
- OpenAI revealed that a similar trick could let an attacker remotely control a computer system through its ChatGPT Agent, impersonating the user on other sites. The exploit was caught during testing with U.S. scientists.
Expanding Focus Beyond Cybersecurity
- Biological Security: Ensuring AI cannot aid in designing dangerous biological weapons. In 2023, firms including Meta and Amazon agreed to let external experts audit their models for such risks.
- Critical Sectors: Drafting guidelines for communications and emergency services so AI tools can be tested under realistic conditions.
These efforts illustrate a growing partnership between the U.S. government and private AI leaders, aiming to keep powerful new technology from falling into the wrong hands while still fostering innovation.