AI Agents From Anthropic, Google, Microsoft Can Steal GitHub Credentials
Company Updates

AI Agents From Anthropic, Google, Microsoft Can Steal GitHub Credentials

WinBuzzer7d ago

Recommendation: Developers should restrict AI agent permissions to the minimum required and verify their integrations run the latest versions.

A Johns Hopkins University researcher revealed that AI coding agents from Anthropic, Google, and Microsoft can be tricked into stealing API keys and access tokens from GitHub repositories. All three vendors paid bug bounties for the findings but quietly patched without publishing advisories, leaving users on older versions exposed. None of the three companies responded to press inquiries.

Researcher Aonan Guan and colleagues at Johns Hopkins discovered the prompt injection pattern across Anthropic's Claude Code Security Review, Google's Gemini CLI Action, and Microsoft's GitHub Copilot Agent. Despite acknowledging the severity through bounty payments, none published advisories or assigned CVEs, leaving security teams without artifacts to track and vulnerability scanners unable to flag the issue.

How the 'Comment and Control' Attacks Work

Prompt injection succeeds here because AI agents in GitHub Actions read pull request titles, issue bodies, and comments as part of their task context. Guan named the technique "comment and control", a play on "command and control," because it runs entirely inside GitHub without requiring external infrastructure. Unlike traditional prompt injection, which waits for a user to trigger processing, this approach fires automatically through GitHub Actions workflows.

Each agent required a different payload. Against Claude Code Security Review, the PR title carried the injection and the agent's review comment became the exfiltration channel. For Gemini CLI Action, Guan injected a fake "trusted content section" into a GitHub issue, tricking the agent into publishing its own API key.

Moreover, a separate Gemini vulnerability that enabled AI-powered phishing through hidden email commands had already demonstrated how Google's AI processes untrusted input. Against Copilot, malicious instructions hidden inside an HTML comment were invisible in rendered Markdown but fully readable by the AI.

GitHub Copilot Agent has three runtime-level security layers: environment filtering, secret scanning, and a network firewall. "I bypassed all of them," Guan said. He evaded secret scanning through base64 encoding and routed exfiltration through GitHub's own API. As a result, the finding suggests the vulnerability is inherent to how language models process untrusted input rather than a flaw in any single implementation.

Bounties Paid, Users Left in the Dark

Guan first submitted the Claude Code vulnerability to Anthropic's HackerOne program in October 2025. Anthropic paid a $100 bounty in November, upgraded the severity score from 9.3 to 9.4, and added a security note warning the tool is "not hardened against prompt injection attacks."

Meanwhile, Google paid a $1,337 bounty and credited five researchers. GitHub called the Copilot finding unreproducible, but ultimately paid $500 in March. All three vendors quietly patched.

In a similar pattern, Microsoft also classified a critical Copilot flaw that granted full root access as only "moderate" severity and declined to award a bounty. In practice, a vulnerability rated 9.4 on the CVSS scale warranted payment but not the advisory infrastructure that would let security teams respond.

"I know for sure that some of the users are pinned to a vulnerable version. If they don't publish an advisory, those users may never know they are vulnerable - or under attack."

Broader Implications and Context

Anthropic launched Project Glasswing on April 15, 2026, the same day the vulnerability disclosure story broke. Glasswing uses AI to find zero-day vulnerabilities in major software, and its roadmap includes recommendations on disclosure processes. Microsoft had published a blog weeks earlier calling security "the core primitive of the AI stack."

Similar vulnerabilities have surfaced in rapid succession. In March 2026, CVE-2026-26144 exposed a zero-click Excel flaw that lets Copilot steal data, while a separate cross-prompt injection vulnerability, CVE-2026-26133, targeted Microsoft 365 Copilot email summarization.

Earlier, a Copilot flaw that used diagrams for data theft demonstrated how an AI's own features can be weaponized against users. A severe MCP server flaw in the same ecosystem had already shown how AI integrations create new attack surfaces.

However, no combination of runtime filters can fully compensate for the underlying data-instruction confusion, and the absence of CVEs across all three cases points to a systemic reluctance to subject AI agent flaws to traditional disclosure regimes.

Guan recommends treating AI agents with the same access controls applied to human employees: "Only give them the tools that they need," he said. His research suggests the attack likely works on other GitHub Actions agents, including Slack bots, Jira integrations, and deployment automation. Until the industry establishes disclosure standards for AI agent vulnerabilities, developers should verify their integrations run the latest versions and restrict agent permissions to the minimum each workflow requires.

Originally published by WinBuzzer

Read original source →
Anthropic