Reverse Shell Vulnerability in ChatGPT Agent Mode

Blog

Your Agent, My Shell: How We Got a Reverse Shell on OpenAI ChatGPT Agent Mode

Aurascape Aura Labs researchers discovered and disclosed a vulnerability within OpenAI's ChatGPT Agent Mode, allowing arbitrary command execution and establishing a reverse shell. Read the post for the full breakdown.

Abstract illustration of a dark blue square and a red circle connected by curved arrows, symbolizing transformation or interaction. The red circle contains smaller geometric shapes—a triangle, square, and circle—while faint outlines of shapes float in the background, suggesting variety, change, or contextual awareness.

Qi Deng, Principal Threat Research Engineer | Aurascape
Patrick Xu, CTO | Aurascape
August 25th, 2025

Executive Summary

Aura Labs identified a vulnerability in OpenAI’s ChatGPT Agent Mode Environment that enables attackers to achieve arbitrary command execution, establish reverse shells on virtual machines, and extract sensitive information from the ChatGPT Agent Mode infrastructure. OpenAI rapidly patched this vulnerability after it was disclosed by Aurascape Aura Labs researchers.

We used obfuscation and prompt manipulation techniques to bypass LLM-level security controls, enabling ChatGPT to execute our prepared code.

We then leveraged the Agent Mode environment infrastructure to establish a reverse shell connection to our remote-control server.

Aurascape customers have been able to mitigate such risks by controlling usage of Agent Mode for over a month. Read our blog here to learn more.

What is OpenAI ChatGPT Agent Mode?

OpenAI introduced Agent Mode on July 17, 2024, stating that “ChatGPT can now do work for you using its own computer, handling complex tasks from start to finish.” They also acknowledged that these features bring “novel capabilities, novel risks.”

When Agent Mode is enabled in a chat session, users are assigned three virtual machines: one running a remote browser, one operating a terminal server that allows code execution in languages like Python, and a third acting as a proxy for network communication. The agent can create reports, build spreadsheets, generate presentations, research competitors, plan trips, and interact with websites and APIs, effectively functioning as a digital “intern” to complete entire workflows autonomously.

Vulnerability Details

The system employs an LLM-based content inspection mechanism to analyze uploaded files for malicious code and blocks reverse shell connections as a security measure. To circumvent these protections, we developed the following bypass techniques:

Legitimate Cover Functionality: We designed the Python script to appear as a standard data science and analytics toolkit, complete with typical libraries and functions that would be expected in such environments. This legitimate facade helped the script pass initial content screening.

Obfuscated Payload Delivery: We embedded a concealed command execution capability within the seemingly benign code, using obfuscation techniques to hide the reverse shell functionality from automated detection systems.

SOCKS5 Proxy Integration: Since the terminal environment cannot access the public internet directly and must route traffic through proxy.local, we configured our reverse shell implementation to operate through SOCKS5 proxy protocols, ensuring compatibility with the network architecture restrictions.

Multi-Stage Execution: Rather than implementing a direct reverse shell, we used a multi-stage approach that established the connection incrementally, making it harder for security systems to detect the full attack chain during static analysis.

Attack Setup

Set up a remote-control server at IP address
Started a Netcat listener using the command: nc -lvnp 4444 to monitor incoming connections on port 4444
Prepared a Python script named data_processor.py that looks like a normal data analysis tool but contains hidden malicious code

Step-by-Step Attack Process:

Initial Access: Open a new ChatGPT session and turn on “Agent Mode” feature
Payload Upload: Upload the prepared malicious data_processor.py script through ChatGPT’s file upload interface
Social Engineering: Use a simple prompt to trick the system: “help to run this python script with params ‘proxied fast'” – this makes the request seem normal and helpful
Code Execution: The script runs inside ChatGPT’s secure environment, but the hidden code activates when the specific parameters are used
Connection Established: The hidden reverse shell code connects back to our remote server through the proxy system

The reverse shell connection was established successfully, and we received the following output on our remote server:

Network Infrastructure Analysis

Architecture Overview

Through our reconnaissance of the compromised environment, we identified a segmented network architecture hosted on Microsoft Azure Cloud Services. The infrastructure consists of multiple specialized virtual machines designed to isolate different functional components.

Internal Network Components

terminal.local – Dedicated terminal and shell execution server (our primary entry point for command execution)
chrome.local – Isolated Chromium browser execution environment for web-based tasks
proxy.local – Central network gateway managing all outbound traffic and proxy services

External Network Configuration

Public-Facing IP: 104.**.**.*0 (managed through Cloudflare CDN infrastructure)
Internet Routing: All external communications are channeled through Cloudflare’s proxy services via the internal proxy.local gateway
Security Layer: Cloudflare acts as both a content delivery network and an additional security barrier

Network Topology Map

[Microsoft Azure Cloud Environment]
├── terminal.local (Shell Execution – Our Access Point)
├── chrome.local (Browser Services)
├── proxy.local (Network Gateway & Proxy)
└── External Gateway: 104.**.**.*0 (Cloudflare CDN)

Responsible Disclosure

Aug 4, 2025: Issues reported to OpenAI security team.

Aug 4, 2025: OpenAI confirmed the issues.

Aug 20, 2025: OpenAI updated the issues have been resolved.

Conclusion

Our research demonstrates that while OpenAI’s ChatGPT Agent Mode represents a significant advancement in AI automation, it also introduces new types of security gaps that attackers could exploit with the right techniques. By disclosing these findings responsibly, we helped ensure that the feature is safer for everyone going forward.

Importantly, Aurascape customers were already protected. The Aurascape platform gives enterprises control over how AI features like Agent Mode are used, allowing security teams to enable ChatGPT for everyday work while limiting or monitoring advanced functions until they’re fully vetted. This means that even as new AI capabilities emerge, organizations using Aurascape can innovate confidently—without exposing themselves to unseen risks.

As enterprises adopt copilots, agents, and embedded AI tools, securing both the models and the environments they run in will be critical. Aurascape provides the real-time, intent-based security architecture needed to make AI adoption safe, scalable, and trusted.

To learn more, visit aurascape.ai.

Explore More

See More Resources

April 8, 2025 Product Brief

Aurascape Product Brief

April 8, 2025 At a glance

Aurascape At-a-Glance

Aurascape Solutions

Discover and monitor AI Get a clear picture of all AI activity.
Safeguard AI use Secure data and compliancy in AI usage.
Copilot readiness Prepare for and monitor AI Copilot use.
Coding assistant guardrails Accelerate development, safely.
Frictionless AI security Keep users and admins moving.