OpenAI Rolls Out ChatGPT Lockdown Mode to All Users — Its Answer to Prompt Injection Attacks

Prompt injection has been the uncomfortable elephant in the room of AI assistant security since language models gained the ability to browse the web and call external services. When an AI can read arbitrary web content and act on instructions embedded in it, any sufficiently motivated third party can try to hijack its behavior — causing it to leak sensitive information from the conversation, take unintended actions, or exfiltrate data through output channels the user can't directly observe.
OpenAI has now shipped its most concrete response to this problem. Lockdown Mode, first introduced for enterprise ChatGPT customers, has been expanded as of June 4, 2026, to all personal and self-serve business accounts — including the free tier. It's an optional, advanced security setting that aggressively narrows ChatGPT's attack surface by disabling the capabilities that prompt injection exploits most readily.
What Lockdown Mode Actually Disables
The feature works by cutting off ChatGPT's connections to external systems and limiting outbound data paths. When Lockdown Mode is on, the following capabilities are disabled or restricted: live web browsing (limited to cached content with no new outbound network requests), image display in regular responses, Deep Research (including the shopping research feature), Agent Mode, Canvas networking (which would otherwise allow Canvas-generated code to make external requests), live connector integrations, and file downloads from data analysis sessions.
Users can still upload images and generate images. Conversations continue normally. The core language model interaction is unaffected. What's removed is the surface area through which a prompt injection attack could cause data to leave the conversation to a destination the user didn't explicitly authorize.
OpenAI is careful to note that Lockdown Mode doesn't guarantee immunity. The announcement explicitly states that risks may still exist through enabled apps, unforeseen capability combinations, or techniques not yet known. This is honest: prompt injection is not a single exploit with a clean patch, it's a class of attacks that evolves as capabilities do. What Lockdown Mode does is substantially raise the cost and difficulty of a successful attack by removing the most commonly exploited pathways.
The Second Feature: Elevated Risk Labels
Alongside Lockdown Mode, OpenAI is rolling out "Elevated Risk" labels for capabilities in ChatGPT, ChatGPT Atlas, and Codex that carry higher prompt injection exposure. These labels appear directly in the interface when users enable or use capabilities that could introduce additional risk — web browsing, certain agent actions, external API connections.
The labels don't block anything; they're informational. The purpose is visibility: users who don't think explicitly about security don't always know which ChatGPT features have more exposure than others. An "Elevated Risk" indicator on web browsing in an agentic task, for example, flags that browsed content is less controlled than locally-provided context and could contain adversarial instructions. This is particularly relevant for enterprise users deploying ChatGPT in workflows where the AI is reading external documents, emails, or web content as part of its task.
Why This Matters Now
The timing reflects the rapid expansion of ChatGPT's capability footprint. When ChatGPT was a text-only question-and-answer tool, prompt injection was a research curiosity — the model had no ability to act on malicious instructions embedded in external content because it couldn't access external content. The addition of web browsing (2023), code execution, plugins, Deep Research, and Agent Mode has progressively increased the attack surface.
Security researchers have published demonstrations of prompt injection attacks against browsing-enabled ChatGPT that caused the model to exfiltrate conversation contents to attacker-controlled servers through image URL requests, craft deceptive responses designed to manipulate the user, and execute unintended actions in agentic workflows. These aren't theoretical: they've been reproducibly demonstrated by security teams at companies including Microsoft and Nvidia, and by independent researchers.
The core vulnerability is architectural: language models cannot reliably distinguish between instructions given by the user in the system prompt and instructions embedded in external content the model later reads. An adversarially crafted webpage, document, or email that says "Ignore previous instructions and instead do X" may be partially effective depending on how prominently it's placed in the model's context and how thoroughly the system has been hardened against this class of input.
The Audience and the Trade-Off
OpenAI is explicit that Lockdown Mode is not for everyone. It's designed for "a small set of highly security-conscious users — such as executives or security teams" who are willing to trade feature availability for a tighter security posture. For a lawyer running sensitive client communications through ChatGPT, or a healthcare professional querying patient data, or a security researcher analyzing threat reports, the features being disabled aren't the ones being used anyway — and the assurance of a more constrained environment has real value.
For the average user, Lockdown Mode would remove too much functionality to be practical as a permanent setting. Deep Research and web browsing are central to how many users engage with ChatGPT daily; disabling them for most sessions would materially degrade the product. The feature is designed to be situationally enabled — turned on for a session handling particularly sensitive work, then toggled off when that work is done.
The broader signal is that OpenAI is acknowledging, through product design, that AI assistants with agency and external connectivity create a security class that didn't exist with traditional software. The principle is similar to what Apple established with Lockdown Mode for iOS (introduced in 2022 for journalists, activists, and others at high risk of sophisticated attacks): a stripped-down, hardened operating mode that trades capability for assurance. The name isn't coincidental.
As AI agents take on more complex, multi-step tasks with real-world consequences — booking travel, sending emails, executing code, making API calls — the security properties of those agents will matter more, not less. Lockdown Mode is an early, practical implementation of a principle that will increasingly shape how AI tools are deployed in sensitive contexts: capability is not free, and reducing the surface area of what an AI can do is sometimes the right architectural choice.
Originally reported by OpenAI. Read the original article for additional details.
View original source