OpenAI released GPT-5.3-Codex-Spark on February 12, 2026, marking its first production deployment on Cerebras Systems hardware instead of Nvidia chips. The specialized coding model delivers more than 1,000 tokens per second under optimal configurations, targeting real-time software development workflows.
The streamlined variant of GPT-5.3-Codex represents OpenAI's initial milestone in its partnership with chipmaker Cerebras, announced in January.
According to the company, Codex-Spark is optimized for near-instant responses when deployed on specialized low-latency hardware, with performance exceeding 1,000 tokens per second in the right configuration.
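As a rough illustration of what that throughput means for interactivity, the arithmetic below converts a decode rate into end-to-end response time. The completion lengths are assumptions chosen for illustration, not OpenAI benchmarks:

```python
# Illustrative latency arithmetic: how long a full response takes at a
# steady decode throughput. The token counts below are assumed typical
# completion lengths, not figures from OpenAI.

def response_time_s(tokens: int, tokens_per_second: float) -> float:
    """Time to stream a complete response at a steady decode rate."""
    return tokens / tokens_per_second

for tokens in (250, 500, 2000):  # assumed short, medium, long completions
    t = response_time_s(tokens, 1000.0)
    print(f"{tokens} tokens at 1,000 tok/s -> {t:.2f} s")
```

At that rate even a long, 2,000-token answer streams in about two seconds, which is why the model is pitched at real-time workflows rather than batch jobs.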
OpenAI describes GPT-5.3-Codex as its most capable agentic coding model, running 25% faster than GPT-5.2-Codex while maintaining enhanced reasoning and professional knowledge capabilities. The system combines frontier coding performance with stronger reasoning abilities, specifically tuned for interactive development workflows like editing specific code sections and running targeted tests.
Codex-Spark supports a 128K context window and defaults to minimal edits, avoiding automatic test execution unless explicitly instructed. The model is designed for fast, interruptible coding tasks where developers need immediate feedback during pair-programming scenarios or iterative development cycles.
Initial access is limited to ChatGPT Pro subscribers through a research preview in the latest Codex app, CLI, and IDE extensions. During the preview, Codex-Spark has separate rate limits that don't count against standard Codex quotas, and the model is currently text-only.

OpenAI is also making the model available via API to a small set of design partners to study integration patterns.
The Cerebras deployment represents a strategic diversification from OpenAI's long-standing reliance on Nvidia infrastructure. While Cerebras hardware won't replace Nvidia's role in training infrastructure, it provides a dedicated tier optimized for responsiveness rather than training throughput. OpenAI plans to bring 750 megawatts of Cerebras-backed compute online in phases through 2028.
Beta users report Codex-Spark effectively contributes to design decisions, testing strategies, and development cycles in pair-programming environments. The model's speed improvements address previous limitations where developers abandoned AI coding assistants due to slow response times during complex tasks.
OpenAI implemented end-to-end latency improvements including a persistent WebSocket connection that reduced overhead by 80% and time-to-first-token by 50%.
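Those two percentages can be put into perspective with back-of-the-envelope arithmetic. Only the 80% and 50% reductions come from OpenAI's announcement; the baseline millisecond figures below are assumptions for illustration:

```python
# Back-of-the-envelope model of the connection changes. Only the 80%
# overhead reduction and 50% time-to-first-token reduction come from
# the announcement; the baseline millisecond values are assumptions.

BASELINE_OVERHEAD_MS = 100.0  # assumed per-request setup cost (TLS/HTTP)
BASELINE_TTFT_MS = 400.0      # assumed baseline time-to-first-token

overhead_ms = BASELINE_OVERHEAD_MS * (1 - 0.80)  # 80% less overhead
ttft_ms = BASELINE_TTFT_MS * (1 - 0.50)          # 50% faster first token

print(f"per-request overhead: {BASELINE_OVERHEAD_MS:.0f} ms -> {overhead_ms:.0f} ms")
print(f"time-to-first-token:  {BASELINE_TTFT_MS:.0f} ms -> {ttft_ms:.0f} ms")
```

The design rationale is standard: a persistent WebSocket pays the connection-setup cost once per session instead of once per request, so the per-message overhead shrinks and the first token arrives sooner.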
Early adopters include technology firms, educational platforms, research institutions, and organizations with substantial software development demands. As usage expands, OpenAI plans to add support for larger models, longer context lengths, and multimodal input capabilities.
The release follows Anthropic's launch of Claude Opus 4.6 earlier this month, intensifying competition in the enterprise AI coding market. Both companies are targeting professional workflows at a time when Wall Street increasingly questions the future of traditional enterprise software development.
OpenAI implemented additional security controls for GPT-5.3-Codex, reflecting concerns about potential misuse of advanced code-writing capabilities. These layered controls may slow some enterprise integrations in the near term as the company balances powerful autonomy against security risk.
OpenAI claims GPT-5.3-Codex (not Codex-Spark) is its first model that was instrumental in creating itself, with the system used to debug its own training runs and accelerate its development.
This self-improvement capability demonstrates how AI systems can increasingly contribute to their own evolution.
Codex-Spark's 25% faster inference with fewer output tokens addresses both cost and usability barriers, positioning it for broader developer adoption than raw-scale models. The efficiency gains arrive as AI companies ship models faster than ever and face enterprise requirements earlier in their lifecycles.
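The two efficiencies compound, as a short worked example shows. Only the 25% speedup is a reported figure; the per-task token count and the 20% token reduction below are hypothetical placeholders:

```python
# Illustrative compounding of the two claimed efficiencies: faster
# decoding and fewer output tokens per task. The baseline token count
# and the 20% token reduction are hypothetical; only the 25% decode
# speedup is a reported figure.

baseline_tokens = 1000   # hypothetical output tokens per task
token_reduction = 0.20   # hypothetical: 20% fewer tokens emitted
speedup = 1.25           # reported: 25% faster decoding

tokens = baseline_tokens * (1 - token_reduction)
# Wall-clock time scales with tokens emitted divided by decode rate.
relative_time = (tokens / baseline_tokens) / speedup

print(f"tokens per task: {baseline_tokens} -> {tokens:.0f}")
print(f"relative wall-clock time: {relative_time:.2f}x of baseline")
```

Under these assumed numbers the task finishes in roughly 64% of the baseline wall-clock time, with the token reduction also cutting per-task cost directly.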
OpenAI will continue adjusting model settings as the technology evolves, maintaining clear user controls when meaningful tradeoffs exist between speed, accuracy, and resource consumption. The company's focus remains on making AI more responsive and practical for complex tasks in software development, research, and industrial automation.