OpenAI has evaluated alternatives to Nvidia's AI chips for over a year, seeking hardware better suited to inference, the work of generating responses from trained models.
The ChatGPT maker's search focuses on chips with large amounts of embedded SRAM, fast on-chip memory that can accelerate response times for specific tasks such as software development and communication between AI models and other software.
Eight sources familiar with the matter told Reuters that OpenAI grew dissatisfied with Nvidia hardware speeds for certain inference operations. The company reportedly seeks alternative chips to handle approximately 10% of its future inference computing needs, targeting startups like Cerebras and Groq that design memory-centric architectures.
"Nvidia CEO Jensen Huang dismissed reports of tension as 'nonsense'"
over the weekend, reiterating plans for a significant investment in OpenAI. The chipmaker had previously signaled intentions to invest up to $100 billion in the AI startup last September, but negotiations have extended for months beyond the expected quick closure.
OpenAI CEO Sam Altman responded on social media, calling Nvidia's products "the best AI chips in the world" and expressing commitment to remaining a major customer. Both companies publicly maintain that Nvidia powers the majority of OpenAI's inference infrastructure and delivers leading performance-per-dollar at scale.
Performance limitations surfaced most visibly in Codex, OpenAI's AI-powered coding product, where speed is critical for professional users. Altman noted on a January 30 call that coding customers "will put a big premium on speed for coding work," confirming the company would address this through its Cerebras partnership, announced last month.
Inference has emerged as a new competitive frontier distinct from model training, where Nvidia maintains dominance with its GPUs. Inference workloads demand chips optimized for memory access rather than raw computation, exposing limitations in general-purpose GPU architectures that rely on external memory.
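The memory bottleneck is easy to see with a back-of-envelope calculation: generating each token requires streaming roughly all of a model's weights through the processor, so single-stream decode speed is capped by memory bandwidth rather than raw compute. The sketch below illustrates the idea; the model size and bandwidth figures are illustrative assumptions, not vendor specifications.

```python
# Back-of-envelope sketch: why autoregressive decoding is memory-bound.
# Each generated token must stream (roughly) all model weights through
# the chip, so peak single-stream tokens/sec is bounded by memory
# bandwidth, not FLOPs. All figures below are hypothetical.

def max_tokens_per_sec(params_billions: float,
                       bytes_per_param: float,
                       mem_bandwidth_tb_s: float) -> float:
    """Upper bound on single-stream decode throughput (tokens/sec)."""
    weight_bytes = params_billions * 1e9 * bytes_per_param
    bandwidth_bytes_per_sec = mem_bandwidth_tb_s * 1e12
    return bandwidth_bytes_per_sec / weight_bytes

# Hypothetical 70B-parameter model served with 8-bit (1-byte) weights.
for label, bw in [("HBM-class GPU, ~3 TB/s", 3.0),
                  ("SRAM-heavy accelerator, ~25 TB/s", 25.0)]:
    print(f"{label}: ~{max_tokens_per_sec(70, 1, bw):.0f} tokens/sec ceiling")
```

In practice, batching, KV-cache traffic, and kernel overheads push real throughput well below these ceilings, but the ratio between the two figures suggests why SRAM-heavy designs like those from Cerebras and Groq can deliver lower per-request latency.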
OpenAI explored partnerships with Cerebras and Groq for inference chips featuring substantial on-chip SRAM. However, Nvidia's $20 billion licensing agreement with Groq reportedly ended OpenAI's discussions with that company, while Cerebras declined acquisition talks and instead struck a commercial deal with OpenAI.
The shift reflects broader industry changes as AI applications move from training to real-time deployment. Competing platforms such as Google's Gemini and Anthropic's Claude lean more heavily on custom chips like Google's tensor processing units, which are designed specifically for inference calculations.
Nvidia has responded by expanding its technology portfolio through licensing deals and talent acquisitions aimed at strengthening its position in inference-focused hardware. The company described Groq's intellectual property as complementary to its product roadmap while hiring away Groq's chip design staff.
OpenAI continues to rely on Nvidia for most inference operations while diversifying its hardware strategy through agreements with AMD, Broadcom, and Cerebras. The company's evolving product roadmap and emphasis on inference speed have reportedly complicated investment negotiations with Nvidia, extending timelines for the proposed $100 billion deal amid investor concerns.
Industry executives view Nvidia's moves as an effort to shore up its technology portfolio amid intensifying competition in inference chips. As AI shifts toward real-time reasoning and large-scale deployment, this emerging battleground represents a critical test of Nvidia's long-held dominance in AI hardware.