OpenAI unveils first custom AI inference chip, Jalapeño
Digest more
AI inference infrastructure investment pulled $1.8 billion in 48 hours as Baseten’s $1.5B round at a $13B valuation and General Intuition’s $300M gaming-data raise confirm open-source model serving is now where venture capital is placing its biggest bets in the AI stack,
The ChatGPT maker spent nine months on its own silicon, and it is aimed at one slice of Nvidia's empire.
Baseten’s latest fundraising will support its multi-model AI inference platform and expand hiring across engineering and operations.
AMD Ryzen AI Max+ 395 runs 235B-parameter models on x86, letting developers cut $440-per-month cloud subscriptions. AMD first-party system starts at USD 3,999; the GMKtec EVO-X2 uses the same chip for under USD 1,
In September 2024, OpenAI previewed a model that behaved differently from the AI systems most people had grown accustomed to. Instead of instantly generating an answer, it appeared to pause, deliberate,
AI inference is undergoing the same transformation that cloud infrastructure experienced a decade ago. Open-weight models have expanded who runs AI — neoclouds, regulated enterprises, and AI-native companies now operate their own GPU fleets across multiple clouds and on-premise environments.
Demand for AI inference compute workloads is increasing rapidly, and Nvidia is dominating the market despite competition from AMD and Broadcom.
Perplexity AI unveiled a hybrid local-cloud inference system at Computex 2026 that automatically routes AI tasks between a user’s device and the cloud, signaling a major shift in enterprise AI, privacy,
Baseten Inc., a startup with a platform for running artificial intelligence inference workloads, is raising $1.5 billion in funding.
Sequential decision problems distill important challenges frequently faced by humans. Through repeated interactions with an uncertain world, unknown statistics need to be learned while balancing exploration and exploitation. Reinforcement learning is a ...
