And Apple is the Hamburglar
Apple’s AI strategy is even more brilliant than Nvidia’s.
It’s built on a three-tier routing system. When you ask Siri to do something, a built-in orchestrator in the operating system decides how complex the task is. According to third-party estimates, around 85% of requests are handled on your Apple device by Apple’s own small, efficient models. (It does things like summarizing text, prioritizing notifications, cleaning up photos, or suggesting replies.) Roughly 12% of all queries get sent to Private Cloud Compute, Apple’s own server infrastructure running Apple’s larger models on Apple silicon in Apple-owned data centers. Only the hardest 3% of queries get routed to an external partner model.
This design lets Apple avoid the ruinous cost of training a frontier model from scratch. Microsoft, Google, Meta, and Amazon each spend tens of billions of dollars per year on GPU clusters, energy, and research teams to build and run trillion-parameter models. Apple doesn’t. Its own models are deliberately small and run on chips Apple already sells you, so the inference cost is basically absorbed into the device. It only needs a frontier model for that tiny sliver of hard queries, which is where the partnership strategy kicks in.
Read the full article here

