Google was calling itself an "AI-first" company beginning in 2016 or 2017. They designed and built TPUs nearly a decade ago and were running neural machine translation (and later Transformer models) in products like Google Translate, but didn't make a big fuss about it; it just made the product way better. People should give Sundar some credit for this. It turned out to be quite prescient, especially the advantage of having your own chips designed specifically for ML.
AI in 2016-2017 was very different from what it has become since ChatGPT. Facebook was also a primarily AI/ML-driven company without anyone realizing it on the front end, but at least they were heavily involved in open source on the back end, long before LLMs went big. In fact they enabled LLMs to go big with things like PyTorch. Google just stumbled into this. DeepMind (also acquired before Sundar took over) came up with the theory, but Google didn't see the potential. What you call "prescience" I call luck. They did not create demand for their own technology the way Nvidia did, by pushing the field ahead with full force. In fact all of Google's most popular products date from before Sundar took over. Even with Gemini they are dragging their heels, sitting far below all the other big model providers when you look at usage.
This is a bizarre accounting of things. FAIR's efforts building PyTorch were seen as experimental and fragile at the time it was released, when TensorFlow was already being used in edge deployments for computer vision and seq2seq. Google was the company that prepped the technology for deployment: it created the theory (the Transformer architecture), implemented it in practice (BERT's bidirectional encoding), and then scaled it (T5), all before GPT-3 was ever released, which itself was three years before Facebook released Llama.
> They did not create demand for their own technology the way Nvidia did, by pushing the field ahead with full force.
They did, though. You are commenting on an eighth-generation TPU product that has been used millions of times a day for the past half-decade. TPUs are also likely to be the hardware serving inference for the Gemini model Apple has reportedly selected to power Siri. TPUs are the cost-conscious inference choice if you've already separated your training and inference workflows.
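For anyone wondering what "separated training/inference workflows" looks like in practice, here's a minimal JAX sketch (JAX being the usual entry point to TPUs). It assumes a Cloud TPU VM with a TPU-enabled JAX install; the two-layer MLP and all the parameter names are stand-ins for illustration, not any real serving stack:

    # Minimal inference-only sketch. On a TPU VM, jax.devices() reports
    # TpuDevice entries and the jitted function runs on the TPU via XLA.
    import jax
    import jax.numpy as jnp

    def forward(params, x):
        # Plain feed-forward pass; no optimizer or gradient state is ever
        # created, which is the point of an inference-only workflow.
        h = jax.nn.relu(x @ params["w1"] + params["b1"])
        return h @ params["w2"] + params["b2"]

    key = jax.random.PRNGKey(0)
    params = {
        "w1": jax.random.normal(key, (512, 1024)) * 0.02,
        "b1": jnp.zeros(1024),
        "w2": jax.random.normal(key, (1024, 128)) * 0.02,
        "b2": jnp.zeros(128),
    }

    infer = jax.jit(forward)               # compiled once, reused per request
    x = jax.random.normal(key, (8, 512))   # a batch of 8 "requests"
    print(jax.devices())                   # e.g. [TpuDevice(id=0), ...] on TPU
    print(infer(params, x).shape)          # (8, 128)

The point is that nothing optimizer-related ever gets allocated: the compiled forward pass is all the accelerator has to hold, which is where the TPU cost argument comes from.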