From NVIDIA GTC Taipei 2026 Keynote | UNCUT · Replay
“Tokens are now in extraordinary demand. Because if you could do this, you're going to want to produce more of it. And because tokens are now profitable units, tokens are now profitable units of revenues. Because it is now profitable, the AI companies want to build a lot more tokens, generate a lot more tokens, build more AI factories, which is the reason why compute demand here in Taiwan has skyrocketed.”
William Dally, Chief Scientist & Senior Vice President of Research at NVIDIA, spoke about AI token economics during NVIDIA GTC Taipei 2026 Keynote | UNCUT on Replay.
William Dally, Chief Scientist and Senior Vice President of Research at NVIDIA, gave a keynote at GTC Taipei in May 2026 in which he discussed the economics of AI factories. He stated that tokens "are now profitable units of revenues" and that compute demand in Taiwan has "skyrocketed" as a result. He estimated that the cost of a single gigawatt-level AI factory has risen from $30 billion to between $60 billion and $100 billion. He argued that "compute is revenues" and "performance per watt is your revenues," and cautioned against choosing an architecture solely on chip cost. Dally also said the number of software engineers is increasing and described claims that AI reduces jobs as "complete nonsense," citing a productivity gain in which $3 trillion of software engineer salaries generates $9 trillion in output.
In a June 2026 lecture at the National University of Singapore, Dally attributed the deep learning revolution to GPU hardware, which enabled algorithms and data that had existed for decades, and stated that progress remains "gated by how fast a GPU is." He contrasted NVIDIA's product development, where "it has to work or we're going out of business," with NVIDIA Research, where the freedom to fail allows for innovations that can achieve "2x or 4x performance per unit energy on the next generation."
He also referenced a 2020 talk in which he described a 317x increase in single-chip inference performance over eight years, a trend he termed "Huang's Law," and credited specialized tensor core instructions with allowing GPUs to achieve efficiency near that of dedicated hardware.
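To put the 317x figure in perspective, it can be converted to an annual rate. The following is a minimal sketch that assumes the gain compounded at a constant rate over the eight years, which is an assumption made here for illustration and not something the talk is reported to claim. The variable names are likewise illustrative.

```python
import math

# Figures quoted in the summary: a 317x gain over eight years.
TOTAL_GAIN = 317.0
YEARS = 8.0

# Assumption: performance grew at a constant compound rate each year.
annual_factor = TOTAL_GAIN ** (1.0 / YEARS)

# Time needed to double performance at that same constant rate.
doubling_time_years = YEARS * math.log(2.0) / math.log(TOTAL_GAIN)

print(f"Implied annual factor: {annual_factor:.2f}x")
print(f"Implied doubling time: {doubling_time_years:.2f} years")
```

Under that assumption the figures work out to roughly 2.05x per year, or a doubling a little faster than once a year.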