NVIDIA (NVDA) GTC Financial Analyst Q&A - (Transcript)




Jensen Huang

Good morning. Nice to see all of you. All right. What's the game plan?


Colette Kress


Okay. Well, we've got a full house and we're thanking you all for coming out for our first in-person in such a long time. Jensen and I are here to kind of really go through any questions that you have, questions from yesterday.


And we're going to have a series of folks in the aisles that you can just reach out to: raise your hand, we'll get to you with a mic, and Jensen and I are here to answer any questions from yesterday.


We thought that would be a better plan for you. I know you have already asked quite a few questions, both last night and this morning, but rather than giving you a formal presentation, we're just going to go through a good Q&A today. Sound like a good plan?


I'm going to turn it to Jensen to see if he wants to add some opening remarks because we have just a quick introduction. We'll do it that way. Okay.


Jensen Huang


Yeah. Thank you. First, great to see all of you. There were so many things I wanted to say yesterday and probably have said -- and wanted to say better, but I got to tell you, I've never presented at a rock concert before. I don't know about you guys, but I've never presented in a rock concert before. The -- I had simulated what it was going to be like, but when I walked on stage, it still took my breath away. And so anyways, I did the best I could.


Next, after the tour, I'm going to do a better job, I'm sure. I just need a lot more practice. But there were a few things I wanted to tell you. Is there a clicker -- oh, look at that. See, this is like spatial computing. It's -- by the way, if you get -- I don't know you'll get a chance, because it takes a little step up, but if you get a chance to see Omniverse in Vision Pro, it is insane. Completely incomprehensible how realistic it is.


All right. So we spoke about five things yesterday and I think the first one really deserves some explanation. I think the first one is, of course, this new industrial revolution. There were two -- there are two things that are happening, two transitions that are happening. The first is moving from general purpose computing to accelerated computing. If you just looked at the extraordinary trend of general-purpose computing, it has slowed down tremendously over the years.


And in fact, we've known that it's been slowing down for about a decade and people just didn't want to deal with it for a decade, but you really have to deal with it now. And you can see that people are extending the depreciation cycle of their data centers as a result. You could buy a whole new set of general purpose servers and it's not going to improve your throughput of your overall data center dramatically.


And so you might as well just continue to use what you have for a little longer. That trend is never going to reverse. General purpose computing has reached its end. We're going to continue to need it and there's a whole lot of software that runs on it, but it is very clear we should accelerate everything we can.


There are many different industries that have already been accelerated, some with very large workloads that we really would like to accelerate more. But the benefits of accelerated computing are very, very clear.


One of the areas that I didn't spend time on yesterday that I really wanted to was data processing. NVIDIA has a suite of libraries that before you could do almost anything in a company, you have to process the data. You have to, of course, ingest the data, and the amount of data is extraordinary. Zettabytes of data being created around the world, just doubling every couple of years, even though computing is not doubling every couple of years.


So you know that data processing, you're on the wrong side of that curve already on data processing. If you don't move to accelerated computing, your data processing bills just keep on going up and up and up and up. And so for a lot of companies that recognize this, AstraZeneca, Visa, Amex, Mastercard, so many, so many companies that we work with, they've reduced their data processing expense by 95%, basically 20 times reduction.
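As a rough check on the arithmetic in the two paragraphs above, here is a minimal sketch. The function names are hypothetical, and the growth model assumes the cost per unit of data stays flat, which is a simplification and not something stated in the talk.

```python
def reduction_factor(percent_saved: float) -> float:
    """Convert a percentage cost reduction into an 'N times cheaper' factor."""
    remaining_fraction = 1.0 - percent_saved / 100.0
    return 1.0 / remaining_fraction


def relative_bill(years: float, data_doubling_years: float = 2.0) -> float:
    """Bill relative to today if data doubles every `data_doubling_years`
    and the cost per unit of data does not improve (an assumption)."""
    return 2.0 ** (years / data_doubling_years)


# A 95% reduction leaves 5% of the original cost, i.e. roughly 20 times cheaper.
print(round(reduction_factor(95.0), 2))

# Under the flat-cost assumption, a decade of doubling every two years
# multiplies the bill by 2**5.
print(relative_bill(10.0))
```

The first function is why "95%" and "20 times" describe the same saving; the second shows why being on "the wrong side of that curve" compounds over time.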


To the point the acceleration is so extraordinary now with our suite of libraries called RAPIDS, that the inventor of Spark, who started a great company called Databricks, and they are the cloud large-scale data processing company, they announced that they're going to take Databricks' Photon engine, which is their crown jewel, and they're going to accelerate that with NVIDIA GPUs.


Okay. So the benefit of acceleration, of course, pass along savings to your customers, but very importantly, so that you can continue to sustainably compute. Otherwise, you're on the wrong side of that curve. You'll never get on the right side of the curve. You have to accelerate. The question is today or tomorrow? Okay. So accelerated computing. We accelerated algorithms so quickly that the marginal cost of computing has declined so tremendously over the last decade that it enabled this new way of doing software called generative AI.


Generative AI, as you know, requires a lot of flops, a lot of flops, a lot of computation. It is not a normal amount of computation, an insane amount of computation. And yet it can now be done so cost-effectively that consumers can use this incredible service called ChatGPT. So, it's something to consider that accelerated computing has dropped, has driven down the marginal cost of computing so far that it enabled a new way of doing something else.


And this new way is software written by computers with a raw material called data. You apply energy to it. There's an instrument called GPU supercomputers. And what comes out of it are tokens that we enjoy. When you're interacting with ChatGPT, you're getting all -- it's producing tokens.


Now, that data center is not a normal data center. It's not a data center that you know of in the past. The reason for that is this. It's not shared by a whole lot of people. It's not doing a whole lot of different things. It's running one application 24/7. And its job is not just to save money, its job is to make money. It's a factory.


This is no different than an AC generator of the last industrial revolution. The raw material coming in then was, of course, water. They applied energy to it and it turned into electricity. Now it's data that comes into it. It's refined using data processing, and then, of course, generative AI models.


And what comes out of it is valuable tokens. This idea that we would apply this basic method of software, token generation, what some people call inference, but token generation. This method of producing software, producing data, interacting with you, ChatGPT is interacting with you.


This method of working with you, collaborating with you, you extend this as far as you like, copilots to artificial intelligence agents, you extend the idea as long as you like, but it's basically the same idea. It's generating software, it's generating tokens and it's coming out of this thing called an AI generator that we call GPU supercomputers. Does that make sense?


And so the two ideas. One is the traditional data centers that we use today should be accelerated and they are. They're being modernized, lots and lots of it, and more and more industries one after another. And so what is a trillion dollars of data centers in the world will surely all be accelerated someday. The question is, how many years would it take to do? But because of the second dynamic, which is its incredible benefit in artificial intelligence, it's going to further accelerate that trend. Does that make sense?


However, the second data center, the second type of data center called AC generators or excuse me, AI generators or AI factories, as I've described it as, this is a brand new thing. It's a brand new type of software generating a brand new type of valuable resource and it's going to be created by companies, by industries, by countries, so on and so forth, a new industry.


I also spoke about our new platform. People are -- there are a lot of speculations about Blackwell. Blackwell is both a chip at the heart of the system, but it's really a platform. It's basically a computer system. What NVIDIA does for a living is not build the chip. We build an entire supercomputer, from the chip to the system to the interconnects, the NVLinks, the networking, but very importantly the software.


Could you imagine the mountain of electronics that are brought into your house, how are you going to program it? Without all of the libraries that were created over the years in order to make it effective, you've got a couple of billion dollars' worth of asset you just brought into your company.


And anytime it's not utilized is costing you money. And the expense is too incredible. And so our ability to help companies not just buy the chips, but to bring up the systems and put it to use and then working with them all the time to make it -- put it to better and better and better use, that is really important.


Okay. That's what NVIDIA does for a living. The platform we call Blackwell has all of these components associated with it that I showed you at the end of the presentation to give you a sense of the magnitude of what we've built. All of that, we then disassemble. This is the hard -- this is the part that's incredibly hard about what we do.


We build this vertically integrated thing, but we build it in a way that can be disassembled later and for you to buy it in parts, because maybe you want to connect it to x86. Maybe you want to connect it to a PCI-Express fabric. Maybe you want to connect it across a whole bunch of fiber, okay, optics.


Maybe you want to have very large NVLink domains. Maybe you want smaller NVLink domains. Maybe you can use Arm, maybe so on and so forth. Does it make sense? Maybe you would like to use Ethernet. Okay, Ethernet is not great for AI. It doesn't matter what anybody says.


You can't change the facts. And there's a reason for that. There's a reason why Ethernet is not great for AI. But you can make Ethernet great for AI. In the case of the Ethernet industry, it's called Ultra Ethernet. So in about three or four years, Ultra Ethernet is going to come, it'll be better for AI. But until then, it's not good for AI. It's a good network, but it's not good for AI. And so we've extended Ethernet, we've added something to it. We call it Spectrum-X, and it basically does adaptive routing. It does congestion control. It does noise isolation.


Remember, when you have chatty neighbors, it takes away from the network traffic. And AI, AI is not about the average throughput. AI is not about the average throughput of the network, which is what Ethernet is designed for, maximum average throughput. AI only cares about when did the last student turn in their partial product? It's the last person. A fundamentally different design point. If you're optimizing for highest average versus the worst student, you will come up with a different architecture. Does it make sense?


Okay. And because AI has all-reduce, all-to-all, all-gather, just look it up in the algorithm, the transformer algorithm, the mixture-of-experts algorithm, you'll see all of it. All these GPUs have to communicate with each other, and the last GPU to submit the answer holds everybody back. That's how it works. And so that's the reason why the networking has such a large impact.
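The "last student" point can be illustrated with a toy simulation: a collective step finishes only when the slowest participant finishes, so the step time is the maximum across GPUs, not the average. This is a minimal sketch, not a model of any real network; the function name, the 10 ms base time, and the exponentially distributed delay are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def simulate(num_gpus: int, num_steps: int, base_ms: float = 10.0,
             mean_delay_ms: float = 1.0, seed: int = 0) -> tuple[float, float]:
    """Return (average per-GPU finish time, average step completion time).

    Each GPU finishes its part of a collective at base_ms plus a random
    delay (hypothetical exponential jitter standing in for congestion).
    The step as a whole completes only when the slowest GPU is done.
    """
    rng = random.Random(seed)
    avg_finish = []
    step_finish = []
    for _ in range(num_steps):
        finish = [base_ms + rng.expovariate(1.0 / mean_delay_ms)
                  for _ in range(num_gpus)]
        avg_finish.append(mean(finish))   # what "average throughput" sees
        step_finish.append(max(finish))   # what the training job actually waits for
    return mean(avg_finish), mean(step_finish)


avg_ms, gated_ms = simulate(num_gpus=1024, num_steps=100)
print(f"average GPU finishes at {avg_ms:.2f} ms, step completes at {gated_ms:.2f} ms")
```

With many participants the gap between the two numbers grows, which is one way to see why a design that optimizes the average and a design that optimizes the worst case can end up looking different.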


Can you network everything together? Yes. But will you lose 10%, 20% of utilization? Yes. And what's 10% to 20% utilization if the computer is $10,000? Not much. But what's 10% to 20% utilization if the computer is $2 billion? It paid for the whole network, which is the reason why supercomputers are paid -- are built the way they are. Okay.
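The comparison above is simple multiplication; a short sketch makes the scale difference explicit. The function name is hypothetical, and treating lost utilization as a straight fraction of system cost is a simplification.

```python
def idle_cost(system_cost_usd: float, utilization_loss: float) -> float:
    """Dollar value of capacity lost when a fraction of the system sits idle."""
    return system_cost_usd * utilization_loss


# The two system prices and the 10% to 20% loss range quoted in the talk.
for cost in (10_000, 2_000_000_000):
    for loss in (0.10, 0.20):
        print(f"${cost:,} system, {loss:.0%} loss -> ${idle_cost(cost, loss):,.0f}")
```

On the $10,000 machine the loss is a few thousand dollars at most; on the $2 billion system the same percentages correspond to hundreds of millions of dollars, which is the sense in which better utilization "paid for the whole network".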


And so anyways, I showed examples of all these different components and our company creates a platform and all the software associated with it, all the necessary electronics, and then we work with companies and customers to integrate that into their data center, because maybe their security is different, maybe their thermal management is different, maybe their management plane is different, maybe they want to use it just for one dedicated AI, maybe they want to rent it out for a lot of people to do different AI with.


The use cases are so broad. And maybe they want to build an on-prem and they want to run VMware on it. And maybe somebody just wants to run Kubernetes, somebody wants to run Slurm. Well, I could list off all of the different varieties of environments and it is completely mind blowing.


And we took all of those considerations and over the course of quite a long time, we've now figured out how to serve literally everybody. As a result, we could build supercomputers at scale. But basically what NVIDIA does is build data centers. Okay. We break it up into small parts and we sell it as components. People think as a result, we're a chip company.


The third thing that we did was we talked about this new type of software called NIMs. These large language models are miracles. ChatGPT is a miracle. It's a miracle not just in what it's able to do, but in the team that put it together so that you can interact with ChatGPT at a very high response rate. That is a world class computer science organization. That is not a normal computer science organization.


The OpenAI team that's working on this stuff is world class, is a world class team, some of the best in the world. Well, in order for every company to be able to build their own AI, operate their own AI, deploy their own AI, run it across multiple clouds, somebody is going to have to go do that computer science for them. And so instead of doing this for every single model, for every single company, every single configuration, we decided to create the tools and tooling and the operations and we're going to package up large language models for the very first time.


And you could buy it. You could just come to our website, download it and you can run it. And the way we charge you is all of those models are free. But when you run it, when you deploy it in an enterprise, the cost of running it is $4,500 per GPU per year. Basically, the operating system of running that language model.
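To put the quoted $4,500 per GPU per year in per-hour terms, here is a minimal sketch. The function names are hypothetical, and the calculation assumes a 365-day year with the GPU licensed around the clock; actual licensing terms are not described in the talk.

```python
LICENSE_USD_PER_GPU_YEAR = 4_500  # figure quoted in the talk
HOURS_PER_YEAR = 365 * 24         # assumption: non-leap year, continuous use


def license_cost_per_gpu_hour() -> float:
    """Annual per-GPU fee spread evenly over every hour of the year."""
    return LICENSE_USD_PER_GPU_YEAR / HOURS_PER_YEAR


def annual_license_cost(num_gpus: int) -> int:
    """Total yearly fee for a deployment of `num_gpus` GPUs."""
    return LICENSE_USD_PER_GPU_YEAR * num_gpus


print(f"${license_cost_per_gpu_hour():.2f} per GPU-hour")
print(f"${annual_license_cost(8):,} per year for an 8-GPU server")
```

Spread over a full year the fee works out to roughly half a dollar per GPU-hour, which is the basis for calling the per-use cost low.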


Okay. And so the per instance, the per-use cost is extremely low. It's very, very affordable. And -- but the benefit is really great. Okay. We call that NIMs, NVIDIA Inference Microservices. You take these NIMs and you're going to have NIMs of all kinds. You're going to have NIMs of computer vision. You're going to have NIMs of speech and speech recognition and text to speech and you're going to have facial animation. You're going to have robotic articulation. You're going to have all kinds of different types of NIMs.


These NIMs, the way that you would use them is you would download one from our website and you would fine-tune it with your examples. You would give it examples. You say the way that you responded to that question isn't exactly right. It might be right in another company, but it's not right in ours. And so I'm going to give you some examples that are exactly the way we would like to have it. You show it your work products. This is the way -- this is what a good answer looks like. This is what a right answer looks like, a whole bunch of them.


And we have a system that helps you curate that, process that, tokenize that, all of the AI processing that goes along with it, all the data processing that goes along with it, fine-tune that, evaluate that, guardrail that, so that your AIs are very effective, number one, and also very narrow.


And the reason why you want it to be very narrow is because if you're a retail company, you would prefer your AI just didn't pontificate about some random stuff, okay. And so whatever the questions are, it guardrails it back to that lane. And so that guardrailing system is another AI. So, we have all these different AIs that help you customize our NIMs and you could create all kinds of different NIMs.
