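Posted on: Xueqiu
Reply to @张小丰: Thomas, the core argument of Jensen Huang's second paragraph is actually this: manual labeling of data is time-consuming and labor-intensive, which used to confine AI to a handful of industries; the recent change is that the cost of manual labeling can now be avoided.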

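1. The keyword in the first half of the second paragraph is transformers, not unsupervised learning. Unsupervised learning, as the pre-training stage of transformers, is mentioned as supporting evidence, not as the thesis.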

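2. In the second half of the second paragraph, Omniverse's 【synthetic generated data】 effectively 【comes with labels built in】, hence 【without having to label data】. Even without manual labeling, you can still produce large amounts of data for supervised learning.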
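
To make point 2 concrete, here is a minimal sketch of why synthetic data "comes with labels built in". It is not Omniverse code and does not use any NVIDIA API; `render_scene` and `make_dataset` are hypothetical names, and the "scene" is just a rectangle drawn on a small grid. The point is only that the generator already knows the ground truth it drew, so the label is free.

```python
import random


def render_scene(rng, size=8):
    """Draw one filled rectangle on a blank grid and return (image, label).

    Hypothetical toy renderer: the label is known by construction because
    the generator chose the rectangle itself. No human annotates anything.
    """
    w = rng.randint(2, size - 1)
    h = rng.randint(2, size - 1)
    x0 = rng.randint(0, size - w)
    y0 = rng.randint(0, size - h)
    image = [
        [1 if (x0 <= x < x0 + w and y0 <= y < y0 + h) else 0 for x in range(size)]
        for y in range(size)
    ]
    label = {"bbox": (x0, y0, w, h)}  # ground truth, produced for free
    return image, label


def make_dataset(n, seed=0):
    """Generate n (image, label) pairs for ordinary supervised training."""
    rng = random.Random(seed)
    return [render_scene(rng) for _ in range(n)]


dataset = make_dataset(1000)
```

Every pair in `dataset` can be fed to a standard supervised detector, which is the sense in which synthetic generation supports supervised learning without a labeling step.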

---

Before Jensen spoke, the CFO had also mentioned transformers:


One of the biggest workloads driving adoption of NVIDIA AI is natural language processing, which has been revolutionized by transformer based models. Recent industry breakthroughs traced to transformers include large language models like GPT-3, NVIDIA Megatron BERT for drug discovery and DeepMind AlphaFold for protein structure prediction.

Transformers allow self-supervised learning without the need for human labeled data. They enable unprecedented levels of accuracy for tasks such as text generation, translation, summarization and answering questions. To do that, Transformers use enormous training data sets and very large networks well into the hundreds of billions of parameters. To run these giant models without sacrificing low inference times, customers like Microsoft are increasingly deploying NVIDIA AI, including our NVIDIA Ampere architecture-based GPUs and full software stack.

//@张小丰: Reply to @张小丰: One, if you recall a couple two, three years ago, deep learning and AI was starting to accelerate in the most computer science deep companies in the world with CSPs and hyperscalers. And -- but just about everywhere else, it was still quite nascent. And there was a couple of reasons for that.
Obviously, the understanding of the technology is not as pervasive at the time. The type of industrial use cases for artificial intelligence requires labeling of data that's really quite difficult. And then now with Transformers, you have unsupervised learning and other techniques, zero-shot learning that allows us to do all kinds of interesting things without having to have human-labeled data. We even have synthetic generated data with Omniverse that helps customers do data generation without having to label data, which is either too costly or, quite frankly, oftentimes impossible.
And so now, the knowledge and the technology has evolved to a place that most of the industries could use artificial intelligence at a fairly effective way and in many industries rather transformative. And so I think, number one, we went from clouds and hyperscalers to all of industries.
Second, we went from training-focused to inference. Most people thought that inference was going to be easy. It turns out the inference is by far the harder. And the reason for that is because there are so many different models and there are so many different use cases and so many quality of service requirements, and you want to run these inference models in a small of a footprint as you can.
The third dimension is that we now have so many different types of configurations of systems that we can go from high-performance computing systems all the way to cloud to on-prem to edge. And then the final concept is really this industrial deployment now of AI that's causing us to be able to in just about every industry, find growth. And so as you know, our cloud and hyperscalers are growing very, very quickly. However, the vertical part, vertical industries, which is financial services and retail and telco and all of those vertical industries have also grown very, very nicely. And so, in all of those different dimensions, our visibility should be a lot better. And then starting a couple of years ago, adding the Mellanox portfolio to our company, we're able to provide a lot more solution-oriented end-to-end platform solutions for companies that don't have the skills and don't have the technical depth to be able to stand up these sophisticated systems. And so, our networking business is growing very, very nicely as well.
Quoted:
2022-06-24 12:47
Some people may not realize that Douyin is already one of the largest AI training projects in the world. AI has long been everywhere.
There are two new data centers that are coming in, you can tell that are different than all of the other four. The new one that’s coming out is what I call an AI Factory. An AI Factory does one th...

All comments

2022-06-25 14:05

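Self-supervised learning (including Transformers) can be regarded as a form of unsupervised learning. How you classify it doesn't really matter. What matters is that nobody has to do the labeling. Replacing the labeling work with a pretext task largely frees people from the manual work of preparing training data.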
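
A rough sketch of what "labeling replaced by a pretext task" means in practice, assuming a masked-token pretext task: hide some tokens of raw text and ask the model to predict them. The function name `make_masked_example`, the `[MASK]` string, and the masking probability are illustrative assumptions, not any particular library's API. The training targets come from the text itself.

```python
import random

MASK = "[MASK]"  # placeholder token; the exact string is an assumption


def make_masked_example(tokens, rng, mask_prob=0.15):
    """Turn raw tokens into an (inputs, targets) pair for a masked-token task.

    The targets are just the original tokens at the masked positions, so the
    "labels" are derived from the data, not written by a human annotator.
    """
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)   # hide the token from the model
            targets.append(tok)   # the hidden token is the training target
        else:
            inputs.append(tok)
            targets.append(None)  # no loss is computed at this position
    return inputs, targets


rng = random.Random(0)
sentence = "transformers allow self-supervised learning without human labeled data".split()
inputs, targets = make_masked_example(sentence, rng, mask_prob=0.3)
```

Any amount of unlabeled text can be turned into training pairs this way, which is why the classification question (unsupervised vs. self-supervised) matters less than the fact that no annotator is involved.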