Amazon, Strong Buy! The Voice Platform Is Officially Born!

Published on: Xueqiu


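CES 2017 (the Consumer Electronics Show in Las Vegas) is drawing to a close. Amid countless dazzling new electronics, the stars included a crop of self-driving companies still at the experimental stage, and Nvidia chief Jensen Huang, who gave the opening keynote. But the brightest star, without question, was the invisible, formless, voice-only Alexa, the voice platform built by Amazon.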

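If 2016 was the year artificial intelligence, led by Nvidia, first came to prominence, then 2017 is the year Amazon's voice assistant formally becomes the next major platform.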



【Three Major Interaction Interfaces】

Graphical
Voice
Eye

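The graphical interface has prevailed for many years and is very familiar to us. On this platform, input comes through the hands. From the PC to the smartphone, it has produced countless big stock-market winners, including HP, Dell, Intel, Microsoft, Apple, and Google.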

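The voice interface has been budding since 2015, and its flagship is Amazon's Alexa. Last year Google and Microsoft joined in, each launching a competing product. CES at the start of 2017 was a pivotal moment: it established the voice interface as a platform. Input shifts from the hands to the mouth, which is faster and more convenient. So which stocks will be the big winners?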

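Eye tracking and facial-expression tracking will be the next interface after that. The technology already exists; the scientist Stephen Hawking, for example, uses an expensive version of it. We have reason to believe that one day it will become widespread enough to serve as an interaction interface.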

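See the pattern? Whatever the interface, each one caters further to humanity's innate laziness and love of convenience: ideally no hands, ideally just a voice, ideally a glance, and you know what I need. The evolution of how humans interact with the world is something humans themselves cannot stop, because we can never hold back technological progress. (Readers, please copy that last sentence to Trump's Twitter a thousand times.)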

【The Beginnings of Alexa】

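We have watched Alexa's journey from the start, so let me take this chance for a brief look back. What was Amazon's original motive for developing this technology? Perhaps just to sell more goods, perhaps just for fun. Whatever it was, Alexa's explosive growth today has gone far beyond what they themselves expected. It shows once again that technology often resembles the history of evolution on Earth: a mutation in some organism's genes may be completely useless, or it may be a giant leap forward. Of course, Alexa's birth was not all chance. Behind her stands Bezos, whose foresight and obsessive standards pushed Alexa's technology to a world-leading level.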

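A few people in American industry (such as Apple's former CEO and founder) foresaw Alexa's grand prospects early on. I have also followed and looked forward to this technology for more than a year. Here are a few pieces I wrote at the time: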

"Don't Miss It! This May Be the Next Trillion-Dollar Platform!"
[web link]

"How Amazon Created Echo, the Product Behind the Next Trillion-Dollar Platform"
[web link]

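In the early days, Alexa integrated only a few hundred functions, and many Americans did not really know what it was. Now, a little over a year later, Alexa has close to 7,000 functions built in. Here is a link with many illustrative pictures: [web link]. There are of course many invisible services that cannot be shown there, such as hailing a ride, playing music on demand, ordering takeout, checking a brokerage account, and so on. At this CES, manufacturers from every industry embraced the Alexa platform with great enthusiasm, which recalls the birth of Apple's platform a few years ago. Today Apple's operating system hosts 2.2 million apps. So what will the future of the Alexa voice platform look like? Use your imagination.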

【Monetization】

Today, Amazon's share price does not even include a valuation for the Alexa platform, because one doubt remains: how will it make money?

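Not monetizing, feeling no urgency to monetize, leaving monetization for later: I think this is Amazon's attitude, as it was the attitude of Apple and Android in their early days. Capturing traffic at the point of entry matters above all, acquiring massive amounts of data matters above all, and letting that data strengthen AI deep learning matters above all. This business model has worked time and again. Even so, Alexa already comes with a hardware product called Echo, which has sold 5 million units in under two years, entered households across the United States, and is expanding into the UK, Germany, and Australia. Over the Christmas just past, Echo sold out, with unit sales up ninefold year over year. At this pace, we will before long see Echo shipments reach 20 million or more. That is Alexa's monetization.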

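Meanwhile, as Alexa is exported as a technology into all kinds of hardware and services, Amazon may collect some licensing fees, though I believe this revenue is small for now. We see Hyundai's premium Genesis cars offering an Alexa voice service that charges users 60 yuan a month, so it is reasonable to guess that fees like this are shared between Amazon and the device maker. The case suggests that Alexa's paths to monetization will be plentiful in the future.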

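For Amazon itself, beyond these two paths, the more pressing question right now is the goal they had when they first developed the technology: selling goods. So how is that actually going?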



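As the chart shows, American users still mostly use Echo for basic, simple functions. It is a new product, after all, and the expansion of its functions has been concentrated in the burst from last year to now. But "add to shopping list", "purchase directly", and "buy paid music" have clearly reached high usage rates of 32% to 45%. To reduce the choices users must make, Amazon even designed a best-value pick called Amazon Choice. A user who cannot be bothered to compare products need only say "I want some AA batteries, recommend one for me", and Alexa will offer that product and place the order directly.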


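When a user of Whirlpool's newest washing machine, which has Alexa built in, suddenly finds the detergent has run out, they only need to call out "Alexa, buy a large pack of laundry detergent", and the purchase is done.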

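This is an immensely powerful ecosystem, built tightly around Amazon's e-commerce. The products it recommends may also carry value as paid-placement advertising. It increases use of Amazon's own payment service, Amazon Pay, and indirectly raises utilization of Amazon's own logistics service, FBA. In the future, how many hundreds of billions of dollars of goods and services will be sold through the voice platform? And how much will Amazon earn from it? Use your imagination.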


You could say that in creating Echo, Amazon has already achieved its original aim. Everything happening now has gone beyond it.

【Speed of Expansion】

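With prospects this good, will Alexa spread around the world as fast as the iPhone did? If so, I can say with confidence that Amazon would soon add a hundred billion dollars to its market value. But the answer is no, not that fast. Alexa is built on the English language and depends on local cloud services, so she will expand first in English-speaking countries and then gradually reach other languages. This process will be slower than the iPhone's.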

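Along the way, the world's two major languages, English and Chinese, account for roughly 35% and 18% of global GDP respectively. It would be a fine result for Alexa just to do well in English-speaking regions first. In China, Baidu seized the moment at this CES to launch its own Echo-like product, named Xiaoyu (小鱼). Baidu may well be the leading winner in the Chinese market. I will not go into detail here.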

【Competition】

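Google's Google Home and Microsoft's Cortana have already become competitors. Competition is unavoidable. Here I must bring up Amazon's cloud service, AWS, which to this day shows an overwhelming advantage, with a market share larger than that of all its competitors combined. Once a leader pulls ahead, the world struggles to catch up, because the leader has not lain down and gone to sleep. This pattern appears often in the technology industry, though of course it is not absolute. We hope Amazon can sustain its lead in voice technology the way it has in cloud services. At this CES, Alexa showed overwhelming strength in both buzz and number of integrations: Google's rival product was built into just one third-party device, while Alexa was built into too many products to count. Notably, China's Huawei built Alexa into its flagship Mate 9 phone, even though the phone runs Google's Android operating system. Huawei's choice reflects what users choose.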

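In addition, because voice search spares users from seeing ads, the voice platform poses a threat to Google's core business. I expect graphical and voice interfaces to coexist for a period, but with the new gradually displacing the old, and the shifting balance between incumbents and newcomers will hit companies' results. I will not go into detail here. We will watch the future with great interest to see whether another "Nokia moment" or "iOS vs Android moment" arrives. That is exactly what makes technological substitution so exciting.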

【The Future】

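How much further can voice technology itself improve? I hope, and indeed believe, that voice recognition technology like Alexa will learn to distinguish human emotions (Amazon is already researching this) and to tell men from women and the old from the young, so that children do not misuse products. Further, it will distinguish each person's voiceprint and so pin down every individual the way a fingerprint does. At that point there will be no need for the word "Alexa" to wake her, because a person's voice alone will be a unique identifier. Voice recognition will serve each person individually: no errors, no passwords, no privacy leaks, only recognition that is yours alone. That will be the ultimate goal of voice platform technology.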

【The Multi-Pillar Amazon Empire】

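Finally, as we have seen, Amazon's multi-pillar business now reaches nearly everywhere: e-commerce, Prime Video, logistics, cloud services, Alexa, and physical stores. In this empire, the cloud pillar alone has earned a $100 billion valuation from the market. Prime membership already brings in 5 billion in revenue, and Prime Video, as a standalone service, has just announced its expansion into 200 countries. Now there is Alexa, which will eventually be valued too. And further out there is Amazon Go, the physical retail store with no checkout process, which will eventually be valued as well.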

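Under the political shock everyone knows about, Amazon and other tech stocks have fallen about 10%, which makes a Strong Buy right now look all the more rational and all the more of an opportunity. Strong Buy, also because it is Amazon, and it is the future. How high will its share price go? Use your imagination.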



$Amazon (AMZN)$ @方舟88

Featured discussion

forcode 2017-01-10 09:14

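Voice-control devices like Amazon Alexa should reach households everywhere within the next ten years and become the voice-control hub for smart homes and smart cars. But their commercial potential is far below that of the smartphone; this may be no more than a small market worth a few billion dollars. As for shopping through a voice platform, I think that is far too unreliable.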

洁恩达生 2017-01-09 23:26

How is $iFlytek (SZ002230)$ supposed to feel about this!

All discussion

2017-01-10 07:03

Voice

2017-01-10 02:02

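$MATTERSIGHT CORP(MATR)$ does voice-based emotion recognition, currently used to help call centers match customers with agents. Because growth has disappointed, the stock has been cut in half this year. But after reading this article, I feel its technology could combine well with voice-recognition bots, and there is room for imagination. The share price is also at a multi-year low…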

2017-01-09 23:53

This week's cover story in The Economist:
Now we’re talking

How voice technology is transforming computing

Like casting a magic spell, it lets people control the world through words alone

 Jan 7th 2017

ANY sufficiently advanced technology, noted Arthur C. Clarke, a British science-fiction writer, is indistinguishable from magic. The fast-emerging technology of voice computing proves his point. Using it is just like casting a spell: say a few words into the air, and a nearby device can grant your wish.

The Amazon Echo, a voice-driven cylindrical computer that sits on a table top and answers to the name Alexa, can call up music tracks and radio stations, tell jokes, answer trivia questions and control smart appliances; even before Christmas it was already resident in about 4% of American households. Voice assistants are proliferating in smartphones, too: Apple’s Siri handles over 2bn commands a week, and 20% of Google searches on Android-powered handsets in America are input by voice. Dictating e-mails and text messages now works reliably enough to be useful. Why type when you can talk?

This is a huge shift. Simple though it may seem, voice has the power to transform computing, by providing a natural means of interaction. Windows, icons and menus, and then touchscreens, were welcomed as more intuitive ways to deal with computers than entering complex keyboard commands. But being able to talk to computers abolishes the need for the abstraction of a “user interface” at all. Just as mobile phones were more than existing phones without wires, and cars were more than carriages without horses, so computers without screens and keyboards have the potential to be more useful, powerful and ubiquitous than people can imagine today.

Voice will not wholly replace other forms of input and output. Sometimes it will remain more convenient to converse with a machine by typing rather than talking (Amazon is said to be working on an Echo device with a built-in screen). But voice is destined to account for a growing share of people’s interactions with the technology around them, from washing machines that tell you how much of the cycle they have left to virtual assistants in corporate call-centres. However, to reach its full potential, the technology requires further breakthroughs—and a resolution of the tricky questions it raises around the trade-off between convenience and privacy.

Alexa, what is deep learning?

Computer-dictation systems have been around for years. But they were unreliable and required lengthy training to learn a specific user’s voice. Computers’ new ability to recognise almost anyone’s speech dependably without training is the latest manifestation of the power of “deep learning”, an artificial-intelligence technique in which a software system is trained using millions of examples, usually culled from the internet. Thanks to deep learning, machines now nearly equal humans in transcription accuracy, computerised translation systems are improving rapidly and text-to-speech systems are becoming less robotic and more natural-sounding. Computers are, in short, getting much better at handling natural language in all its forms (see Technology Quarterly).

Although deep learning means that machines can recognise speech more reliably and talk in a less stilted manner, they still don’t understand the meaning of language. That is the most difficult aspect of the problem and, if voice-driven computing is truly to flourish, one that must be overcome. Computers must be able to understand context in order to maintain a coherent conversation about something, rather than just responding to simple, one-off voice commands, as they mostly do today (“Hey, Siri, set a timer for ten minutes”). Researchers in universities and at companies large and small are working on this very problem, building “bots” that can hold more elaborate conversations about more complex tasks, from retrieving information to advising on mortgages to making travel arrangements. (Amazon is offering a $1m prize for a bot that can converse “coherently and engagingly” for 20 minutes.)

When spells replace spelling

Consumers and regulators also have a role to play in determining how voice computing develops. Even in its current, relatively primitive form, the technology poses a dilemma: voice-driven systems are most useful when they are personalised, and are granted wide access to sources of data such as calendars, e-mails and other sensitive information. That raises privacy and security concerns.

To further complicate matters, many voice-driven devices are always listening, waiting to be activated. Some people are already concerned about the implications of internet-connected microphones listening in every room and from every smartphone. Not all audio is sent to the cloud—devices wait for a trigger phrase (“Alexa”, “OK, Google”, “Hey, Cortana”, or “Hey, Siri”) before they start relaying the user’s voice to the servers that actually handle the requests—but when it comes to storing audio, it is unclear who keeps what and when.

Police investigating a murder in Arkansas, which may have been overheard by an Amazon Echo, have asked the company for access to any audio that might have been captured. Amazon has refused to co-operate, arguing (with the backing of privacy advocates) that the legal status of such requests is unclear. The situation is analogous to Apple’s refusal in 2016 to help FBI investigators unlock a terrorist’s iPhone; both cases highlight the need for rules that specify when and what intrusions into personal privacy are justified in the interests of security.

Consumers will adopt voice computing even if such issues remain unresolved. In many situations voice is far more convenient and natural than any other means of communication. Uniquely, it can also be used while doing something else (driving, working out or walking down the street). It can extend the power of computing to people unable, for one reason or another, to use screens and keyboards. And it could have a dramatic impact not just on computing, but on the use of language itself. Computerised simultaneous translation could render the need to speak a foreign language irrelevant for many people; and in a world where machines can talk, minor languages may be more likely to survive. The arrival of the touchscreen was the last big shift in the way humans interact with computers. The leap to speech matters more.
