In an unmarked office building in Austin, Texas, a handful of Amazon employees in two small rooms are designing two types of microchips for training and accelerating generative AI. These custom chips, Inferentia and Trainium, offer AWS customers an alternative to Nvidia GPUs, which have become difficult and expensive to procure, for training large language models at scale.
“The whole world would like more chips for doing generative AI, whether that’s GPUs or Amazon’s own chips that we’re designing,” Amazon Web Services CEO Adam Selipsky told CNBC in an interview in June. “I think we’re better positioned than anybody else on the planet to supply the capacity that our customers are collectively going to want.”
Still, other companies have moved faster, and invested more, to capture business from the generative AI boom. When OpenAI launched ChatGPT in November, Microsoft gained widespread attention for hosting the viral chatbot and for its reported $13 billion investment in OpenAI. It was quick to add generative AI models to its own products, building them into Bing in February.
That same month, Google launched its own large language model, Bard, and followed with a $300 million investment in OpenAI rival Anthropic.
It wasn’t until April that Amazon announced its own family of large language models, called Titan, along with a service called Bedrock that lets developers use generative AI to enhance their software.
“Amazon is not used to chasing markets. Amazon is used to creating markets,” said Chirag Dekate, vice president analyst at Gartner.
Meta also recently released its own LLM, Llama 2. The open-source ChatGPT competitor is now available for testing in Microsoft’s Azure public cloud.
Chips as a “true differentiator”
In the long term, Dekate said Amazon’s custom silicon could give it an edge in generative AI.
“I think the true differentiation is the technical capabilities they’re bringing to bear,” he said. “Because Microsoft doesn’t have Trainium or Inferentia.”
AWS quietly began producing custom silicon in 2013 with a piece of specialized hardware called Nitro. Amazon told CNBC in August that Nitro is now the highest-volume AWS chip, with at least one in every AWS server and more than 20 million in use overall.
Provided by Amazon
In 2015, Amazon acquired Israeli semiconductor startup Annapurna Labs. Then, in 2018, it launched Graviton, an Arm-based server chip that rivals x86 CPUs from giants such as AMD and Intel.
“Probably high single-digit to maybe 10% of total server sales are Arm, and a good chunk of those are going to be Amazon. So on the CPU side, they’ve done quite well,” said Stacy Rasgon, senior analyst at Bernstein Research.
Also in 2018, Amazon launched its first AI-focused chips, two years after Google announced its first Tensor Processing Unit, or TPU. Microsoft has yet to announce the Athena AI chip it’s reportedly developing in partnership with AMD.
CNBC took a behind-the-scenes tour of Amazon’s chip lab in Austin, Texas, where Trainium and Inferentia are being developed and tested. VP of Product Matt Wood explained the uses for both chips.
“Machine learning is divided into two distinct phases: training machine learning models and then running inference on those trained models,” says Wood. “Trainium provides about a 50% improvement in price-performance compared to other methods of training machine learning models on AWS.”
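The two phases Wood describes can be sketched in a few lines of code. This is a toy illustration, not AWS code: a hypothetical one-parameter model is first trained with gradient descent, then used for inference on new input, mirroring the split that Trainium (training) and Inferentia (inference) are each optimized for.

```python
# Toy sketch of the two machine-learning phases Wood describes: a training
# loop that fits model parameters to data, then inference that applies the
# trained parameters to new inputs. Real generative-AI workloads run these
# same two phases at vastly larger scale on dedicated accelerators.

def train(data, lr=0.01, steps=1000):
    """Training phase: adjust a single weight to minimize squared error."""
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def infer(w, x):
    """Inference phase: apply the already-trained weight to a new input."""
    return w * x

if __name__ == "__main__":
    model = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])  # fits y = 2x
    print(round(infer(model, 5.0), 2))  # a prediction near 10.0
```

Training is the expensive, compute-heavy step done once (or periodically); inference is the cheap, latency-sensitive step repeated for every user prompt, which is why the two get separate chips.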
Trainium first came to market in 2021, following the 2019 release of Inferentia, which is now in its second generation.
Inferentia, meanwhile, allows customers to deliver “extremely low-cost, high-throughput, low-latency machine learning inference, which is all the predictions of when you type in a prompt into your generative AI model; that’s where it all gets processed to give you the response,” said Wood.
For now, though, Nvidia GPUs are still king when it comes to training models. In July, AWS launched new AI acceleration hardware powered by Nvidia H100s.
“Nvidia chips have a massive software ecosystem that’s been built up around them over the last 15 years or so that nobody else has,” said Rasgon. “The biggest winner in the AI space right now is Nvidia.”
Amazon’s custom chips (left to right: Inferentia, Trainium, Graviton) are on display at Amazon’s Seattle headquarters on July 13, 2023.
Joseph Huerta
Leveraging the cloud advantage
However, AWS’ cloud dominance is a major differentiator for Amazon.
“Amazon doesn’t need to make headlines. Amazon already has a very strong cloud install base. They just have to find a way to build on it,” said Dekate.
When choosing among Amazon, Google and Microsoft for generative AI, millions of AWS customers may be drawn to Amazon because they’re already familiar with it and already run their applications and store their data there.
“It’s a question of velocity. How quickly these companies can move to develop these generative AI applications is driven by starting first with the data they have in AWS and using the compute and machine learning tools that we provide,” explained Mai-Lan Tomsen Bukovec, vice president of technology at AWS.
AWS was the world’s largest cloud computing provider in 2022, with 40% market share, according to technology industry researcher Gartner. Although operating income has declined year over year for three consecutive quarters, AWS still accounted for 70% of Amazon’s $7.7 billion in overall operating profit in the second quarter. AWS’ operating margins have historically been far wider than Google Cloud’s.
AWS is also growing its portfolio of developer tools focused on generative AI.
“Let’s rewind the clock to before ChatGPT. It’s not like after that happened we suddenly hurried and came up with a plan, because you cannot engineer a chip in that short a time, let alone build a service like Bedrock in a matter of two to three months,” said Swami Sivasubramanian, vice president of databases, analytics and machine learning at AWS.
Bedrock gives AWS customers access to large language models made by Anthropic, Stability AI and AI21 Labs, as well as Amazon’s own Titan.
“We don’t believe one model is going to rule the world. We want our customers to have state-of-the-art models from multiple providers, so they can pick the right tool for the right job,” said Sivasubramanian.
Amazon employees wear jackets branded with AWS’s Chip Inferentia as they work on custom AI chips at the AWS Chip Labs in Austin, Texas, July 25, 2023.
Katie Tarasoff
One of Amazon’s newest AI offerings is AWS HealthScribe, a service announced in July that uses generative AI to help doctors draft patient visit summaries. Amazon also has SageMaker, a machine learning hub that offers algorithms, models and more.
Another big tool is the coding companion CodeWhisperer, which Amazon says enables developers to complete tasks 57% faster on average. Last year, Microsoft also reported productivity gains from its own coding companion, GitHub Copilot.
In June, AWS announced a $100 million generative AI innovation “center.”
“We have a lot of customers saying, ‘I want to do generative AI,’ but they don’t necessarily know what that means in the context of their own businesses. So we’re bringing in solutions architects, engineers, strategists and data scientists to work with them one on one,” said AWS CEO Selipsky.
So far, AWS has focused mostly on tools rather than building a ChatGPT competitor, but a recently leaked internal email shows that Amazon CEO Andy Jassy is directly overseeing a new central team building expansive large language models.
On Amazon’s second-quarter earnings call, Jassy said a “substantial amount” of AWS’ business is now driven by AI and the more than 20 machine learning services the company offers. Customer examples include Philips, 3M, Old Mutual and HSBC.
As generative AI has exploded, so have security concerns from companies worried that their employees are feeding proprietary information into the training data used by publicly available large language models.
“I can’t tell you how many Fortune 500 companies I’ve talked to that have banned ChatGPT. With our approach to generative AI and our Bedrock service, anything you do through Bedrock runs in your own isolated virtual private cloud environment; it’s encrypted, and the same AWS access controls apply,” said Selipsky.
For now, Amazon is only accelerating its generative AI efforts, telling CNBC that it currently has “over 100,000” customers using machine learning on AWS. That’s just a fraction of AWS’ millions of customers, but analysts say that could change.
“What we don’t see is enterprises saying, ‘Oh, wait a minute, Microsoft is so ahead in generative AI, let’s just switch our infrastructure strategies and migrate everything to Microsoft,’” Dekate said. “If you’re already an Amazon customer, chances are you’re going to explore the Amazon ecosystem quite extensively.”
— CNBC’s Jordan Novet contributed to this report.