With a number of new features, generative AI is undoubtedly the protagonist of Amazon Web Services re:Invent 2023

Mondo Technology Updated on 2024-01-28

Amazon Web Services is always refactoring itself to drive innovation, and generative AI was undoubtedly the protagonist of this year's re:Invent. This was evident even before the event began: Jeff Barr, VP and Chief Evangelist of Amazon Web Services, couldn't wait to publish a hands-on tutorial about PartyRock ahead of re:Invent 2023.

And that is exactly what happened. In Las Vegas, everyone from individual developers to enterprise architects and university professors, from traditional industries to innovative startups, was talking about how Amazon Web Services is refactoring itself around generative AI.

Amazon Web Services debuts its generative AI technology stack.

At re:Invent 2023, Amazon Web Services presented its generative AI technology stack for the first time: an infrastructure layer responsible for training and inference, a middle tool-and-service layer for fine-tuning and adapting models, and an upper layer for building generative AI applications. By continuously rebuilding all three layers, Amazon Web Services can provide users with more cost-effective and secure technologies and services.

Amazon Web Services CEO Adam Selipsky said, "Our unique generative AI stack provides customers with a significant advantage over other clouds. Not all competitors choose to innovate at every layer."

In-house chips plus partnerships: continuous innovation at the infrastructure layer.

Generative AI doesn't create value on its own; it needs hardware, and cost-effective infrastructure is key to building generative AI applications. Adam Selipsky emphasized the importance of innovation in this area in an interview: "The next generation of AI workloads is very compute-intensive, so price/performance is absolutely critical."

The new-generation Amazon Trainium2 chip was released.

Amazon Web Services has already proven the value of innovation in this field through multiple generations of its self-developed Graviton, Trainium, and Inferentia chips. This year, Amazon Web Services delivered the new Amazon Graviton4 and Amazon Trainium2 chips as promised. The Amazon Trainium2 chip is purpose-built for high-performance training of foundation models, improving performance by up to 4x, memory by 3x, and energy efficiency by up to 2x compared with the previous generation.

Trainium2 instances can scale up to 100,000 chips in EC2 UltraClusters, providing up to 65 exaflops of on-demand computing power and training foundation models (FMs) and large language models (LLMs) in a fraction of the time.

Amazon Web Services revealed that at this level of scale, training a large language model with 300 billion parameters will be shortened from months to weeks. Anthropic, a star generative AI company, plans to use Trainium2 to train the next generation of its Claude models.

First to launch NVIDIA GH200 NVL32 instances.

Also in the spotlight is the collaboration with NVIDIA. Jensen Huang, founder and CEO of NVIDIA, and Adam Selipsky announced the expansion of their strategic collaboration to jointly launch advanced infrastructure, software and services to drive customer innovation in generative AI. The cooperation includes:

Amazon Web Services launches the first cloud AI supercomputer, combining the NVIDIA GH200 Grace Hopper Superchip with Amazon EC2 UltraCluster scalability

NVIDIA DGX Cloud, the first to feature the NVIDIA GH200 NVL32, will offer this AI training-as-a-service on Amazon Web Services

NVIDIA and Amazon Web Services are partnering on Project Ceiba to build the world's fastest GPU-powered AI supercomputer, the latest NVIDIA DGX Cloud supercomputer, for NVIDIA's own AI research and model development

New Amazon EC2 instances with NVIDIA GH200, H200, L40S, and L4 GPUs boost performance for generative AI, HPC, design, and simulation workloads

NVIDIA software running on Amazon Web Services, including the NeMo LLM framework, NeMo Retriever, and BioNeMo, accelerates generative AI development for applications such as custom models, semantic search, and new drug discovery.

This means that Amazon Web Services will be the first cloud vendor to offer the NVIDIA GH200 Grace Hopper Superchip in the cloud and will host NVIDIA DGX Cloud, NVIDIA's AI training-as-a-service, on its platform. In addition, the two companies will work together on Project Ceiba to build the world's fastest GPU-powered AI supercomputer, as well as more cloud instances based on NVIDIA chips.

Amazon Bedrock releases new features, evolving the tool and service layer.

As a playground for building shareable generative AI applications based on Amazon Bedrock, PartyRock is just an appetizer for Re:Invent. Behind this, the idea that everyone can build is what excites all developers.

Amazon Bedrock, the culmination of this spirit, now offers nearly all the industry-leading models alongside the Amazon Titan family, including the latest versions such as Anthropic Claude 2.1, Meta Llama 2 70B, and Stability AI Stable Diffusion XL 1.0, as well as a wide range of features needed to build generative AI.

Dr. Swami Sivasubramanian, vice president of data and artificial intelligence at Amazon Web Services, summed up the value of Amazon Bedrock's innovation: "Industries are integrating generative AI into their businesses, but there is no single model that fits all. With Amazon Bedrock, customers can choose whichever model suits them for rapid innovation."

Amazon Bedrock is getting two major feature updates.

At re:Invent 2023, Amazon Bedrock received comprehensive updates, including model fine-tuning, retrieval-augmented generation (RAG), and continued pre-training for Amazon Titan models, as well as the official launch of Agents and Guardrails, further lowering the barrier to building generative AI applications.
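The model choice described above is exposed through the Bedrock runtime API. The sketch below is a minimal illustration using boto3, assuming AWS credentials and access to a Claude 2 model have been set up; the region and prompt text are placeholders, while the Human/Assistant request format follows Bedrock's documented convention for the Claude 2 model family:

```python
import json

# Model ID as listed in the Bedrock console; swap in whichever model you enabled.
CLAUDE_MODEL_ID = "anthropic.claude-v2"

def build_claude_request(prompt: str, max_tokens: int = 300) -> str:
    """Build the JSON request body Claude 2 expects on Bedrock
    (this model family requires the Human/Assistant turn format)."""
    body = {
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
        "temperature": 0.5,
    }
    return json.dumps(body)

def invoke(prompt: str) -> str:
    """Send the prompt to Bedrock; requires AWS credentials and model access."""
    import boto3  # only needed for the actual network call
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(
        modelId=CLAUDE_MODEL_ID,
        body=build_claude_request(prompt),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())["completion"]

if __name__ == "__main__":
    # Building the request body needs no credentials, so it can be inspected locally.
    print(build_claude_request("Summarize re:Invent 2023 in one sentence."))
```

Because every Bedrock model sits behind the same `invoke_model` call, swapping models is largely a matter of changing the model ID and request body.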

With Agents, developers can easily enable generative AI applications to perform multi-step tasks, such as processing sales orders across company systems and data sources, using natural language instructions, without having to engineer prompts, manage session context, or manually orchestrate systems.
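Calling a configured agent looks roughly like the sketch below, a hedged illustration using boto3's `bedrock-agent-runtime` client. The agent and alias IDs are hypothetical placeholders you would get after creating an agent in the Bedrock console, and a real call requires AWS credentials; the reply arrives as a chunked event stream that Bedrock manages alongside the session context:

```python
import uuid

def collect_agent_reply(completion_events) -> str:
    """Concatenate the chunked event stream returned by invoke_agent
    into a single reply string, skipping non-chunk events such as traces."""
    parts = []
    for event in completion_events:
        chunk = event.get("chunk")
        if chunk:
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)

def ask_agent(agent_id: str, alias_id: str, text: str) -> str:
    """Invoke a Bedrock agent (requires credentials and an existing agent)."""
    import boto3  # only needed for the actual network call
    client = boto3.client("bedrock-agent-runtime")
    response = client.invoke_agent(
        agentId=agent_id,            # hypothetical ID from the Bedrock console
        agentAliasId=alias_id,       # hypothetical alias ID
        sessionId=str(uuid.uuid4()),  # Bedrock tracks session context server-side
        inputText=text,
    )
    return collect_agent_reply(response["completion"])
```

The session ID is what lets the service, rather than the developer, carry the conversation state across the multi-step task.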

On the security side, to make generative AI applications safer to use, developers can leverage Guardrails to deliver a consistent level of AI safety across all applications regardless of the underlying model, enforcing key policies and rules in a simplified way and applying protections across models. This is not limited to Amazon Web Services' Titan models; it also applies to the other models on Bedrock.
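Attaching a guardrail at invocation time can be sketched as below; this is an assumption-laden illustration in which the guardrail ID is a hypothetical placeholder for one created beforehand in the Bedrock console, and the actual `invoke_model` call would require credentials:

```python
import json

def guarded_invoke_kwargs(model_id: str, body: dict,
                          guardrail_id: str,
                          guardrail_version: str = "DRAFT") -> dict:
    """Assemble invoke_model keyword arguments with a guardrail attached.
    The guardrail applies the same policies regardless of which model is named."""
    return {
        "modelId": model_id,
        "body": json.dumps(body),
        "contentType": "application/json",
        "accept": "application/json",
        "guardrailIdentifier": guardrail_id,
        "guardrailVersion": guardrail_version,
    }

# Usage (requires AWS credentials and an existing guardrail):
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.invoke_model(**guarded_invoke_kwargs(
#     "anthropic.claude-v2",
#     {"prompt": "\n\nHuman: ...\n\nAssistant:", "max_tokens_to_sample": 200},
#     guardrail_id="gr-example123",  # hypothetical guardrail ID
# ))
```

Because the guardrail is named in the call rather than baked into a model, the same policy set can be reused across Titan and third-party models alike.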

Diya Wynn, Senior Practice Manager of Responsible AI at Amazon Web Services, noted in an interview, "Security is paramount. When we think about AI, responsible AI should be designed in from the start. If we treat these factors as an afterthought, it can lead to catastrophic effects."

Amazon Bedrock was designed from the ground up with responsibility in mind. Customer data is encrypted in transit and at rest, so all valuable customer data is always secure and private, and cannot be used by Amazon Web Services or third-party providers to train the underlying models.

At present, Amazon Bedrock has served tens of thousands of users, and enterprises such as Salesforce and MongoDB have taken the lead in using Amazon Bedrock to apply generative AI.

Amazon Q was released, rounding out the application layer.

Amazon Q is one of the most exciting launches at re:Invent 2023. Unlike ChatGPT, a general-purpose consumer chatbot, Amazon Q is designed for work scenarios, providing employees with information and advice to help them streamline tasks, accelerate decision-making, and solve problems, thereby driving enterprise innovation. This not only marks Amazon Web Services' official entry into the chatbot race, but also kicks off the era of enterprise-grade generative AI.

Amazon Q changes the way developers and IT professionals build and deploy applications and workloads on Amazon Web Services. Customers can access the chat interface through the Amazon Web Services Management Console, documentation pages, IDEs, Slack, or other third-party conversation applications, and business content is never used to train the underlying models. Dr. Swami Sivasubramanian said, "Amazon Q is a powerful addition to the application layer of our generative AI stack, opening up new possibilities for every organization."

Amazon Q offers a high degree of flexibility. Today, developers must put a lot of time and effort into keeping up with the iteration of generative AI technology, quickly designing and delivering new features, managing the end-to-end lifecycle of applications and workloads, and balancing priorities between maintaining existing products and building new features. Amazon Q can be fully customized to the customer's business, helping enterprise developers focus on development itself.

Adam Selipsky demonstrated this with a high-performance encoding and transcoding application: when asked which EC2 instance was best suited for the use case, Amazon Q returned a shortlist weighing performance against cost.

Amazon Q also has excellent code-transformation capability. Previously, a team of five Amazon developers used Amazon Q Code Transformation to upgrade 1,000 production applications from Java 8 to Java 17 in just two days, with an average upgrade time of less than 10 minutes per application.

In addition, Amazon Q can be combined with Amazon CodeCatalyst to generate tests for users across supported IDEs to measure their quality and accelerate feature development.

However, although Amazon Web Services executives have said, "I firmly believe this will be a productivity change, and I hope people across industries and roles can benefit from Amazon Q," compliance policies mean it will still take time for Amazon Q to enter China.

re:Invent 2023 has come to an end, and the era of competing on isolated generative AI features is over. With the unveiling of the Amazon Web Services generative AI technology stack, an era in which AI benefits everyone is on the horizon.

