From the steam engine and electricity to today's Internet, every industrial revolution has reflected the rapid development of science and technology and great progress in human civilization. Now, with the explosive growth and wide application of generative AI, a new wave of technology represented by artificial intelligence is leading society toward the fourth industrial revolution. As a cutting-edge technology that simulates human intelligence, AI can achieve autonomous decision-making and action through independent learning, reasoning, and self-correction, and it already plays an important role in fields such as healthcare, finance, transportation, and education.
In this context, many enterprises hope to seize the dividends of generative AI, actively using it to empower their businesses and accelerate their own digital and intelligent transformation. However, high development costs and heavy technology investment have discouraged many of them.
"Amazon Web Services lowers the barrier to entry for generative AI across a wide range of industries, reshaping every industry and changing everyone's life. This includes how to balance scale and cost so that the business truly benefits, how to choose the model best suited to a given business scenario, how to customize quickly with an enterprise's own data, and, of course, how to fully protect data security and privacy under the premise of applying generative AI responsibly. That is why Amazon Web Services chooses to keep investing end to end in all three layers of the generative AI stack," said Chen Xiaojian, general manager of the product department at Amazon Web Services Greater China.
Self-developed chip innovation consolidates infrastructure
As early as 13 years ago, Amazon Web Services saw the value of GPU-accelerated computing and became the first cloud service provider to deploy GPUs in the cloud, an early sign of how much importance it attaches to the underlying hardware.
In the face of the generative AI boom, Amazon Web Services this year launched a chip dedicated to generative AI and machine learning training, the Amazon Trainium2 processor. It improves performance by up to four times over the first generation, reduces latency to one-tenth of the first generation's, and has been specially tuned for training large models with hundreds of billions or even trillions of parameters.
However, hardware alone is obviously not enough; an efficient combination of software and hardware better unleashes the hardware's potential. Amazon Web Services has therefore also launched the Amazon Neuron software development kit, which helps users make better, faster, and cheaper use of its custom training and inference chips. For compatibility, Amazon Neuron supports popular machine learning frameworks such as TensorFlow and PyTorch, so customers can build training and inference pipelines with their existing knowledge and port them to the new hardware, often by changing just a few lines of code.
During the Re:Invent 2023 conference, Amazon Web Services and NVIDIA also jointly announced several new collaborations. Amazon Web Services will provide the first cloud AI supercomputer combining the NVIDIA Grace Hopper superchip with Amazon Web Services UltraCluster technology; the first NVIDIA DGX Cloud to use NVIDIA's latest GH200 NVL32 is coming to Amazon Web Services; and the two companies will cooperate on Project Ceiba to build the world's fastest GPU-powered AI supercomputer, an NVIDIA DGX Cloud supercomputer for NVIDIA's own AI training, R&D, and custom model development, equipped with 16,384 of the latest GH200 superchips and delivering an astonishing 65 exaflops of compute.
Finally, to improve the efficiency of model training in distributed environments, Amazon Web Services provides Amazon SageMaker HyperPod, which automatically manages large-scale training environments of up to thousands of machines, automatically detects and locates hardware failures, and can replace failed instances and adjust configurations to avoid these problems without manual intervention. In addition, Amazon SageMaker HyperPod automatically backs up training state and resumes from the last saved checkpoint, reducing training time by up to 40%.
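The checkpoint-and-resume behavior described above can be illustrated with a minimal sketch. This is a toy stand-in, not HyperPod's actual implementation: training state is reduced to a counter, the checkpoint is a local JSON file, and a simulated failure plays the role of a hardware fault that HyperPod would detect before restarting the job on a replacement instance.

```python
import json
import os
import tempfile

# Hypothetical checkpoint location for this toy example.
CKPT = os.path.join(tempfile.gettempdir(), "train_ckpt.json")

def save_checkpoint(step, state):
    with open(CKPT, "w") as f:
        json.dump({"step": step, "state": state}, f)

def load_checkpoint():
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            return json.load(f)
    return {"step": 0, "state": 0}

def train(total_steps, fail_at=None):
    # Resume from the last saved checkpoint instead of starting over.
    ckpt = load_checkpoint()
    step, state = ckpt["step"], ckpt["state"]
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated hardware failure")
        state += 1          # stand-in for one training step's work
        step += 1
        save_checkpoint(step, state)
    return state

# First run "fails" partway through; the restarted run picks up at the
# checkpoint rather than repeating the completed steps.
try:
    train(10, fail_at=6)
except RuntimeError:
    pass
result = train(10)
```

The point of the sketch is the resume path: because `train` always reloads the checkpoint first, the second call only performs the remaining steps, which is the mechanism behind the quoted reduction in training time.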
Build and scale generative AI with foundation models
For many enterprises, AI needs vary with business scenarios and industry characteristics, and in the real world a single one-size-fits-all model is all but impossible. A model that is too powerful for the task is overkill, while one that is not powerful enough wastes the enterprise's investment.
That is why Amazon Web Services provides Amazon Bedrock, a fully managed generative AI service that is the easiest way for enterprises to build and scale generative AI applications with large models. Customers across industries are already using Amazon Bedrock to reinvent their user experiences, products, and processes, and to bring generative AI to the core of their business.
Amazon Bedrock has added support for Anthropic Claude 2.1 and Meta Llama 2 70B. The former provides enterprises with advanced capabilities including an industry-leading 200K-token context window, while the latter is suitable for large-scale tasks such as language modeling, text generation, and dialogue systems.
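Calling a model through Bedrock is a single API request with a model-specific JSON body. The sketch below builds a request body in the shape used by Claude text-completion models on Bedrock; the exact schema and model identifier should be checked against the Bedrock model parameters documentation, and the `invoke_model` call itself is shown only as a comment because it requires AWS credentials.

```python
import json

def build_claude_request(user_prompt, max_tokens=512):
    """Build an InvokeModel request body in the Claude text-completion
    shape (prompt wrapped in Human/Assistant turns). Schema hedged --
    verify against the Bedrock documentation for your model version."""
    return json.dumps({
        "prompt": f"\n\nHuman: {user_prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
        "temperature": 0.5,
    })

body = build_claude_request("Summarize our Q3 sales report in three bullets.")

# With credentials configured, the call would look roughly like:
# import boto3
# client = boto3.client("bedrock-runtime", region_name="us-east-1")
# resp = client.invoke_model(modelId="anthropic.claude-v2:1", body=body)
# print(json.loads(resp["body"].read())["completion"])
```

Because each model family on Bedrock defines its own body schema, switching from Claude to Llama 2 means changing the `modelId` and the body fields, while the surrounding `invoke_model` plumbing stays the same.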
In addition, Amazon Titan, the family of foundation models built by Amazon Web Services and exclusive to Amazon Bedrock, covers a variety of use cases: Amazon Titan Text Embeddings, which converts text into vectors; Amazon Titan Text Lite, which can support chatbot Q&A or text summarization; Amazon Titan Text Express, for open-ended text generation and conversational chat; the Amazon Titan Multimodal Embeddings model, capable of creating rich multimodal search and recommendation experiences; and Amazon Titan Image Generator, which generates high-quality images from natural-language prompts; among others.
In terms of model customization, Amazon Bedrock supports a variety of techniques, including retrieval-augmented generation (RAG) through Amazon Titan Text Embeddings, fine-tuning of models such as Meta Llama 2 and Cohere Command, and continued pre-training.
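The RAG technique mentioned above boils down to embedding the query, retrieving the most similar enterprise documents by vector similarity, and stuffing them into the prompt as grounding context. The sketch below uses hand-written three-dimensional toy vectors in place of real embeddings; in practice each vector would come from an embeddings model such as Amazon Titan Text Embeddings, and the store would be a vector database rather than a dict.

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy document store: document title -> pretend embedding vector.
docs = {
    "refund policy":  [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.2],
    "return window":  [0.8, 0.2, 0.1],
}

# Pretend embedding of the query "How do I get my money back?"
query_vec = [0.85, 0.15, 0.05]

# Retrieve the top-2 most similar documents to ground the model's answer.
top2 = sorted(docs, key=lambda d: cosine(docs[d], query_vec), reverse=True)[:2]
context = " | ".join(top2)
prompt = f"Answer using only this context: {context}\n\nQuestion: ..."
```

The retrieval step keeps the model's answer anchored to the enterprise's own data without retraining the model, which is why RAG is usually the cheapest customization option compared with fine-tuning or continued pre-training.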
"Continued pre-training uses large amounts of unlabeled data, raw text such as internal reports, financial plans, or research results, to improve a large model's knowledge and reasoning ability in a custom domain. In other words, if you want the most effective specialized large model for your professional field, this produces an instance of an actual large model that can be continuously trained on your data in your private environment, which is undoubtedly the most effective way to customize deeply," Chen Xiaojian concluded.
When it comes to model integration, the Agents capability of Amazon Bedrock enables generative AI applications to perform multi-step tasks across enterprise systems and data sources. After simple setup such as granting access permissions, customers can write a request in natural language, and the agent automatically analyzes it, breaks it down into a logical sequence of steps, and takes the corresponding actions.
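The plan-then-act loop described above can be sketched in a few lines. This is a deliberately toy illustration: the real service uses a large model to produce the plan and calls actual enterprise APIs, whereas here the planner is hard-coded and the action names (`look_up_order`, `issue_refund`, and so on) are hypothetical stand-ins.

```python
def plan(request):
    """Stand-in for the model's request analysis: map a natural-language
    request to an ordered list of steps. Hard-coded for clarity."""
    if "refund" in request.lower():
        return ["look_up_order", "check_refund_policy", "issue_refund"]
    return ["answer_from_faq"]

def execute(steps):
    # Each step calls a (hypothetical) enterprise system and records the result.
    actions = {
        "look_up_order":       lambda: "order #1234 found",
        "check_refund_policy": lambda: "within 30-day window",
        "issue_refund":        lambda: "refund issued",
        "answer_from_faq":     lambda: "see FAQ",
    }
    return [actions[s]() for s in steps]

log = execute(plan("Please refund my last order"))
```

Separating planning from execution is the key design choice: the natural-language understanding lives in `plan`, while `execute` only ever performs explicitly permitted actions, which is also where access controls attach.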
Finally, in terms of data security and privacy, Amazon Bedrock does not use any customer data to train the underlying foundation models, all data is encrypted in transit and at rest, and the service supports compliance standards such as GDPR and HIPAA.
Customized AI assistants, ready to use out of the box
After introducing generative AI into their work scenarios, many enterprises find the results unsatisfying: a general-purpose generative AI application does not understand the enterprise's business, data, customers, operations, or employees, so its usefulness is greatly limited. In response, at this year's Re:Invent 2023 Amazon Web Services launched Amazon Q, a generative AI work assistant tailored to enterprise business.
The launch of Amazon Q is significant for enterprises. First of all, it is an Amazon Web Services expert: trained on 17 years of accumulated Amazon Web Services knowledge and experience, it can answer a wide variety of professional questions about Amazon Web Services across multiple interfaces, for example answering developers' code-related questions in Amazon CodeWhisperer, attaching code that can be applied with one click, and providing code conversion features.
Secondly, it is also an expert in the enterprise's own business, with more than 40 built-in connectors compatible with popular data sources, plus support for custom connectors, so businesses can easily connect it to their data and systems. Amazon Q uses an authentication system to confirm user roles and access rights, and supports administrative controls such as topic blocking and keyword filtering.
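The access-control and filtering behavior mentioned above can be sketched as two simple gates applied before any answer is produced. All names here (`ACCESS`, `BLOCKED`, the roles and source names) are hypothetical illustrations, not Amazon Q's actual configuration model.

```python
# Admin-configured controls (hypothetical examples).
BLOCKED = {"salary", "layoff"}                      # blocked topics/keywords
ACCESS = {
    "engineer": {"wiki", "runbooks"},               # role -> permitted sources
    "hr":       {"wiki", "hr_docs"},
}

def answer(role, question, source):
    """Gate an assistant answer on role-based source access and a
    keyword blocklist, in that order."""
    if source not in ACCESS.get(role, set()):
        return "Access denied: source not permitted for this role."
    if any(word in question.lower() for word in BLOCKED):
        return "This topic is blocked by your administrator."
    return f"Answering from {source}..."
```

Checking the user's permitted sources first means the assistant never even retrieves documents the user's role cannot see, while the keyword filter handles questions that are off-limits regardless of source.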
Then, it is also a business intelligence expert, bringing generative AI assistance into multiple services and applications. In Amazon QuickSight, a BI application, Amazon Q responds to user requests in seconds and creates accurate, polished narratives describing changes in the business.
Finally, it is a contact center expert: brought into Amazon Connect, the cloud contact center application, Amazon Q detects customer issues from real-time conversations and automatically provides responses, recommendations, and relevant materials.
Of course, the potential of generative AI does not stop there. With the development of the Internet of Things and the spread of digitalization, more and more data will be generated at the edge. How to quickly bring that data into the data lake and put it to work for generative AI is an important direction for the future, and Amazon Web Services is pursuing technological innovation around it.
"The three layers Amazon Web Services provides for generative AI are not built only for developers working at the bottom of the stack: whether for people who build with large models directly or business users who know little about large models and simply want to use generative AI, we provide a complete solution. Of course, in the era of generative AI, a strong model alone is not enough. With the strong data foundation built by Amazon Web Services, enterprises can gain comprehensive data capabilities, connect those capabilities across different environments, circulate data between different products, and achieve governance and management in their applications," Chen Xiaojian said in closing.