As large language models grow rapidly in capability, the promise of innovation sits in tension with the need for safe and responsible development and use. In this article, we'll provide an overview of the challenges facing large language models, survey emerging scientific research on those challenges, and look at what AWS is doing to advance responsible AI.
The technical essence of large language models
Amazon Scholar Michael Kearns opened the forum presentations by explaining how systems built on large language models such as GPT-3 work. These systems generate free-form text one word at a time: given a text prompt or context, the model computes a probability distribution over possible next words, then samples from that distribution to pick the word that is actually appended to the text. This capability gives AI tremendous versatility in open-ended text generation.
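To make that loop concrete, here is a minimal sketch in plain Python, using a toy six-word vocabulary and made-up scores rather than any real model: it converts scores into a probability distribution (a softmax) and samples the next word from it. Production systems repeat this step token by token over vocabularies of tens of thousands of entries.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Turn raw model scores (logits) into a probability distribution
    over the vocabulary and draw the next token from it."""
    rng = rng if rng is not None else np.random.default_rng()
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(scaled - scaled.max())   # numerically stable softmax
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Toy vocabulary and made-up scores standing in for a real model's output
vocab = ["cat", "dog", "sat", "on", "the", "mat"]
logits = [0.2, 0.1, 2.5, 0.3, 1.0, 0.4]     # the model "prefers" "sat" here

next_word = vocab[sample_next_token(logits)]
print(next_word)
```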
However, this openness also raises a range of responsible AI issues, including bias, explainability, hallucination, toxicity, and intellectual property protection.
The importance and challenges of Responsible AI
In his talk, Kearns stressed that the openness of large language models greatly amplifies responsible AI problems. Compared with earlier generations of AI, fairness is harder to define when a model must handle such a wide range of inputs and outputs, pronoun choice being one example. He also pointed out that the randomness of large language models means the same prompt can lead to completely different outputs, which adds to the fairness challenge.
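Continuing the toy sampler sketched earlier, this randomness is easy to see: repeated calls with the same "prompt" (the same distribution) yield different words, while a lower sampling temperature makes the output nearly deterministic. The specific words shown in the comments are illustrative only.

```python
rng = np.random.default_rng()

# Five independent samples from the same distribution ("same prompt")
print([vocab[sample_next_token(logits, temperature=1.0, rng=rng)] for _ in range(5)])
# e.g. ['sat', 'the', 'sat', 'mat', 'sat'] -- the exact list changes between runs

# A low temperature sharpens the distribution and makes the output nearly deterministic
print([vocab[sample_next_token(logits, temperature=0.1, rng=rng)] for _ in range(5)])
# almost always ['sat', 'sat', 'sat', 'sat', 'sat']
```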
Beyond fairness, he also elaborated on emerging issues such as hallucination, toxicity, and intellectual property protection. "Hallucination" refers to a language model's tendency to fabricate convincing-sounding but false statements, while toxicity concerns the offensive or harmful content a model may produce. In addition, generative models can leak personal information through paraphrased output, raising privacy concerns at a new level. In creative fields, using AI to mimic an artist's style or an author's way of writing raises copyright and fair-use questions.
Amazon Web Services' Responsible AI practices
Peter Hallinan, Head of Responsible AI at Amazon Web Services, then shared the challenges of translating research into practice. He pointed out that machine learning differs from traditional software in key ways, requiring attention to additional attributes such as security, fairness, explainability, and robustness. To this end, Amazon Web Services has developed five high-level principles: define narrow application use cases, match processes to the level of risk, treat datasets as product specifications, analyze how different datasets affect model performance, and share responsibility between vendors and deployers.
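As one hypothetical illustration of the principle of analyzing how different datasets affect model performance, results can be reported per data slice rather than only in aggregate, so a model that looks fine on average cannot hide a weak subgroup. The slice names, data, and `predict` callable below are placeholders for whatever dataset and model you actually use, not an AWS API.

```python
from collections import defaultdict

def accuracy_by_slice(examples, predict):
    """Report accuracy separately for each data slice.
    `examples` is a list of (slice_name, input, label) tuples and
    `predict` is any callable mapping an input to a predicted label."""
    hits, totals = defaultdict(int), defaultdict(int)
    for slice_name, x, y in examples:
        totals[slice_name] += 1
        hits[slice_name] += int(predict(x) == y)
    return {s: hits[s] / totals[s] for s in totals}

# Tiny made-up example: a "model" that always answers "approve"
examples = [
    ("group_a", {"income": 40}, "approve"),
    ("group_a", {"income": 55}, "approve"),
    ("group_b", {"income": 38}, "deny"),
    ("group_b", {"income": 61}, "approve"),
]
print(accuracy_by_slice(examples, lambda x: "approve"))
# {'group_a': 1.0, 'group_b': 0.5} -- the 0.75 aggregate would mask the gap
```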
Hallinan reiterated the importance of responsible AI throughout the machine learning lifecycle, from problem definition and data collection through model development, testing, and monitoring. He emphasized that Amazon Web Services not only provides training courses, a partner network, and solution architects, but also offers services such as CodeWhisperer and Amazon Titan that are designed with responsible AI in mind, including features such as ownership, content filtering, and security scanning.
Future challenges and directions
Although large language models present new challenges, the techniques being investigated, such as data watermarking and reinforcement learning for alignment, hold great promise. Responsibly developing and deploying large language models, however, requires a concerted effort from teams, companies, academia, and society at large. The path forward involves building internal capability through awareness, skills training, adoption of emerging best practices, and ultimately operational integration.
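Data watermarking, for instance, has been explored in recent research using a "green list" idea: during generation, a pseudorandomly chosen subset of the vocabulary (seeded by the previous token) is gently favored, and a detector that knows the seeding rule checks whether a text over-uses that subset. The sketch below is a simplified illustration of that published idea with a placeholder vocabulary size; it is not any particular vendor's implementation.

```python
import hashlib
import numpy as np

VOCAB_SIZE = 50_000  # placeholder vocabulary size

def green_list(prev_token_id, fraction=0.5):
    """Pseudorandomly split the vocabulary into a 'green' half, seeded by the
    previous token, so that generator and detector agree on the split."""
    seed = int(hashlib.sha256(str(prev_token_id).encode()).hexdigest(), 16) % (2**32)
    rng = np.random.default_rng(seed)
    return set(rng.permutation(VOCAB_SIZE)[: int(VOCAB_SIZE * fraction)])

def watermark_logits(logits, prev_token_id, bias=2.0):
    """Generation side: nudge the model toward green tokens before sampling.
    `logits` must cover the full vocabulary (length VOCAB_SIZE)."""
    boosted = np.asarray(logits, dtype=np.float64).copy()
    boosted[list(green_list(prev_token_id))] += bias
    return boosted

def green_fraction(token_ids):
    """Detection side: unwatermarked text should land near 0.5; watermarked
    text over-uses green tokens, which a statistical test can flag."""
    hits = sum(cur in green_list(prev) for prev, cur in zip(token_ids, token_ids[1:]))
    return hits / max(len(token_ids) - 1, 1)
```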
Taken together, large language models hold great potential, but they also present complex challenges around security, fairness, privacy, and control. Amazon Web Services is committed to translating cutting-edge responsible AI research into practical processes, tools, and services. As this exploration continues, we look forward to more innovative solutions that allow large language models to keep advancing in harmony with society.