Amazon Web Services (AWS) this week announced innovations across its machine learning portfolio to make generative artificial intelligence more accessible to customers.
Generative AI is a type of AI that can create new content and ideas, including conversations, stories, images, videos, and music. Like all AI, generative AI is powered by ML models—very large models that are pre-trained on vast amounts of data and commonly referred to as Foundation Models (FMs). Recent advancements in ML (specifically the invention of the transformer-based neural network architecture) have led to the rise of models that contain billions of parameters or variables.
To give a sense of the change in scale, the largest pre-trained model in 2019 was 330M parameters. Now, the largest models are more than 500B parameters—a 1,600x increase in size in just a few years. FMs, such as the large language models (LLMs) GPT-3.5 or BLOOM, and the text-to-image model Stable Diffusion from Stability AI, can perform a wide range of tasks that span multiple domains, like writing blog posts, generating images, solving math problems, engaging in dialog, and answering questions based on a document. The size and general-purpose nature of FMs make them different from traditional ML models, which typically perform specific tasks, like analysing text for sentiment, classifying images, and forecasting trends.
The AWS announcements include:
- Amazon Bedrock: Easily build generative AI applications – Amazon Bedrock is a new service for building and scaling generative AI applications, which can generate text, images, audio, and synthetic data in response to prompts. Amazon Bedrock gives customers easy access to FMs—the ultra-large ML models that generative AI relies on—from top AI startup model providers, including AI21, Anthropic, and Stability AI, and exclusive access to the Titan family of FMs developed by AWS. No single model does everything. Amazon Bedrock opens up an array of FMs from leading providers, so AWS customers have the flexibility and choice to use the best models for their specific needs.
- General availability of Amazon EC2 Inf2 instances powered by AWS Inferentia2 chips: Lowering the cost to run generative AI workloads – Ultra-large ML models require massive compute to run. AWS Inferentia2 chips offer the best energy efficiency and the lowest cost for running demanding generative AI inference workloads (like running models and responding to queries in production) at scale on AWS.
- New Trn1n instances, powered by AWS Trainium chips: Custom silicon to train models faster – Generative AI models need to be trained so they produce the right answer, image, insight, or other output the model is designed for. New Trn1n instances (the Amazon EC2 server resources where the compute happens, in this case running on AWS's custom Trainium chips) offer massive networking capability, which is key for training these models quickly and cost-efficiently.
- Free access to Amazon CodeWhisperer for individual developers: Real-time coding assistance
The last may be the most significant for developers.
Imagine being a software developer with an AI-powered coding companion, making your coding faster and easier. Amazon CodeWhisperer does just that. It uses generative AI under the hood to provide code suggestions in real time, based on a user’s comments and their prior code. Individual developers can access Amazon CodeWhisperer for free, without any usage limits (paid tiers are also available for professional use with features like added enterprise-level security and administrative capabilities).
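To make the comment-driven workflow concrete, here is a hypothetical sketch (not actual CodeWhisperer output): a developer writes a natural-language comment stating their intent, and the assistant proposes an implementation in real time. The comment text and the function name below are illustrative assumptions.

```python
# Developer's comment describing intent:
# "parse a list of ISO-8601 date strings and return them sorted, newest first"

# A suggestion in the spirit of what a coding assistant might propose:
from datetime import datetime

def sort_dates_newest_first(date_strings):
    """Parse ISO-8601 date strings and return them sorted, newest first."""
    parsed = [datetime.fromisoformat(s) for s in date_strings]
    return sorted(parsed, reverse=True)
```

The developer reviews the suggestion, accepts or edits it, and moves on, rather than searching the web for a snippet to copy and adapt.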
Swami Sivasubramanian, VP of databases, analytics, and machine learning at AWS, says: “We know that building with the right FMs and running generative AI applications at scale on the most performant cloud infrastructure will be transformative for customers. The new wave of experiences will also be transformative for users. With generative AI built-in, users will be able to have more natural and seamless interactions with applications and systems. Think of how we can unlock our mobile phones just by looking at them, without needing to know anything about the powerful ML models that make this feature possible.”
One area where Sivasubramanian foresees the use of generative AI growing rapidly is in coding:
“Software developers today spend a significant amount of their time writing code that is pretty straightforward and undifferentiated. They also spend a lot of time trying to keep up with a complex and ever-changing tool and technology landscape. All of this leaves developers less time to develop new, innovative capabilities and services. Developers try to overcome this by copying and modifying code snippets from the web, which can result in inadvertently copying code that doesn’t work, contains security vulnerabilities, or doesn’t track usage of open source software. And, ultimately, searching and copying still takes time away from the good stuff.
“Generative AI can take this heavy lifting out of the equation by ‘writing’ much of the undifferentiated code, allowing developers to build faster while freeing them up to focus on the more creative aspects of coding.”