Getting Started with Amazon Bedrock
Amazon Bedrock:
It is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon via a single API.
Why Amazon Bedrock?
Using Amazon Bedrock, you can easily experiment with and evaluate top FMs for your use case, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources.
Because Amazon Bedrock is serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with.
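As a first taste of that single API, here is a minimal sketch of invoking a model with the AWS SDK for Python (boto3). The Titan Text Express model ID and request fields follow the Bedrock request schema as I understand it, but treat them as assumptions to verify against the current documentation; the live call needs AWS credentials and model access granted in the console.

```python
import json

# Assumed model ID for Titan Text Express; confirm in the Bedrock console.
TITAN_EXPRESS_MODEL_ID = "amazon.titan-text-express-v1"

def build_titan_request(prompt: str, max_tokens: int = 512, temperature: float = 0.5) -> dict:
    """Build a Titan text-generation request body as a plain dict."""
    return {
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": max_tokens,
            "temperature": temperature,
        },
    }

def invoke_titan(prompt: str) -> str:
    """Live call through the Bedrock runtime; needs AWS credentials
    and model access granted in the console."""
    import boto3  # AWS SDK for Python
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=TITAN_EXPRESS_MODEL_ID,
        body=json.dumps(build_titan_request(prompt)),
    )
    result = json.loads(response["body"].read())
    return result["results"][0]["outputText"]
```

The request body is just JSON, so you can unit-test the builder without touching AWS at all and keep the network call isolated in one function.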
What is FM?
FM stands for Foundation Model. A foundation model is a machine learning (ML) model that is pre-trained on massive datasets to perform a broad range of tasks.
What is RAG?
RAG stands for Retrieval Augmented Generation. It is an AI framework for retrieving facts from external sources to increase the accuracy of the information generated by an LLM.
When an LLM has insufficient data about a user's request, it may generate a plausible-sounding but inaccurate response that happens to match the prompt. This is known as "AI hallucination".
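The RAG flow can be sketched in a few lines of plain Python: retrieve the most relevant document for a query, then ground the prompt in it. The documents and 3-dimensional vectors below are made up for illustration; a real system would use an embeddings model (such as Titan Embeddings) and a vector store.

```python
import math

# Toy corpus: each entry is (document text, hand-made "embedding" vector).
DOCS = {
    "refunds": ("Refunds are issued within 14 days.", [0.9, 0.1, 0.0]),
    "shipping": ("Orders ship within 2 business days.", [0.1, 0.9, 0.0]),
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def retrieve(query_vec):
    """Return the document text whose embedding is closest to the query."""
    best = max(DOCS.values(), key=lambda doc: cosine(query_vec, doc[1]))
    return best[0]

def build_grounded_prompt(question, query_vec):
    """Augment the user question with retrieved context before calling the LLM."""
    context = retrieve(query_vec)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Because the model is handed the retrieved facts inside the prompt, it no longer has to guess, which is exactly how RAG reduces hallucination.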
Available Models:
Use Cases Of Bedrock:
1. Text Generation:
Create new pieces of original content such as essays and short stories, or get creative suggestions.
Say, "Suggest some Tamil movies to watch. The genre should be romance."
2. Chatbots:
Build conversational interfaces such as chatbots and virtual assistants to enhance the user experience.
3. Search:
Search and synthesize relevant information from large volumes of data to answer questions.
4. Image Generation:
Create realistic and artistic images from text prompts.
Tokens:
Token limits are restrictions on the number of tokens that an LLM can process in a single interaction. A token is a unit of text representing a word, part of a word, or punctuation. For example, the sentence "I love you." might be split into four tokens: "I", "love", "you", and ".".
Why are token limits relevant?
Token limits are relevant because they can affect the performance of LLMs. If the token limit is too low, the LLM may not be able to generate the desired output.
For example, if you are trying to generate a 1000-word document but the token limit is 1000, the LLM will only be able to generate the first 1000 tokens.
Whereas if it is too high, the LLM will be slower and require much more computational power.
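A common way to stay inside a token limit is to budget tokens before sending a prompt. The 4-characters-per-token ratio below is a rough rule of thumb for English text, not an exact tokenizer; production code should count tokens with the model's own tokenizer.

```python
# Rough heuristic: about 4 characters of English text per token.
CHARS_PER_TOKEN = 4

def estimate_tokens(text: str) -> int:
    """Approximate the token count of a piece of text."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def truncate_to_budget(text: str, max_tokens: int) -> str:
    """Trim the text so its estimated token count fits the model limit."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    return text if len(text) <= max_chars else text[:max_chars]
```

With an 8K-token model like Titan Text Express, you would reserve part of the budget for the response and truncate (or chunk) the input to fit the rest.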
Now let’s see the various model versions of the models provided by Amazon Bedrock.
====================================================
Amazon Titan:
Titan Text Express
LLM offering a balance of price and performance.
Max tokens: 8K
Languages: English (GA), 100+ languages available (Preview)
Fine-tuning supported: Yes
Supported use cases: Retrieval augmented generation, open-ended text generation, brainstorming, summarization, code generation, table creation, data formatting, paraphrasing, chain of thought, rewrite, extraction, Q&A, and chat.
Titan Text Lite
Cost-effective and highly customizable LLM. Right-sized for specific use cases, ideal for text generation tasks and fine-tuning.
Max tokens: 4K
Languages: English
Fine-tuning supported: Yes
Supported use cases: Summarization and copywriting.
Titan Text Embeddings
LLM that translates text into numerical representations.
Max tokens: 8K
Languages: 25+ languages
Fine-tuning supported: No
Embeddings: 1,536
Supported use cases: Text retrieval, semantic similarity, and clustering.
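A sketch of calling Titan Text Embeddings through the runtime API. The request body (a single "inputText" field) and the "embedding" key in the response follow the Titan embeddings schema as I understand it; verify both, and the model ID, against the current Bedrock docs.

```python
import json

# Assumed model ID for Titan Text Embeddings; confirm in the Bedrock console.
TITAN_EMBED_MODEL_ID = "amazon.titan-embed-text-v1"

def build_embedding_request(text: str) -> dict:
    """Titan embeddings take a single input text field."""
    return {"inputText": text}

def embed(text: str) -> list:
    """Live call; requires AWS credentials and model access.
    Returns the 1,536-dimensional embedding vector."""
    import boto3
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=TITAN_EMBED_MODEL_ID,
        body=json.dumps(build_embedding_request(text)),
    )
    return json.loads(response["body"].read())["embedding"]
```

These vectors are what power the semantic search and clustering use cases listed above.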
Titan Multimodal Embeddings
Powers accurate multimodal search and recommendation experiences.
Max tokens: 128
Max images size: 25 MB
Languages: English
Embeddings: 1,024 (default), 384, 256
Fine-tuning supported: Yes
Supported use cases: Search, recommendation, personalization
Titan Image Generator
Generate realistic, studio-quality images using text prompts.
Max tokens: 77
Max input image size: 25 MB
Languages: English
Fine-tuning supported: Yes
Supported use cases: Text to image generation, image editing, image variations.
Anthropic's Claude:
Claude 3 Opus (Coming soon)
Anthropic’s most powerful AI model, with top-level performance on highly complex tasks. It can navigate open-ended prompts and sight-unseen scenarios with remarkable fluency and human-like understanding.
Max Tokens: 200K
Languages: English, Spanish, Japanese, and multiple other languages.
Supported Use Cases: Task automation, interactive coding, research review, brainstorming and hypothesis generation, advanced analysis of charts & graphs, financials and market trends, forecasting.
Claude 3 Sonnet (Available now)
Claude 3 Sonnet strikes the ideal balance between intelligence and speed, particularly for enterprise workloads. It offers maximum utility and is engineered to be dependable for scaled AI deployments.
Max tokens: 200K
Languages: English, Spanish, Japanese, and multiple other languages.
Supported use cases: RAG or search & retrieval over vast amounts of knowledge, product recommendations, forecasting, targeted marketing, code generation, quality control, parse text from images.
Claude 3 Haiku (Coming soon)
Anthropic's fastest, most compact model for near-instant responsiveness. It answers simple queries and requests with speed.
Max tokens: 200K
Languages: English, Spanish, Japanese, and multiple other languages.
Supported use cases: Quick and accurate support in live interactions, translations, content moderation, optimize logistics, inventory management, extract knowledge from unstructured data.
Claude 2.1
Claude 2.1 is Anthropic’s latest large language model (LLM) with an industry-leading 200K token context window, reduced hallucination rates, and improved accuracy over long documents.
Max tokens: 200K
Languages: English and multiple other languages
Supported use cases: Summarization, Q&A, trend forecasting, comparing and contrasting multiple documents, and analysis. Claude 2.1 excels at the core capabilities of Claude 2.0 and Claude Instant.
Claude 2.0
Claude 2.0 is a leading LLM from Anthropic that enables a wide range of tasks from sophisticated dialogue and creative content generation to detailed instruction.
Max tokens: 100K
Languages: English and multiple other languages
Supported use cases: Thoughtful dialogue, content creation, complex reasoning, creativity, and coding.
Claude 1.3
Claude 1.3 is an earlier version of Anthropic's general-purpose LLM.
Max tokens: 100K
Languages: English and multiple other languages
Supported use cases: Search, writing, editing, outlining, and summarizing text; coding; and providing helpful advice about a broad range of subjects.
Claude Instant
Claude Instant is Anthropic's faster, lower-priced yet very capable LLM.
Max tokens: 100K
Languages: English and multiple other languages
Supported use cases: Casual dialogue, text analysis, summarization, and document comprehension.
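The Claude models on Bedrock use Anthropic's Messages API request format rather than Titan's. The "anthropic_version" value and the Claude 3 Sonnet model ID below are taken from the Bedrock documentation at the time of writing; double-check both before relying on them.

```python
import json

# Assumed model ID for Claude 3 Sonnet on Bedrock; verify in the console.
CLAUDE_SONNET_MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"

def build_claude_request(user_text: str, max_tokens: int = 1024) -> dict:
    """Build a Messages API request body for Claude models on Bedrock."""
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_text}],
    }

def ask_claude(user_text: str) -> str:
    """Live call; requires AWS credentials and Claude model access."""
    import boto3
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=CLAUDE_SONNET_MODEL_ID,
        body=json.dumps(build_claude_request(user_text)),
    )
    result = json.loads(response["body"].read())
    return result["content"][0]["text"]
```

Note that each model family on Bedrock has its own body schema; only the `invoke_model` wrapper is shared.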
AI21 Labs :
Jurassic-2 Ultra
AI21’s most powerful model for complex text generation tasks that require the highest quality output.
Max tokens: 8,192
Languages: English, Spanish, French, German, Portuguese, Italian, and Dutch
Supported use cases: Question answering, summarization, draft generation, advanced information extraction, ideation for tasks requiring intricate reasoning and logic.
Jurassic-2 Mid
AI21’s mid-sized model for advanced text generation tasks that require both quality and affordability.
Max tokens: 8,192
Languages: English, Spanish, French, German, Portuguese, Italian, and Dutch
Supported use cases: Question answering, summarization, draft generation, advanced information extraction, ideation.
Cohere :
Command
Command is Cohere’s generative large language model (LLM) (52B parameters).
Max tokens: 4K
Languages: English
Supported use cases: Chat, text generation, text summarization.
Command Light
Command Light is a smaller version of Command, Cohere's generative LLM (6B parameters).
Max tokens: 4K
Languages: English
Supported use cases: Chat, text generation, text summarization.
Embed - English
Embed is Cohere's text representation, or embeddings, model. This version supports English only.
Dimensions: 1024
Languages: English
Supported use cases: Semantic search, retrieval augmented generation (RAG), classification, clustering.
Embed - Multilingual
Embed is Cohere's text representation, or embeddings, model. This version supports multiple languages.
Dimensions: 1024
Languages: Multilingual (100+ supported languages)
Supported use cases: Semantic search, retrieval-augmented generation (RAG), classification, clustering.
Meta :
Llama-2-13b-chat
Fine-tuned model in the parameter size of 13B. Suitable for smaller-scale tasks such as text classification, sentiment analysis, and language translation.
Max tokens: 4K
Languages: English
Supported use cases: Assistant-like chat
Llama-2-70b-chat
Fine-tuned model in the parameter size of 70B. Suitable for larger-scale tasks such as language modeling, text generation, and dialogue systems.
Max tokens: 4K
Languages: English
Supported use cases: Assistant-like chat
Stable Diffusion XL :
Stable Diffusion XL 1.0
The most advanced text-to-image model from Stability AI.
Max tokens: 77-token limit for prompts
Languages: English
Supported use cases: Advertising and marketing, media and entertainment, gaming and metaverse.
Stable Diffusion XL 0.8
Text-to-image model from Stability AI.
Max tokens: 77-token limit for prompts
Languages: English
Supported use cases: Advertising and marketing, media and entertainment, gaming and metaverse.
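Image models return base64-encoded images instead of text. The SDXL body shape (text_prompts / cfg_scale / steps) and the "artifacts" array in the response follow Stability AI's API as exposed through Bedrock; treat the model ID and parameter defaults here as assumptions to verify.

```python
import base64
import json

# Assumed model ID for Stable Diffusion XL 1.0 on Bedrock.
SDXL_MODEL_ID = "stability.stable-diffusion-xl-v1"

def build_sdxl_request(prompt: str, cfg_scale: float = 7.0, steps: int = 30) -> dict:
    """Build a text-to-image request body for SDXL."""
    return {
        "text_prompts": [{"text": prompt}],
        "cfg_scale": cfg_scale,  # how strictly the image follows the prompt
        "steps": steps,          # diffusion steps; more = slower, finer detail
    }

def generate_image(prompt: str, out_path: str = "image.png") -> None:
    """Live call; requires AWS credentials and SDXL model access."""
    import boto3
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=SDXL_MODEL_ID,
        body=json.dumps(build_sdxl_request(prompt)),
    )
    result = json.loads(response["body"].read())
    image_bytes = base64.b64decode(result["artifacts"][0]["base64"])
    with open(out_path, "wb") as f:
        f.write(image_bytes)
```

The 77-token prompt limit listed above applies to the `text_prompts` text, so keep prompts short and descriptive.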
Mistral AI :
Mistral 7B
A 7B dense Transformer, fast to deploy and easily customizable. Small, yet powerful for a variety of use cases.
Max tokens: 8K
Languages: English
Supported use cases: Text summarization, text structuring, question answering, and code completion.
Mixtral 8x7B
A sparse Mixture-of-Experts model with stronger capabilities than Mistral 7B. It uses 12B active parameters out of 45B total.
Max tokens: 32K
Languages: English, French, German, Spanish, Italian
Supported use cases: Text summarization, text structuring, question answering, and code completion.
====================================================
Randomness and diversity for Text and Chat Models:
Temperature: Large language models (LLMs) use probability to construct the words in a sequence. For any given sequence, there is a probability distribution of options for the next word in the sequence. When you set the temperature closer to zero, the model tends to select the higher-probability words. When you set the temperature farther away from zero, the model might select a lower-probability word.
T closer to 0 => Model selects higher-probability words
T farther from 0 => Model may select lower-probability words
Top P: Top P defines a cutoff based on the sum of probabilities of the potential choices. If you set Top P below 1.0, the model considers the most probable options and ignores the less probable ones.
P < 1.0 => Considers most probable options and ignores the less probable ones.
Response length: The response length configures the maximum number of tokens to use in the generated response.
Stop sequences: A stop sequence is a sequence of characters. If the model encounters a stop sequence, it stops generating further tokens. Different models support different types of characters in a stop sequence, different maximum sequence lengths, and may support the definition of multiple stop sequences.
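The four knobs above map directly onto fields in a model's request body. The field names below (temperature, topP, maxTokenCount, stopSequences) follow the Titan text schema; other model families express the same concepts with their own names, so check the per-model docs.

```python
def build_generation_config(temperature: float, top_p: float,
                            max_tokens: int, stop: list) -> dict:
    """Assemble a Titan-style textGenerationConfig from the four knobs."""
    return {
        "temperature": temperature,   # randomness: near 0 = more deterministic
        "topP": top_p,                # probability-mass cutoff for candidates
        "maxTokenCount": max_tokens,  # response length cap
        "stopSequences": stop,        # generation halts on any of these
    }

# A factual Q&A config keeps temperature low; brainstorming raises it.
factual = build_generation_config(0.1, 0.9, 256, ["\n\nHuman:"])
creative = build_generation_config(0.9, 1.0, 1024, [])
```

Starting from a low-temperature "factual" profile and loosening it only when outputs feel too repetitive is a reasonable default workflow.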
Setting up Bedrock:
=> Search Amazon Bedrock in your Management Console.
=> Click Get started!
=> Take a look at the FMs available in your particular Region (availability changes by Region).
List of Regions :
Note: These are the five Regions where Amazon Bedrock can be accessed.
US East (N. Virginia):
AI21 Labs
Jurassic-2 Ultra
Jurassic-2 Mid
Amazon
Titan Embeddings G1 - Text
Titan Text G1 - Lite
Titan Text G1 - Express
Titan Image Generator G1
Titan Multimodal Embeddings G1
Anthropic
Claude 3 Sonnet
Claude
Claude Instant
Cohere
Command
Command Light
Embed English
Embed Multilingual
Meta
Llama 2 Chat 13B
Llama 2 Chat 70B
Llama 2 13B
Llama 2 70B
Stability AI
SDXL 0.8
SDXL 1.0
US West (Oregon):
AI21 Labs
Jurassic-2 Ultra
Jurassic-2 Mid
Amazon
Titan Embeddings G1 - Text
Titan Text G1 - Lite
Titan Text G1 - Express
Titan Image Generator G1
Titan Multimodal Embeddings G1
Anthropic
Claude 3 Sonnet
Claude
Claude Instant
Cohere
Command
Command Light
Embed English
Embed Multilingual
Meta
Llama 2 Chat 13B
Llama 2 Chat 70B
Llama 2 13B
Llama 2 70B
Mistral AI
Mistral 7B Instruct
Mixtral 8x7B Instruct
Stability AI
SDXL 0.8
SDXL 1.0
Asia Pacific (Singapore):
Anthropic
Claude
Claude Instant
Asia Pacific (Tokyo):
Amazon
Titan Embeddings G1 - Text
Titan Text G1 - Express
Anthropic
Claude
Claude Instant
Europe (Frankfurt):
Amazon
Titan Embeddings G1 - Text
Titan Text G1 - Express
Anthropic
Claude
Claude Instant
=> Select Model access to request access to the FMs you wish to work with.
=> Select Manage model access and check the boxes for the models you want.
=> Click the Save button at the bottom to get access to those models.
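The same check can be scripted. Bedrock has a control-plane client (service name "bedrock", separate from "bedrock-runtime") that exposes `list_foundation_models()`. The filter below works on the response shape, demonstrated here with an abbreviated hand-made sample rather than a live response.

```python
def model_ids_for_provider(response: dict, provider: str) -> list:
    """Pick model IDs for one provider out of a list_foundation_models response."""
    return [
        m["modelId"]
        for m in response.get("modelSummaries", [])
        if m.get("providerName") == provider
    ]

# Abbreviated, illustrative response shape (not live data).
SAMPLE_RESPONSE = {
    "modelSummaries": [
        {"modelId": "amazon.titan-text-express-v1", "providerName": "Amazon"},
        {"modelId": "anthropic.claude-v2:1", "providerName": "Anthropic"},
    ]
}

def list_models(provider: str) -> list:
    """Live call; requires AWS credentials."""
    import boto3
    client = boto3.client("bedrock")  # control plane, not bedrock-runtime
    return model_ids_for_provider(client.list_foundation_models(), provider)
```

This is a quick way to confirm, per Region, which of the models in the lists above are actually available to your account.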
Pricing:
On-Demand
With the On-Demand mode, you only pay for what you use, with no time-based term commitments. For text generation models, you are charged for every input token processed and every output token generated. For embeddings models, you are charged for every input token processed. A token consists of a few characters and is the basic unit of text from which a model learns to understand user input and prompts. For image generation models, you are charged for every image generated.
Batch
With Batch mode, you can provide a set of prompts as a single input file and receive responses as a single output file, allowing you to get simultaneous large-scale predictions. The responses are processed and stored in your Amazon S3 bucket so you can access them at a later time. Pricing for Batch mode is the same as pricing for On-Demand mode.
Provisioned Throughput
With the Provisioned Throughput mode, you can purchase model units for a specific base or custom model. The Provisioned Throughput mode is primarily designed for large, consistent inference workloads that need guaranteed throughput. Custom models can only be accessed using Provisioned Throughput. A model unit provides a certain throughput, measured by the maximum number of input or output tokens processed per minute. With Provisioned Throughput pricing, you are charged by the hour, and you have the flexibility to choose between 1-month or 6-month commitment terms.
Model customization
With Amazon Bedrock, you can customize FMs with your data to deliver tailored responses for specific tasks and your business context. You can fine-tune models with labeled data or use continued pre-training with unlabeled data. For customization of a text generation model, you are charged for model training based on the total number of tokens processed (the number of tokens in the training data corpus times the number of epochs) and for model storage, charged per month per model. An epoch is one full pass through your training dataset during fine-tuning or continued pre-training. Inference on customized models is charged under the Provisioned Throughput plan and requires you to purchase Provisioned Throughput. One model unit is available with no commitment term for inference on a customized model, and you are charged for the number of hours you use that first model unit. If you want to increase your throughput beyond one model unit, you must purchase a 1-month or 6-month commitment term.
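The training-cost rule above (tokens charged = corpus tokens times epochs) is simple enough to sanity-check with a few lines. The per-token price below is a placeholder for illustration, not a real Bedrock rate.

```python
def training_tokens(corpus_tokens: int, epochs: int) -> int:
    """Total tokens billed for a customization job: corpus size x epochs."""
    return corpus_tokens * epochs

def training_cost(corpus_tokens: int, epochs: int, price_per_1k_tokens: float) -> float:
    """Training charge given a (hypothetical) per-1K-token price."""
    return training_tokens(corpus_tokens, epochs) / 1000 * price_per_1k_tokens

# e.g. a 1M-token corpus trained for 3 epochs at a hypothetical $0.008/1K tokens:
# 1,000,000 x 3 = 3,000,000 tokens -> 3,000 x $0.008 = $24.00 (plus storage
# and Provisioned Throughput hours for inference).
```

Running this kind of estimate before a job helps decide whether more epochs are worth the extra charge.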
Playgrounds:
=> Playground is a place where you can test the FMs and have some fun with it.
There are three Playgrounds,
Chat Playground
Text Playground
Image Playground
=> Select the Model which you wish to play with and have some fun!
I have an additional Playground for you!
Amazon PartyRock:
=> Develop your own GenAI application with a single prompt and learn GenAI fundamentals without writing any code.
=> The app then generates exactly what you asked for in the prompt.
=> Feel free to play around with it by changing its configuration.
=> You can add as many widgets as you like in your PartyRock app.
=> Here is my public PartyRock app:
AWS Study Buddy - https://partyrock.aws/u/JK/ev73BeS66/AWS-StudyBuddy
Advantages of Amazon Bedrock:
Disadvantages of Amazon Bedrock:
Thank you so much!! Hope you enjoyed my content <3
Happy Learning!
People who inspired me :
Jen Looper Aditi Sawhney Padmini Subramanian Tracy Wang Lucy Wang Stéphane Maarek Ilanchezhian Ganesamurthy Ayyanar Jeyakrishnan (AJ) Vishal Alhat ?? Dheeraj Choudhary Abinaya Devi S V Aliya Shaikh Sabiha Ashik Krishnan Balasekar Marishwaran K A V Giri Venkatesan Jiju Thomas Mathew Maheswarakumar Muthusamy MBA,PRINCE2 ?Practitioner ,ITIL,SFC, IICS Bhuvana R. Raghul Gopal Ashish Prajapati Murali Doss Aarthi Ranganathan Sridevi Murugayen Jones Zachariah Noel N Brooke Jamieson Greg Powell Shubham Londhe Viktor Ardelean Vedha Sankar