Getting Started with Amazon Bedrock

Amazon Bedrock:

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies, including AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon, through a single API.

Why Amazon Bedrock?

Using Amazon Bedrock, you can easily experiment with and evaluate top FMs for your use case, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources.

Since Amazon Bedrock is serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with.
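To make the "single API" point concrete, here is a minimal sketch of invoking a model with the AWS SDK for Python (boto3). The request shape below follows the Amazon Titan text format; check the Bedrock documentation for the exact body expected by the model you enable.

```python
import json


def build_titan_body(prompt: str, max_tokens: int = 512) -> str:
    """Build the JSON request body in the shape Titan text models expect."""
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {"maxTokenCount": max_tokens},
    })


def invoke(prompt: str, region: str = "us-east-1") -> str:
    """Send the prompt to Bedrock and return the generated text.

    Requires AWS credentials and model access to be set up first.
    """
    import boto3  # AWS SDK for Python

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId="amazon.titan-text-express-v1",
        body=build_titan_body(prompt),
    )
    result = json.loads(response["body"].read())
    return result["results"][0]["outputText"]
```

Swapping in a different FM is mostly a matter of changing `modelId` and the request body; that uniformity is the appeal of the single API.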

What is an FM?

FM stands for foundation model. A foundation model is a type of machine learning (ML) model that is pre-trained on massive datasets to perform a wide range of tasks.

What is RAG?

RAG stands for Retrieval Augmented Generation. It is an AI framework for retrieving facts from external sources to increase the accuracy of the information generated by an LLM.

When an LLM has insufficient data about the user's request, it can generate a plausible-sounding but less accurate response that happens to match the prompt. This is known as "AI hallucination".
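The RAG idea can be sketched in a few lines: retrieve relevant documents, then prepend them to the prompt so the LLM answers from facts rather than guessing. The keyword-overlap retriever below is a deliberately naive stand-in for the embedding-based vector search a production system would use.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (a naive stand-in
    for the vector search a real RAG system uses)."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_rag_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Augment the user's question with retrieved context before it
    is sent to the LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

The augmented prompt grounds the model in retrieved facts, which is what reduces hallucination.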

Available Models:

  1. Amazon Titan
  2. Claude
  3. Command & Embed
  4. Jurassic
  5. Llama 2
  6. Mistral AI
  7. Stable Diffusion

Use Cases Of Bedrock:

1. Text Generation:

Create new pieces of original content, such as essays or short stories, or ask the model for suggestions.

Say “Suggest some Tamil movies to watch. The genre should be romance.”

Chatbots:

Build conversational interfaces such as chatbots and virtual assistants to enhance the user experience.

Search:

  1. It provides additional information based on your search.
  2. It uses your search prompt as context.
  3. It may give you tips related to your prompt.

Image Generation:

  1. Can generate images based on your prompt.

Tokens:

Token limits are restrictions on the number of tokens that an LLM can process in a single interaction. A token is a unit of text that represents a word, part of a word, or another piece of text. For example, the phrase "I love you" would typically consist of 3 tokens: "I", "love", and "you" (the exact split varies by model's tokenizer).

Why are token limits relevant?

Token limits are relevant because they can affect the performance of LLMs. If the token limit is too low, the LLM may not be able to generate the desired output.

For example, if you are trying to generate a 1000-word document but the token limit is 1000, the LLM will only be able to generate the first 1000 tokens.

Conversely, if the limit is set very high, the LLM can be slow and require significant computational power.
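Exact token counts depend on each model's tokenizer, but a common rule of thumb for English text is roughly 4 characters per token. A rough budget check might look like this (the heuristic and function names are illustrative, not a Bedrock API):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of
    thumb for English; real tokenizers vary per model."""
    return max(1, len(text) // 4)


def fits_token_limit(prompt: str, limit: int) -> bool:
    """Check whether a prompt plausibly fits within a token limit."""
    return estimate_tokens(prompt) <= limit
```

By this estimate, a 1000-word document (around 6,000 characters) is roughly 1,500 tokens, which is why it would not fit within a 1,000-token limit.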

Now let’s see the various model versions of the models provided by Amazon Bedrock.

====================================================

Amazon Titan:

  1. Titan Text Express
  2. Titan Text Lite
  3. Titan Text Embeddings
  4. Titan Multimodal Embeddings
  5. Titan Image Generator


Titan Text Express

LLM offering a balance of price and performance.

Max tokens: 8K

Languages: English (GA), 100+ languages available (Preview)

Fine-tuning supported: Yes

Supported use cases: Retrieval augmented generation, open-ended text generation, brainstorming, summarization, code generation, table creation, data formatting, paraphrasing, chain of thought, rewrite, extraction, Q&A, and chat.


Titan Text Lite

Cost-effective and highly customizable LLM. Right-sized for specific use cases, ideal for text generation tasks and fine-tuning.

Max tokens: 4K

Languages: English

Fine-tuning supported: Yes

Supported use cases: Summarization and copywriting.


Titan Text Embeddings

LLM that translates text into numerical representations.

Max tokens: 8K

Languages: 25+ languages

Fine-tuning supported: No

Embeddings: 1,536

Supported use cases: Text retrieval, semantic similarity, and clustering.
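Semantic similarity over embedding vectors (1,536 dimensions for this model) is usually measured with cosine similarity. A minimal sketch:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors: 1.0 means the
    same direction (semantically close), 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

In practice you would embed each text with the model, then rank candidates by their cosine similarity to the query's embedding.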


Titan Multimodal Embeddings

Powers accurate multimodal search and recommendation experiences.

Max tokens: 128

Max image size: 25 MB

Languages: English

Embeddings: 1,024 (default), 384, 256

Fine-tuning supported: Yes

Supported use cases: Search, recommendation, personalization


Titan Image Generator

Generate realistic, studio-quality images using text prompts.

Max tokens: 77

Max input image size: 25 MB

Languages: English

Fine-tuning supported: Yes

Supported use cases: Text to image generation, image editing, image variations.


Anthropic's Claude:

  1. Claude 3 Opus (Coming soon)
  2. Claude 3 Sonnet (Available now)
  3. Claude 3 Haiku (Coming soon)
  4. Claude 2.1
  5. Claude 2.0
  6. Claude 1.3
  7. Claude Instant

Claude 3 Opus (Coming soon)

Anthropic’s most powerful AI model, with top-level performance on highly complex tasks. It can navigate open-ended prompts and sight-unseen scenarios with remarkable fluency and human-like understanding.

Max Tokens: 200K

Languages: English, Spanish, Japanese, and multiple other languages.

Supported Use Cases: Task automation, interactive coding, research review, brainstorming and hypothesis generation, advanced analysis of charts & graphs, financials and market trends, forecasting.

Claude 3 Sonnet (Available now)

Claude 3 Sonnet strikes the ideal balance between intelligence and speed, particularly for enterprise workloads. It offers maximum utility and is engineered to be dependable for scaled AI deployments.

Max tokens: 200K

Languages: English, Spanish, Japanese, and multiple other languages.

Supported use cases: RAG or search & retrieval over vast amounts of knowledge, product recommendations, forecasting, targeted marketing, code generation, quality control, parse text from images.

Claude 3 Haiku (Coming soon)

Anthropic's fastest, most compact model for near-instant responsiveness. It answers simple queries and requests with speed.

Max tokens: 200K

Languages: English, Spanish, Japanese, and multiple other languages.

Supported use cases: Quick and accurate support in live interactions, translations, content moderation, optimize logistics, inventory management, extract knowledge from unstructured data.

Claude 2.1

Claude 2.1 is Anthropic’s latest large language model (LLM) with an industry-leading 200K token context window, reduced hallucination rates, and improved accuracy over long documents.

Max tokens: 200K

Languages: English and multiple other languages

Supported use cases: Summarization, Q&A, trend forecasting, comparing and contrasting multiple documents, and analysis. Claude 2.1 excels at the core capabilities of Claude 2.0 and Claude Instant.

Claude 2.0

Claude 2.0 is a leading LLM from Anthropic that enables a wide range of tasks from sophisticated dialogue and creative content generation to detailed instruction.

Max tokens: 100K

Languages: English and multiple other languages

Supported use cases: Thoughtful dialogue, content creation, complex reasoning, creativity, and coding.

Claude 1.3

Claude 1.3 is an earlier version of Anthropic's general-purpose LLM.

Max tokens: 100K

Languages: English and multiple other languages

Supported use cases: Search, writing, editing, outlining, and summarizing text; coding; and providing helpful advice about a broad range of subjects.

Claude Instant

Claude Instant is Anthropic's faster, lower-priced yet very capable LLM.

Max tokens: 100K

Languages: English and multiple other languages

Supported use cases: Casual dialogue, text analysis, summarization, and document comprehension.


AI21 Labs :

  1. Jurassic-2 Ultra
  2. Jurassic-2 Mid

Jurassic-2 Ultra

AI21’s most powerful model for complex text generation tasks that require the highest quality output.

Max tokens: 8,192

Languages: English, Spanish, French, German, Portuguese, Italian, and Dutch

Supported use cases: Question answering, summarization, draft generation, advanced information extraction, ideation for tasks requiring intricate reasoning and logic.

Jurassic-2 Mid

AI21’s mid-sized model for advanced text generation tasks that require both quality and affordability.

Max tokens: 8,192

Languages: English, Spanish, French, German, Portuguese, Italian, and Dutch

Supported use cases: Question answering, summarization, draft generation, advanced information extraction, ideation.


Cohere :

  1. Command
  2. Command Light
  3. Embed - English
  4. Embed - Multilingual


Command

Command is Cohere’s generative large language model (LLM) (52B parameters).

Max tokens: 4K

Languages: English

Supported use cases: Chat, text generation, text summarization.

Command Light

Command Light is a smaller version of Command, Cohere's generative LLM (6B parameters).

Max tokens: 4K

Languages: English

Supported use cases: Chat, text generation, text summarization.

Embed - English

Embed is Cohere's text representation, or embeddings, model. This version supports English only.

Dimensions: 1024

Languages: English

Supported use cases: Semantic search, retrieval augmented generation (RAG), classification, clustering.

Embed - Multilingual

Embed is Cohere's text representation, or embeddings, model. This version supports multiple languages.

Dimensions: 1024

Languages: Multilingual (100+ supported languages)

Supported use cases: Semantic search, retrieval-augmented generation (RAG), classification, clustering.


Meta :

  1. Llama-2-13b-chat
  2. Llama-2-70b-chat


Llama-2-13b-chat

Fine-tuned model in the parameter size of 13B. Suitable for smaller-scale tasks such as text classification, sentiment analysis, and language translation.

Max tokens: 4K

Languages: English

Supported use cases: Assistant-like chat

Llama-2-70b-chat

Fine-tuned model in the parameter size of 70B. Suitable for larger-scale tasks such as language modeling, text generation, and dialogue systems.

Max tokens: 4K

Languages: English

Supported use cases: Assistant-like chat


Stable Diffusion XL :

  1. Stable Diffusion XL 1.0
  2. Stable Diffusion XL 0.8


Stable Diffusion XL 1.0

The most advanced text-to-image model from Stability AI.

Max tokens: 77-token limit for prompts

Languages: English

Supported use cases: Advertising and marketing, media and entertainment, gaming and metaverse.


Stable Diffusion XL 0.8

Text-to-image model from Stability AI.

Max tokens: 77-token limit for prompts

Languages: English

Supported use cases: Advertising and marketing, media and entertainment, gaming and metaverse.


Mistral AI :

  1. Mistral 7B
  2. Mixtral 8X7B


Mistral 7B

A 7B dense Transformer, fast to deploy and easily customizable. Small, yet powerful for a variety of use cases.

Max tokens: 8K

Languages: English

Supported use cases: Text summarization, structuration, question answering, and code completion.


Mixtral 8X7B

An 8x7B sparse Mixture-of-Experts model with stronger capabilities than Mistral 7B. Uses 12B active parameters out of 45B total.

Max tokens: 32K

Languages: English, French, German, Spanish, Italian

Supported use cases: Text summarization, structuration, question answering, and code completion.

====================================================

Randomness and diversity for Text and Chat Models:

Temperature: Large language models (LLMs) use probability to construct the words in a sequence. For any given sequence, there is a probability distribution of options for the next word in the sequence. When you set the temperature closer to zero, the model tends to select the higher-probability words. When you set the temperature farther away from zero, the model might select a lower-probability word.

T closer to 0 => model selects higher-probability words

T farther from 0 => model selects lower-probability words


Top P: Top P defines a cutoff based on the sum of probabilities of the potential choices. If you set Top P below 1.0, the model considers the most probable options and ignores the less probable ones.

P < 1.0 => Considers most probable options and ignores the less probable ones.

Response length: The response length configures the maximum number of tokens to use in the generated response.

Stop sequences: A stop sequence is a sequence of characters. If the model encounters a stop sequence, it stops generating further tokens. Different models support different types of characters in a stop sequence, different maximum sequence lengths, and may support the definition of multiple stop sequences.


Setting up Bedrock:

=> Search for Amazon Bedrock in your Management Console.

=> Click Get started.

=> Take a look at the FMs in that particular Region. (Availability changes with Region.)

List of Regions:

  1. US East (N. Virginia)
  2. US West (Oregon)
  3. Asia Pacific (Singapore)
  4. Europe (Frankfurt)
  5. Asia Pacific (Tokyo)

Note: These are the five Regions where Amazon Bedrock can be accessed!

US East (N. Virginia):

AI21 Labs

Jurassic-2 Ultra

Jurassic-2 Mid


Amazon

Titan Embeddings G1 - Text

Titan Text G1 - Lite

Titan Text G1 - Express

Titan Image Generator G1

Titan Multimodal Embeddings G1


Anthropic

Claude 3 Sonnet

Claude

Claude Instant


Cohere

Command

Command Light

Embed English

Embed Multilingual


Meta

Llama 2 Chat 13B

Llama 2 Chat 70B

Llama 2 13B

Llama 2 70B


Stability AI

SDXL 0.8

SDXL 1.0


US West (Oregon):

AI21 Labs

Jurassic-2 Ultra

Jurassic-2 Mid


Amazon

Titan Embeddings G1 - Text

Titan Text G1 - Lite

Titan Text G1 - Express

Titan Image Generator G1

Titan Multimodal Embeddings G1


Anthropic

Claude 3 Sonnet

Claude

Claude Instant


Cohere

Command

Command Light

Embed English

Embed Multilingual


Meta

Llama 2 Chat 13B

Llama 2 Chat 70B

Llama 2 13B

Llama 2 70B


Mistral AI

Mistral 7B Instruct

Mixtral 8x7B Instruct


Stability AI

SDXL 0.8

SDXL 1.0


Asia Pacific (Singapore):

Anthropic

Claude

Claude Instant


Asia Pacific (Tokyo):

Amazon

Titan Embeddings G1 - Text

Titan Text G1 - Express


Anthropic

Claude

Claude Instant


Europe (Frankfurt):

Amazon

Titan Embeddings G1 - Text

Titan Text G1 - Express


Anthropic

Claude

Claude Instant


=> Select Model access to request access to the FMs you wish to work with.

=> Select Manage model access and check the boxes for the models you want.

=> Click the Save button at the bottom to get access to those models.
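After access is granted, you can check which FMs your account sees in a Region with the Bedrock control-plane `ListFoundationModels` API. A sketch using boto3 (the live call requires AWS credentials):

```python
from collections import defaultdict


def models_by_provider(summaries: list[dict]) -> dict[str, list[str]]:
    """Group model summaries (the `modelSummaries` entries returned by
    ListFoundationModels) by provider name."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for summary in summaries:
        grouped[summary["providerName"]].append(summary["modelId"])
    return dict(grouped)


def list_region_models(region: str = "us-east-1") -> dict[str, list[str]]:
    """Fetch and group the FMs visible in the given Region."""
    import boto3  # needs AWS credentials configured

    bedrock = boto3.client("bedrock", region_name=region)
    summaries = bedrock.list_foundation_models()["modelSummaries"]
    return models_by_provider(summaries)
```

Running `list_region_models` in different Regions reproduces the per-Region lists above.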

Pricing:

On-Demand

With the On-Demand mode, you only pay for what you use, with no time-based term commitments. For text generation models, you are charged for every input token processed and every output token generated. For embeddings models, you are charged for every input token processed. A token comprises a few characters and is the basic unit of text a model uses to understand user input and prompts. For image generation models, you are charged for every image generated.


Batch

With Batch mode, you can provide a set of prompts as a single input file and receive responses as a single output file, allowing you to get simultaneous large-scale predictions. The responses are processed and stored in your Amazon S3 bucket so you can access them at a later time. Pricing for Batch mode is the same as pricing for On-Demand mode.


Provisioned Throughput

With the Provisioned Throughput mode, you can purchase model units for a specific base or custom model. The Provisioned Throughput mode is primarily designed for large, consistent inference workloads that need guaranteed throughput. Custom models can only be accessed using Provisioned Throughput. A model unit provides a certain throughput, which is measured by the maximum number of input or output tokens processed per minute. With Provisioned Throughput pricing, you are charged by the hour, with the flexibility to choose between 1-month and 6-month commitment terms.


Model customization

With Amazon Bedrock, you can customize FMs with your data to deliver tailored responses for specific tasks and your business context. You can fine-tune models with labeled data or use continued pre-training with unlabeled data. For customization of a text generation model, you are charged for model training based on the total number of tokens processed by the model (the number of tokens in the training data corpus times the number of epochs) and for model storage, charged per month per model. An epoch refers to one full pass through your training dataset during fine-tuning or continued pre-training. Inference using customized models is charged under the Provisioned Throughput plan and requires you to purchase Provisioned Throughput. One model unit is made available with no commitment term for inference on a customized model; you are charged for the number of hours you use that first model unit. If you want to increase your throughput beyond one model unit, you must purchase a 1-month or 6-month commitment term.
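The training charge described above is simple arithmetic: corpus tokens times epochs, at the model's per-token training price. A sketch (the price used below is hypothetical; see the Bedrock pricing page for real rates):

```python
def training_tokens(corpus_tokens: int, epochs: int) -> int:
    """Total tokens billed for customization: each full pass (epoch)
    over the corpus processes every token once."""
    return corpus_tokens * epochs


def training_cost(corpus_tokens: int, epochs: int,
                  price_per_1k_tokens: float) -> float:
    """Estimated training charge at a given per-1K-token price
    (hypothetical rate, for illustration only)."""
    return training_tokens(corpus_tokens, epochs) / 1000 * price_per_1k_tokens
```

For example, a 1M-token corpus trained for 3 epochs processes 3M tokens; at a hypothetical $0.008 per 1K tokens, that is about $24, plus monthly model storage and the Provisioned Throughput needed for inference.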


Playgrounds:

=> A playground is a place where you can test the FMs and have some fun with them.


There are three Playgrounds,

Chat Playground

Text Playground

Image Playground


=> Select the Model which you wish to play with and have some fun!


I have an additional playground for you!

Amazon PartyRock:

=> Develop your own GenAI application from a single prompt and learn GenAI fundamentals without writing any code.

=> The app then gives you exactly what you asked for in the prompt.

=> Feel free to play around with it by changing its configuration.

=> You can construct as many widgets as you want in the PartyRock app.

=> I’ll share my public PartyRock app in this article:

AWS Study Buddy - https://partyrock.aws/u/JK/ev73BeS66/AWS-StudyBuddy

Advantages of Amazon Bedrock:

  1. Simplifies development: Single platform for various models.
  2. Reduces costs: Pre-trained models and managed infrastructure.
  3. Secure and responsible AI: Built-in security and privacy features.
  4. Scalability: Serverless architecture for easy scaling.
  5. Keeps you updated: Allows exploration of latest advancements in AI.

Disadvantages of Amazon Bedrock:

  1. Limited control: Less control over infrastructure and training.
  2. Potential vendor lock-in: Reliance on a single vendor.
  3. Costs involved: Service usage incurs costs.
  4. Limited transparency: Inner workings of models might not be clear.

Thank you so much!! Hope you enjoyed my content.

Happy Learning!


People who inspired me :

Jen Looper, Aditi Sawhney, Padmini Subramanian, Tracy Wang, Lucy Wang, Stéphane Maarek, Ilanchezhian Ganesamurthy, Ayyanar Jeyakrishnan (AJ), Vishal Alhat, Dheeraj Choudhary, Abinaya Devi S V, Aliya Shaikh, Sabiha Ashik, Krishnan Balasekar, Marishwaran K A V, Giri Venkatesan, Jiju Thomas Mathew, Maheswarakumar Muthusamy, Bhuvana R., Raghul Gopal, Ashish Prajapati, Murali Doss, Aarthi Ranganathan, Sridevi Murugayen, Jones Zachariah Noel N, Brooke Jamieson, Greg Powell, Shubham Londhe, Viktor Ardelean, Vedha Sankar

Exploring Amazon Bedrock opens up infinite possibilities. As Elon Musk says, persistent pursuit is key to innovation. Let's keep pushing boundaries! #Innovation #TechFuture