登录查看更多内容

Getting Started with Gemini 1.5 Pro and Google AI Studio:

chamindu lakshan

Out of the box thinker/YouTubepreneuer/programmer/Wordpress and Wix Designer

发布日期: 2024年4月20日

What is Google AI Studio?

Google AI Studio is a web-based environment where developers can write, run, and test prompts using Google’s Gemini models. Additionally, if you want to use the Gemini API, you can get your API key from inside Google AI Studio. Broadly, it is designed to be a simple entry point for developers to not just use models but also get started building with the Gemini API. If you don’t want to use the Gemini API, you can skip the API key step all together and just test the models.

The Google AI Studio Basics

If you’re familiar with OpenAI’s Playground, then much of what is about to be discussed will be familiar. Let’s walk through the basic UI as is shown below:

Regardless of what mode you select, the “Run Settings” will be the same.

Model: Currently, Google is offering three different models, Gemini 1.0 Pro, Gemini 1.5 Pro, and Gemini 1.0 Pro 001 (Tuning). Each of these models has their own unique benefits. For example, Gemini 1.5 Pro allows the user to insert images in addition to video, audio, and other files. You can learn more about Google’s LLMs in their documentation.
Temperature: This variable controls the “creativity level” of the model. By increasing this value, the model will choose less statistically likely tokens when creating responses. The best way to understand the impact of this variable is to experiment with it yourself and see how the outputs change.
Stop Sequence: This variable makes the model stop generating tokens when a specific word/phrase is generated. For example, if my stop sequence was “world” and I prompted the model to say “Hello world”, the output generated would be “Hello”. This means that the stop sequence will never be displayed/created by the model.

Safety Settings: Due to the nature of Large Language Models, responses can sometimes be unpredictable. While steps have been taken to ensure the model generates appropriate responses, Google created this management tool to ensure developers can better control outputs.

Top K: This variable determines if the model will select the most probable next tokens. A high Top K value will have the model consider a larger number of (lower probability) tokens while a higher Top K value will make the outputs more deterministic. In other words, the higher the Top K value, the more predictable the model is.
Top P: This setting is available under the “Advanced settings” in the bottom right corner. This variable influences the amount of tokens that the model considers when generating a response. To be clear, this variable does not impact the context window, because the context window is used in the input phase (before the generation has started) and the Top P only impacts how the response is generated. In other words, the Top P value dictates the randomness of the model’s output.

Google AI Studio’s different Modes

Currently, Google AI Studio is offering three distinct modes when creating with the Gemini API. These options can be selected by clicking “Create new” in the top left corner, as seen below.

领英推荐

Open-Source AI vs. Proprietary Models: Pros and Cons -…

Analytics Insight? 1 个月前

Exploring Google Gemini 1.5 Pro

GDG Cloud Lahore 11 个月前

Turning browsers into smart agents with GPT + ARIA

The General Partnership 1 年前

Each of these modes is meant to address a specific use case.

Chat Prompt: This commonly seen prompt type is used in chatbots like ChatGPT and Google’s Gemini chatbot. It is used to respond to user queries in a conversational manner. This is where you can customize the chatbot to speak or act in a certain way. Want a friendly custom service chatbot? A sarcastic chatbot that talks down to you? This is where you would tell the model to act in a certain way.

Freeform Prompt: This prompt type is used for open-ended responses. Creative writing, brainstorming, and learning assistance can all excel using this mode. Many writing tools and products use this or similar processes.
Structured Prompt: Uniquely, this prompt type has the user provide examples (sample data) of queries and responses to the desired behaviour. In the below example, I gave sample questions and responses about which US cities are best. Obviously, this question will be different for everyone, but we can see that the model followed the examples it was given.

How to use the multimodal features available in Google AI Studio

One of the most unique features of the Google AI Studio is that various file types can be used in the environment. These include images, videos, audio, and files from Google Drive. This means that developers can easily test out if their idea works, and how to work through any bugs. For example, if we use the Chat Prompt from above and add a video we want to be summarized, the model will access the inserted video and accomplish its task. Let’s take a look.

In this case, we used a five-minute video that showcased various dinosaur fossils. When prompted, the model can interact with the video and produce a summary of its contents. These types of use cases can’t be done in other AI Playgrounds (OpenAI for example).

When to use Google AI Studio vs Gemini

While the Google AI Studio is a powerful tool, it’s important to understand when you should use it vs Google’s Gemini chatbot. The Gemini chatbot is Google’s equivalent to OpenAI’s ChatGPT. Users can expect to engage with the model through conversation and with limited control of the reasoning and response. Alternatively, if users intend to make changes to the way the Gemini model responds or need an API key, Google AI Studio is the tool to achieve this. After testing and creating a project in the Studio, users can export their work directly to code by clicking “Get Code” in the top right corner. Once outside the Studio and connected to the Gemini API, users can connect to other APIs like the Keymate.AI API. By doing so, they can utilize features like Keymate Memory and Keymate’s Confidence Scoring to help identify hallucinations. Overall, it was impressive to see what the Gemini models were capable of, especially around multi-modal use cases.

Thank you ????????

Chami Notes

509 位关注者

要查看或添加评论，请登录

chamindu lakshan的更多文章

The Truth About To-Do Lists: Why Productivity Isn’t About Doing It All

2024年12月27日

The Truth About To-Do Lists: Why Productivity Isn’t About Doing It All

The Productivity market is a booming industry. Pushing and convincing everyone you can and should get everything done…
"The Art and Evolution of Performance Reviews: From Rituals to Real Impact"

2024年11月29日

"The Art and Evolution of Performance Reviews: From Rituals to Real Impact"

Performance Reviews: Transforming the Awkward Ritual into Real Conversations Ah, performance reviews — the annual…
Three Lessons From 127 Venture Capital Rejections

2024年11月29日

Three Lessons From 127 Venture Capital Rejections

From Vision to Closure: Lessons from Building Ikaria Two and a half years ago, I began building Ikaria, a personalized…
? Self-Reflection Through AI: Discovering Strengths and Weaknesses ?

2024年11月28日

? Self-Reflection Through AI: Discovering Strengths and Weaknesses ?

? Self-Reflection Through AI: Discovering Strengths and Weaknesses ? Have you ever tried asking ChatGPT…
"Transformers in Machine Learning: A Deep Dive (Part 2)"

2024年11月8日

"Transformers in Machine Learning: A Deep Dive (Part 2)"

The Decoder Segment Okay, so far we understand how the encoder segment works — i.e.

1 条评论
Forget About Marketing. Focus on Building An Awesome Product

2024年10月28日

Forget About Marketing. Focus on Building An Awesome Product

There’s a popular belief that if you “just build an amazing product, customers will come.” In an ideal world, this…

2 条评论
The 10-minute rule, and more tips to kick off your week

2024年9月17日

The 10-minute rule, and more tips to kick off your week

?? We’ve got 106 days left until the end of 2024. Let’s make them count.
Design Principles Used by Apple: For Better User Experience

2024年9月9日

Design Principles Used by Apple: For Better User Experience

As a child visiting my aunt for the first time in her grand five-story apartment, which had striking paint colors and a…

1 条评论
Understanding the wp_rel_ugc() Function in WordPress

2024年9月6日

Understanding the wp_rel_ugc() Function in WordPress

## Understanding the wp_rel_ugc() Function in WordPress The function, introduced in WordPress 5.3, is a powerful tool…
Next.js SEO: Best Practices for Higher Rankings

2024年8月8日

Next.js SEO: Best Practices for Higher Rankings

In addition to leveraging Next.js features for SEO, there are several other best practices you should consider to…

See all articles

Getting Started with Gemini 1.5 Pro and Google AI Studio:

chamindu lakshan

Out of the box thinker/YouTubepreneuer/programmer/Wordpress and Wix Designer

What is Google AI Studio?

The Google AI Studio Basics

Google AI Studio’s different Modes

领英推荐

How to use the multimodal features available in Google AI Studio

When to use Google AI Studio vs Gemini

Chami Notes

509 位关注者

chamindu lakshan的更多文章

社区洞察

其他会员也浏览了

Business & Technology Snapshot by TUATARA – April 2024

How to Build Docs for LLMs

How to Use Pinata with Cursor, Zed, and other LLMs

Zero to One: Build your first LLM Ai

DataPanthy #83

OpenAI Launches DALL·E 2 Now Available in Beta with Pricing

GPTs, GPTs everywhere, but AGI nowhere

LangSmith

Deploying DeepSeek-R1 Locally with a Custom RAG Knowledge Data Base

Web ML Monthly #14: India loves TensorFlow.js, 3 new demos, Meta AI runs segment anything in browser!

What is Google AI Studio?

The Google AI Studio Basics

Google AI Studio’s different Modes

领英推荐

How to use the multimodal features available in Google AI Studio

When to use Google AI Studio vs Gemini

Chami Notes

509 位关注者

chamindu lakshan的更多文章

The Truth About To-Do Lists: Why Productivity Isn’t About Doing It All

"The Art and Evolution of Performance Reviews: From Rituals to Real Impact"

Three Lessons From 127 Venture Capital Rejections

? Self-Reflection Through AI: Discovering Strengths and Weaknesses ?

"Transformers in Machine Learning: A Deep Dive (Part 2)"

Forget About Marketing. Focus on Building An Awesome Product

The 10-minute rule, and more tips to kick off your week

Design Principles Used by Apple: For Better User Experience

Understanding the wp_rel_ugc() Function in WordPress

Next.js SEO: Best Practices for Higher Rankings

社区洞察

其他会员也浏览了

Business & Technology Snapshot by TUATARA – April 2024

How to Build Docs for LLMs

How to Use Pinata with Cursor, Zed, and other LLMs

Zero to One: Build your first LLM Ai

DataPanthy #83

OpenAI Launches DALL·E 2 Now Available in Beta with Pricing

GPTs, GPTs everywhere, but AGI nowhere

LangSmith

Deploying DeepSeek-R1 Locally with a Custom RAG Knowledge Data Base

Web ML Monthly #14: India loves TensorFlow.js, 3 new demos, Meta AI runs segment anything in browser!