Using AutoGen and LM Studio with Any Open-Source LLM: A Tutorial
Santosh Kumar Pandey
Manager Advisory @ PwC AC | Solution Architect | AI Enthusiast | Developer
What is an Agent in the Context of AI?
In the context of artificial intelligence (AI), an agent is an entity that perceives its environment through inputs (sensors) and acts upon that environment through actuators to achieve specific goals, such as executing a workflow.
What is Microsoft AutoGen?
Microsoft AutoGen (https://autogen-studio.com/autogen-studio-ui) is an open-source framework designed to simplify the development of applications that use large language models (LLMs). The framework emphasizes an autonomous, agent-based architecture, facilitating task automation and the coordination of multiple language models. Key features and components of AutoGen include multi-agent conversations, customizable agent roles, optional human-in-the-loop input, built-in code execution, and AutoGen Studio, a low-code web interface for composing agents and workflows.
What is LM Studio?
LM Studio (https://lmstudio.ai/) is a desktop application that simplifies running and interacting with large language models (LLMs) on personal computers. It provides a user-friendly interface for deploying and using LLMs, particularly those optimized to run on local hardware without extensive cloud resources. Key features of LM Studio include a built-in browser for discovering and downloading models (for example, GGUF builds from Hugging Face), a chat interface for talking to them, and a local server that exposes an OpenAI-compatible API.
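As a quick illustration, once LM Studio's local server is running (covered below), any OpenAI-compatible client can talk to it. This is a minimal sketch assuming the server's defaults; "local-model" is a placeholder and should match whatever model you have loaded:

from openai import OpenAI

# Point the standard OpenAI client at LM Studio's local server.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # placeholder; use the identifier LM Studio shows
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)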
What is Ollama?
Ollama (https://ollama.com/) is an open-source tool for running large language models (LLMs) locally. It provides a simple command-line interface and a local REST API for downloading, managing, and querying models on your own machine, without relying on cloud resources. Key features of Ollama include a curated model library, one-command model downloads, and a local HTTP API that applications can call.
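For illustration, Ollama's REST API listens on port 11434 by default. This is a minimal sketch assuming Ollama is running and the example model "llama3" has already been pulled:

import requests

# Non-streaming generation request against Ollama's local API.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": False},
)
print(response.json()["response"])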
I hope these short descriptions help you understand the basic concepts. Let's get started.
Download and Install Ollama for Windows:
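After running the Windows installer from ollama.com, you can verify the installation and fetch a model from a terminal. "llama3" below is just an example model name:

ollama --version        # confirm the install
ollama pull llama3      # download an example model
ollama run llama3       # chat with it interactively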
Download and Install Autogen:
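AutoGen Studio is distributed as a Python package, so this step assumes a working Python environment. A typical install and launch looks like this (the port number is arbitrary):

pip install autogenstudio
autogenstudio ui --port 8081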
Download and Install LM Studio:
By following these steps, you'll be able to set up and run Ollama, AutoGen Studio, and LM Studio on your Windows machine, allowing you to explore and experiment with various models.
To run the model locally, open LM Studio, load the model you downloaded, and start the server from LM Studio's local-server view. This starts a local server at http://localhost:1234/v1 with the default API key set to "lm-studio".
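You can confirm the server is up with a quick request; the models returned will depend on what you have loaded in LM Studio:

curl http://localhost:1234/v1/models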
Next, we need to register the model in AutoGen Studio using the base URL and API key from the previous step:
Give the model a name, enter the API key, and input the base URL. Providing a description is optional but recommended to avoid confusion. Finally, click on "Test Model" to ensure everything is set up correctly.
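If the UI test fails, it can help to reproduce the call outside AutoGen Studio. The sketch below is a rough equivalent of what "Test Model" does, using the defaults from the previous step; "local-model" is a placeholder:

import requests

# Same OpenAI-compatible endpoint that AutoGen Studio will call.
headers = {"Authorization": "Bearer lm-studio"}
payload = {
    "model": "local-model",  # placeholder; match the model loaded in LM Studio
    "messages": [{"role": "user", "content": "ping"}],
}
r = requests.post("http://localhost:1234/v1/chat/completions", json=payload, headers=headers)
print(r.status_code)
print(r.json()["choices"][0]["message"]["content"])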
Great! Now that our model is running on the local server, we need to create an agent that communicates with it. Follow these steps in AutoGen Studio's agent configuration screen; for reference, a rough code equivalent is sketched below.
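This is roughly what the same setup looks like in code with the pyautogen package. It is a minimal sketch, not the exact configuration AutoGen Studio generates, and "local-model" is again a placeholder:

from autogen import AssistantAgent, UserProxyAgent

# Route AutoGen's LLM calls to the LM Studio server instead of the OpenAI API.
config_list = [{
    "model": "local-model",  # placeholder; match the model loaded in LM Studio
    "base_url": "http://localhost:1234/v1",
    "api_key": "lm-studio",
}]

assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",     # fully automated; no human prompt at each turn
    code_execution_config=False,  # disable local code execution for this sketch
)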
Okay, now that we have downloaded the model and created the user proxy agent, the final step is to create a workflow that executes the agent. Here's how (a code equivalent is sketched below):
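In code, executing the workflow amounts to starting a conversation between the two agents defined above (continuing the earlier sketch):

# The user proxy sends the prompt and relays replies until the chat terminates.
user_proxy.initiate_chat(assistant, message="Write a haiku about running LLMs locally.")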
Great! We are done. You can now test your workflow by clicking "Test Workflow" and giving the agent a prompt to see it in action. Alternatively, you can open the Playground in AutoGen Studio, create a new session attached to your workflow, and chat with the agent there.
Enjoy experimenting with your model!
Note: Response time depends on the machine running the model; more powerful hardware will generally respond faster.
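For reference, below is a sample AutoGen Studio skill for image generation. Note that it calls OpenAI's DALL-E API, so it requires a valid OpenAI API key (for example, via the OPENAI_API_KEY environment variable).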
##### Begin of Discussion #####
# from skills import Discussion  # Import the function from skills.py
from typing import List
import uuid
from pathlib import Path

import requests  # to perform HTTP requests
from openai import OpenAI


def generate_and_save_images(query: str, image_size: str = "1024x1024") -> List[str]:
    """
    Function to paint, draw or illustrate images based on the user's query or request.
    Generates images from a given query using OpenAI's DALL-E model and saves them to disk.
    Use the code below anytime there is a request to create an image.

    :param query: A natural language description of the image to be generated.
    :param image_size: The size of the image to be generated (default is "1024x1024").
    :return: A list of filenames for the saved images.
    """
    client = OpenAI()  # Initialize the OpenAI client
    # Generate images with DALL-E 3
    response = client.images.generate(model="dall-e-3", prompt=query, n=1, size=image_size)

    # List to store the file names of saved images
    saved_files = []

    # Check if the response contains image data
    if response.data:
        for image_data in response.data:
            # Generate a random UUID as the file name (the generated image is a PNG)
            file_name = str(uuid.uuid4()) + ".png"
            file_path = Path(file_name)

            # Download the generated image from the returned URL
            img_url = image_data.url
            img_response = requests.get(img_url)
            if img_response.status_code == 200:
                # Write the binary content to a file
                with open(file_path, "wb") as img_file:
                    img_file.write(img_response.content)
                print(f"Image saved to {file_path}")
                saved_files.append(str(file_path))
            else:
                print(f"Failed to download the image from {img_url}")
    else:
        print("No image data found in the response!")

    # Return the list of saved files
    return saved_files


# Example usage of the function:
# generate_and_save_images("A cute baby sea otter")
#### End of Discussion ####