Using Groq from Semantic Kernel

András Velvárt

AI Tinkerer. Head of AI at HexaIO. AR/VR, HoloLens Consultant. 17 times Microsoft MVP. CEO at Response Ltd. Pluralsight and LinkedIn Learning Author, International Speaker.

Published: April 22, 2024

What is Groq?

Groq (www.groq.com) is an engine for running Large Language Models at insane speeds: the AI generates text 20-50 times faster than with the traditional approach of using dedicated GPUs in a data center.

Groq achieves this by using a combination of streamlined hardware and special software. You can learn more about how and why it works on their explanation page - but for now, it is enough to know that Groq can be insanely fast (and therefore efficient and cheap).

Groq can run a number of open source AI models, such as:

Meta's Llama 2;
the newly released Llama 3;
Mixtral 8x7B;
and Google's Gemma 7B.

What is Semantic Kernel?

Semantic Kernel is Microsoft's approach to AI middleware, and it is the best way to create real-world, production- and enterprise-ready AI applications in C#. Python and Java modules are also being developed.

The Problem

Semantic Kernel supports language models running on OpenAI's Services or Azure OpenAI. It does not explicitly provide support for any other service, such as Groq.

The Solution

Fortunately, Groq offers an OpenAI-compatible service endpoint at the base URL:

https://api.groq.com/openai/v1

If only we could trick Semantic Kernel into using this endpoint instead of the standard OpenAI endpoints, we should be good. There are some limitations compared to the full OpenAI / Azure OpenAI Service, but the basics should work.
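To make "OpenAI compatible" concrete, here is a minimal sketch of a request aimed directly at that base URL, without Semantic Kernel. The names `GroqRequestSketch` and `BuildChatRequest` are hypothetical, and the `/chat/completions` path and JSON body are assumed to follow the OpenAI chat convention. The method only builds the request, so it can be inspected without a network call.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

// Hypothetical helper for illustration; not part of Semantic Kernel or any Groq SDK.
public static class GroqRequestSketch
{
    // Base URL from the article.
    public const string BaseUrl = "https://api.groq.com/openai/v1";

    public static HttpRequestMessage BuildChatRequest(string apiKey, string model, string prompt)
    {
        // OpenAI-style chat payload: a model ID plus a list of role/content messages.
        var payload = new
        {
            model,
            messages = new[] { new { role = "user", content = prompt } }
        };

        var request = new HttpRequestMessage(HttpMethod.Post, $"{BaseUrl}/chat/completions")
        {
            Content = new StringContent(JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json")
        };

        // The API key travels as a bearer token, as it does with OpenAI.
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
        return request;
    }
}
```

Passing the result to `HttpClient.SendAsync` with a valid Groq API key should return an OpenAI-style JSON response, subject to the limitations mentioned above.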

The Code

Semantic Kernel doesn't allow us to change the base URL of the service it uses - but it does allow injecting a custom HttpClient that it will use for the requests. So, if we can somehow hijack the HTTP calls and change the URLs, we should be fine.

We can do this by creating the HttpClient with a custom delegating handler, and using that for the OpenAI ChatCompletion service:

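// Route all Semantic Kernel HTTP traffic through the URL-rewriting handler defined below.
HttpClient httpClient = new(new CustomDelegatingHandler());
// "key" holds your Groq API key; the model ID must be one that Groq hosts.
kernelBuilder.AddOpenAIChatCompletion("llama3-70b-8192", key, httpClient: httpClient);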

The actual CustomDelegatingHandler looks like this:

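using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Uses a C# 12 primary constructor to pass the inner handler to the base class.
public class CustomDelegatingHandler() : DelegatingHandler(new HttpClientHandler())
{
    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Swap the OpenAI base URL for Groq's; the rest of the path and query stays untouched.
        if (request.RequestUri is { } uri)
            request.RequestUri = new Uri(uri.ToString().Replace("https://api.openai.com/v1", "https://api.groq.com/openai/v1"));
        return await base.SendAsync(request, cancellationToken);
    }
}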

(The key here is that we intercept every HTTP request and replace "api.openai.com/v1" with "api.groq.com/openai/v1" in the URL.)
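One way to check the rewrite without calling Groq is to chain the handler in front of a stub that records the final URL. The sketch below assumes a variant of the handler that accepts its inner handler as a constructor argument. `RedirectingHandler`, `CapturingHandler`, and `RedirectCheck` are hypothetical names introduced only for this illustration.

```csharp
#nullable enable
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical variant of the article's handler: same URL swap, but the inner handler is injectable.
public class RedirectingHandler : DelegatingHandler
{
    public RedirectingHandler(HttpMessageHandler inner) : base(inner) { }

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        if (request.RequestUri is { } uri)
        {
            request.RequestUri = new Uri(uri.ToString().Replace("https://api.openai.com/v1", "https://api.groq.com/openai/v1"));
        }
        return base.SendAsync(request, cancellationToken);
    }
}

// Stub that records the URI it receives and answers 200 OK without touching the network.
public class CapturingHandler : HttpMessageHandler
{
    public Uri? LastUri { get; private set; }

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        LastUri = request.RequestUri;
        return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
    }
}

public static class RedirectCheck
{
    // Sends a GET to originalUrl through the redirecting handler and returns the URI the stub saw.
    public static async Task<Uri?> RewrittenUriAsync(string originalUrl)
    {
        var capture = new CapturingHandler();
        using var client = new HttpClient(new RedirectingHandler(capture));
        using var response = await client.GetAsync(originalUrl);
        return capture.LastUri;
    }
}
```

Calling `RedirectCheck.RewrittenUriAsync("https://api.openai.com/v1/chat/completions")` yields `https://api.groq.com/openai/v1/chat/completions`. A URL that does not contain the OpenAI base passes through unchanged.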

If you want to see a more complex example, I have published a complete chat sample with Semantic Kernel and Groq at https://gist.github.com/vbandi/c598232952729a1828374fb76943cfcd
