The Evolution of Hubz
by DataHubz


Sharing some hot news!

When we first designed the DataHubz SaaS Platform in October 2024, we defined Hubz as a data management and data analytics AI assistant. Basically, Hubz was designed to operate the platform's features on behalf of the user.

Contrary to what many people think, even in AI-powered data solutions, LLMs, although extremely resourceful and important, are not the major part of this type of platform. So we were more concerned about going in the right direction for our particular case than about jumping on the hype train as soon as we could.

We first worked with Gemini 1.5 Pro, with a 128k-token context window (which was super helpful at the time) and over 200B parameters (Gemini 1.5 Pro now offers a context window of over 2M tokens). We then worked, simultaneously, with ChatGPT 4o (128k-token context window, 1.8T parameters!!) and Claude 3.5 Sonnet (200k context window, and a publicly unknown number of parameters; speculation ranges from 175B to 400B).

We assessed the role and quality of the above LLMs in context-specific scenarios. This is an important point: we were not evaluating these tools as general solutions.

Our assessment considered the following categories: instruction following, truthfulness, relevance, coherence/consistency, completeness, and compliance (including both standard/global and custom/local guidelines). We didn't test for creativity, but we plan to do that very soon, since there is an ongoing debate on how to define, perceive, and assess creativity.

We developed a straightforward scoring system which ended up revealing some patterns we later used to find the best solution for our purposes (considering current expectations). We don't plan to disclose the scoring system, let alone the scores, at the moment.
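To make the idea of per-category scoring concrete, here is a minimal sketch. The category names come from the post; the 0–5 scale, the equal weighting, and every function name are illustrative assumptions, not DataHubz's actual (undisclosed) system.

```python
# Hypothetical per-category scoring rubric; an evaluator rates each model
# response per category and the scores are averaged into one number.

CATEGORIES = [
    "instruction_following", "truthfulness", "relevance",
    "coherence_consistency", "completeness", "compliance",
]

def aggregate_score(ratings: dict) -> float:
    """Average 0-5 ratings across all categories; unrated categories count as 0."""
    return sum(ratings.get(c, 0.0) for c in CATEGORIES) / len(CATEGORIES)

# A response rated 4/5 in every category aggregates to 4.0.
sample = {c: 4.0 for c in CATEGORIES}
print(aggregate_score(sample))  # 4.0
```

Averaging is only one possible aggregation; a weighted sum (e.g., weighting compliance more heavily) would surface different patterns across models.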

Among the lessons learned from our experiments: we could not see even a remote correlation between parameter count (the number of trainable weights) and high scores in the categories we considered for context-specific/context-aware tasks. That is fairly disappointing, considering how much larger ChatGPT 4o is than Gemini 1.5 Pro and Claude 3.5 Sonnet by parameter count. In fact, Claude 3.5 Sonnet almost consistently performed much better than Gemini 1.5 Pro and ChatGPT 4o on context-specific/context-aware tasks. In contrast, Gemini 1.5 Pro almost consistently provided the poorest results among these three options.

One day, the OpenAI and Anthropic services both went offline for many hours. We are not sure what happened, but it was annoying, to say the least. We were left with Gemini 1.5 Pro, the one LLM that was scoring the lowest for our purposes. That was it for us. The potential vendor lock-in problems were already bothering us a lot, and that outage was the last push we needed to rethink our strategy (we were already considering not using external APIs).

We then decided to give Llama 3.2 a try, and that changed everything. Right off the bat, Llama yielded results superior to Gemini 1.5 Pro and comparable to ChatGPT 4o and Claude 3.5 Sonnet (again, for context-specific tasks). This alone is a remarkable result for us, since Llama 3.2 has only 3.2B parameters, a 131k-token context window with an embedding length of 3,072, and quantization of just 4 bits!! But in our first round of efforts, we could not demonstrate superiority in any particular case.

To keep things in perspective and highlight how impressive this result is, ChatGPT 4o is about 562.5 times larger than Llama 3.2 by parameter count.
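A quick sanity check of that ratio, using the parameter counts quoted above (GPT-4o's 1.8T figure is itself public speculation, not a confirmed number):

```python
# Size comparison by parameter count, using the post's figures.
gpt4o_params = 1.8e12   # speculated
llama32_params = 3.2e9  # Llama 3.2 (small variant)

print(gpt4o_params / llama32_params)  # 562.5
```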

As we advanced with the development of the DataHubz platform, Hubz started to outgrow its role as an assistant and began to look more like a sub-AI platform. A platform within the scope of another platform (talk about inception).

Without giving away too much, here is what we did. We created two types of orchestration mechanisms:

  1. “Mechanical” orchestrations (which we refer to as “components”) and
  2. “Intelligent” orchestrations (which we refer to as “AI agents”).

The components are the operational building blocks of Hubz as an AI platform, and the AI agents are engines that elevate the quality of the results through reasoning and automation. We added persistence and configuration across the solution (as well as a centralized knowledge base), and we established the notion of context as one of the central concerns of the platform.
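The two orchestration styles above can be sketched roughly as follows. Every class, function, and routing rule here is an illustrative assumption; the real Hubz design is not public, and a real agent would consult an LLM rather than the keyword rule stubbed in below.

```python
# Hypothetical sketch: "mechanical" components vs. "intelligent" agents,
# sharing a persistent context (the stateful, context-aware part).
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Context:
    """Persistent shared state: centralized knowledge base + history."""
    knowledge_base: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)

class Component:
    """'Mechanical' orchestration: a fixed pipeline of deterministic steps."""
    def __init__(self, steps: List[Callable[[str, Context], str]]):
        self.steps = steps

    def run(self, data: str, ctx: Context) -> str:
        for step in self.steps:
            data = step(data, ctx)
        ctx.history.append(data)  # persist the result across calls
        return data

class Agent:
    """'Intelligent' orchestration: decides which component to invoke.
    Stubbed with a keyword rule where an LLM call would go."""
    def __init__(self, components: Dict[str, Component]):
        self.components = components

    def run(self, request: str, ctx: Context) -> str:
        name = "analytics" if "analyze" in request.lower() else "management"
        return self.components[name].run(request, ctx)

# Usage: one shared context, one trivial cleaning pipeline behind both routes.
ctx = Context()
clean = Component([lambda d, c: d.strip().lower()])
agent = Agent({"analytics": clean, "management": clean})
print(agent.run("  Analyze Q3 sales  ", ctx))  # analyze q3 sales
```

The key design point is that the context object outlives any single component run, which is what lets later requests build on earlier results.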

The result, for us, is nothing short of exciting: we are producing a stateful and context-aware AI platform.

Recall that at the moment, we are not interested in general activities. Other AI solutions will probably be better for more generic tasks (which are, in any case, out of our scope). But for the contexts we are concerned with, Hubz produces superior results in virtually all context-aware tasks, and in the worst case results comparable to leading (and extremely expensive) LLM and LLM-powered solutions.

What more could we ask for?

Due to this result, we are now decoupling Hubz from the DataHubz platform so people can use it without being an enterprise user of the platform.

Keep in mind that Llama 3.2 is a tiny LLM (emphasis on the tiny). It runs on any personal computer. Think about the carbon footprint of this solution, the overall cost savings, and the scaling opportunities (we are far from exhausting the possibilities here). Where, then, is the "secret sauce"? In the orchestration layer, the stateful nature of the platform, and the context-aware knowledge base.
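A back-of-the-envelope calculation shows why a 4-bit, 3.2B-parameter model fits on a laptop. This counts raw weights only, using the figures quoted earlier; runtime overhead (KV cache, activations) is ignored.

```python
# Raw weight storage for a 3.2B-parameter model quantized to 4 bits.
params = 3.2e9
bits_per_weight = 4

weight_bytes = params * bits_per_weight / 8  # 8 bits per byte
print(weight_bytes / 1e9)  # 1.6 -> about 1.6 GB of weights
```

At roughly 1.6 GB, the weights fit comfortably in ordinary consumer RAM, with headroom left for the rest of the stack.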

If you are curious, our supported contexts right now include only data management, data analytics, and data visualization. But these results are the confirmation we needed to expand to more contexts. Hubz works outside these contexts too, but supported contexts lead to superior results.

We plan to start releasing beta access to Hubz in the next couple of weeks.

#startup #founder #data #ai
