Reimagining Data Teams in the era of Generative AI
Reimagining data teams in the era of Generative AI

Reimagining Data Teams in the era of Generative AI

Innovative data teams are already improving their productivity with the assistance of Generative AI (GenAI) using ChatGPT, GitHub Copilot, Gemini, etc.

From data collection to deployment there is a place for Large Language Models (LLMs) in your data team.

The Data Science Process - Credit Chanin Nantasenamat

The journey often starts with a time-sensitive request from the business team:

Hi team, could someone help me understand all our expenses larger than $1K broken down by the account manager in the last quarter? Please send ASAP before the budget review tomorrow, thanks!

If we are lucky, the data might already be in the database so we can skip collection, cleaning, to focus on analysis and deploying / sharing the results to fulfill the request.

In the era of ChatGPT, we see modern data teams using the following components:

  • LLM: Most teams are using OpenAI through ChatGPT or GitHub Copilot; but we see a few using open source LLMs and Google's Gemini.
  • Context: Teams are creating text files to describe their data sources, requirements, practices, etc. It contains table names, column names, and column types. SQL examples of previous questions can also be useful.
  • Prompt: Non-technical users provide the high-level prompt / requirement, but data teams breakdown or enhance those prompts to make them technology aware and refine them to solve the problem at hand

Best Practices

The workflow today is mostly manual, copy-paste the context into the LLM to make it aware of our data context, craft the prompt, wait for the code to generate (usually SQL), copy-paste the code back to your data analytics platform to deploy a dashboard that answers this question to the stakeholder.

The current practices leave a lot of room for improvement, we can do better.

First, we need to stop copy-pasting context files around and start improving them together. One easy solution is to check-in the context file into GitHub and collaborate in GitHub directly.

Second, collect completion data, data is gold! This will be very useful when your team starts fine-tuning models. One easy way is to log the arguments and responses sent to your LLM API, but beware that it takes months to collect significant data so double-check your retention policies.

Third, you can consider prompt frameworks like DSPy to help your team write better, more reusable prompts.

Reimagining Collaboration

However, we can do so much better, by reimagining collaboration.

In the era of Generative AI, we can't afford to be the bottleneck of the organization, copy-pasting into LLMs and waiting for results to pop -- business users demand instant answers to compete in the market. Business users want to use Midjourney to produce reasonable designs in seconds, answer general questions instantly with ChatGPT, and produce data-driven insights themselves. The game has changed.

Innovative data teams know this, Retrieval Augmented Generation (RAG) is one of the top skills being developed, but that's just the start.

We believe the future of data teams is to provide conversational experiences connected to their private data (databases, data lakes, services, documents, etc) to democratize data and maintain their competitive edge.

The job of the data team is no longer to produce insights, but to build the machine that builds the machine.

At Hal9 we help data teams provide conversational data experiences to your business users, but there are other resources and initiatives worth considering as well like PrivateGPT, GPT4All, LocalGPT and the like.

Reach out to us at Hal9, request a demo, see you soon!




要查看或添加评论,请登录

Javier Luraschi的更多文章

  • Bridging the Business Data Gap with Generative Analytics

    Bridging the Business Data Gap with Generative Analytics

    In today's digital economy, data stands as the cornerstone of organizational success, driving decisions that optimize…

  • ChatGPT at Gartner Analytics Summit

    ChatGPT at Gartner Analytics Summit

    The Year of ChatGPT Before we get to ChatGPT, the opening keynote of the event made it clear that "Skills and staff…

    7 条评论
  • AI for Growing Businesses

    AI for Growing Businesses

    As an early-stage business, thinking of implementing Artificial Intelligence (AI) solutions like Tesla’s Full…

    1 条评论

社区洞察

其他会员也浏览了