ChatGPT with your own data
Have you ever considered building a chatbot like ChatGPT, but on your own data?
Let's use FlowiseAI to build a chatbot LLM (large language model) app.
FlowiseAI is an open-source platform, powered by the open-source framework LangChain, for developing LLM applications.
For this exercise, I have chosen MicroStrategy's (my favorite company) annual report for 2022 in PDF format.
To start, you will need two API keys: one for ChatGPT (OpenAI) and one for a vector database, for example Pinecone, although any other vector database will do (see the short description of vector databases and vector search below).
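FlowiseAI asks for both keys in its credential dialogs, but it helps to verify them up front. Below is a minimal sketch of such a check, assuming the official `openai` and `pinecone` Python client libraries and that the keys are stored in the OPENAI_API_KEY and PINECONE_API_KEY environment variables.

```python
import os

from openai import OpenAI        # pip install openai
from pinecone import Pinecone    # pip install pinecone

# FlowiseAI collects these keys in its credential dialogs; here they are read
# from environment variables purely to confirm that both keys are valid.
openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pinecone_client = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Quick sanity checks: list a few available models and the existing Pinecone indexes.
print([m.id for m in openai_client.models.list().data][:3])
print(pinecone_client.list_indexes().names())
```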
To install FlowiseAI, refer to its documentation pages.
After a successful installation of FlowiseAI, open http://localhost:3000 in your browser and you are ready to go.
To jumpstart your first LLM app, use one of the chatflow templates provided in the Marketplace (left-side menu).
To build the chatbot, I suggest using the “Conversational Retrieval QA Chain” template.
I replaced the text-file document loader with the PDF document loader component, because when I queried the chatbot built with the text document loader it could not return structured data, e.g., a table. A rough sketch of what this chatflow does under the hood follows below.
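Conceptually, the chatflow maps onto a LangChain pipeline: load the PDF, split it into chunks, embed the chunks into Pinecone, then wire a retriever and a chat model into a conversational retrieval chain. The sketch below is illustrative only; it assumes the Python LangChain packages (langchain, langchain-openai, langchain-community, langchain-pinecone, pypdf), the file name, index name, and question are placeholders, and exact import paths vary between LangChain versions.

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_pinecone import PineconeVectorStore
from langchain.chains import ConversationalRetrievalChain

# Expects OPENAI_API_KEY and PINECONE_API_KEY in the environment,
# plus an existing Pinecone index (hypothetically named "annual-report").

# 1. Load the annual report with a PDF document loader (not a plain-text loader).
pages = PyPDFLoader("microstrategy-annual-report-2022.pdf").load()

# 2. Split the pages into overlapping chunks so each embedding stays focused.
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(pages)

# 3. Embed the chunks and store them in the Pinecone index.
vectorstore = PineconeVectorStore.from_documents(chunks, OpenAIEmbeddings(), index_name="annual-report")

# 4. Combine the retriever and a chat model into a conversational retrieval QA chain.
chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
    retriever=vectorstore.as_retriever(),
)

# 5. Ask a question; chat_history carries previous turns for follow-up questions.
result = chain.invoke({"question": "How many employees did the company have in 2022?", "chat_history": []})
print(result["answer"])
```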
When I used the PDF file, I could prompt the chatbot with the following and get the result in table format:
“Please summarize the employee headcount for each year, grouped by department.”
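Once the chatflow is saved, Flowise also exposes it over a REST prediction endpoint, so the same prompt can be sent programmatically. Here is a minimal sketch, assuming Flowise is running locally on port 3000 and using the Python `requests` library; the chatflow ID is a placeholder you copy from the Flowise UI.

```python
import requests

# The chatflow ID below is a placeholder; copy the real one from the Flowise UI.
API_URL = "http://localhost:3000/api/v1/prediction/<your-chatflow-id>"

def ask(question: str) -> dict:
    """Send one question to the Flowise chatflow and return its JSON response."""
    response = requests.post(API_URL, json={"question": question})
    response.raise_for_status()
    return response.json()

print(ask("Please summarize the employee headcount for each year, grouped by department."))
```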
Here are some prompts used for querying the chatbot:
********************************************************
For further reading and comprehensive information on LangChain, a framework for building applications with large language models (LLMs), you can explore the following resources:
1. LangChain Official Website: The main site for LangChain, offering a complete overview of the framework, including its capabilities and use-cases. It provides links to documentation, blogs, and various LangChain products like LangSmith, Retrieval, and Agents.
2. LangChain Blog: This blog offers insights and updates on LangChain, discussing various aspects of its implementation and use in different projects.
3. GitHub - LangChain: For those interested in the technical and developmental side of LangChain, the GitHub repository is an invaluable resource. It houses the framework's codebase, issue tracking, and collaborative tools for developers.
4. AWS - What is LangChain?: Amazon Web Services provides an explanation of LangChain, highlighting its purpose, components, and applications in the context of LLMs.
5. Nanonets - LangChain Guide & Tutorial: Nanonets offers a comprehensive guide and tutorial on LangChain, which can be particularly helpful for those looking to understand how to effectively use this framework for developing intelligent applications.
These resources collectively offer a broad and detailed view of LangChain, from its basic principles to more complex applications and community contributions. Whether you're a developer looking to implement LangChain in your projects, or simply interested in learning more about this framework, these sites provide valuable information and insights.
Description of vector databases and vector search:
A vector database is a type of database designed specifically to handle the vector embeddings typically used in machine learning and similar applications. These databases are optimized for storing and rapidly retrieving vectors, which represent complex data points like text, images, or sounds in a format that machines can understand. Vector databases facilitate efficient similarity searches, allowing quick and accurate retrieval of items based on their content rather than just metadata or keyword matches. They are crucial in powering applications that require fast and semantically accurate search capabilities, like recommendation systems or semantic text search.
Vector search refers to the method of searching through a database or collection of data by converting the query and the documents into vectors in a high-dimensional space. It calculates the similarity between the query vector and the document vectors, often using measures like cosine similarity. This approach is particularly effective for tasks like semantic search, where the goal is to find the most relevant items based on the meaning of the query, rather than exact keyword matches. It's widely used in information retrieval, natural language processing, and similar applications.
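To make the cosine-similarity idea concrete, here is a tiny, self-contained illustration in Python with NumPy. The three-dimensional "embeddings" are invented for the example; a real system would get vectors with hundreds or thousands of dimensions from an embedding model.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction, ~0.0 = unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy document embeddings (invented for illustration only).
documents = {
    "employee headcount by department": np.array([0.9, 0.1, 0.3]),
    "quarterly revenue figures":        np.array([0.2, 0.8, 0.5]),
    "office locations worldwide":       np.array([0.1, 0.3, 0.9]),
}

# Toy embedding of the query "how many employees work in each department?"
query = np.array([0.85, 0.15, 0.35])

# Rank documents by similarity to the query, most similar first.
for name, vec in sorted(documents.items(), key=lambda kv: cosine_similarity(query, kv[1]), reverse=True):
    print(f"{name}: {cosine_similarity(query, vec):.3f}")
```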
WARNING:
It is not safe to post sensitive or confidential content on ChatGPT or any other public conversational AI. OpenAI, the organization behind ChatGPT, advises users against sharing any sensitive, confidential, or personally identifiable information while interacting with the model. The data shared can potentially be used for model training purposes and, while there are safeguards in place, absolute privacy and confidentiality cannot be guaranteed. Always refrain from sharing anything that you wouldn't want to be public or that could lead to harm if disclosed. If you have sensitive content to work with, consider using localized or private instances of AI models, and always consult with the respective privacy policies and terms of service.