AI Researcher - What it is and why we built it!

Note: Next week we plan to release a Prompt Chain Builder to make it easier to build AI Apps. Please stay tuned! In the meantime, here I talk about our AI Researcher - what it is and why we built it!

What is the Meerkats AI Researcher, and why did we build it?

To leverage the power of GenAI for content creation, you have to feed it factual, detailed information - and that only comes from good research.

Without good research, all the downstream tasks fall apart. To write an insightful article, you need great research - probably the most time-consuming part of writing anything.

I found Perplexity's answers accurate but lacking in depth. AI should be able to figure out which related queries need to be researched to fetch detailed insights.

For example: if I want to research “Latest SEO algorithm leak by Google,” the AI should figure out sub-queries such as: what does it mean for SEO agencies, what are the key points to note, what should be picked up for better SEO, and so on.

Also, if I want answers from specific sites that I trust - e.g. Yahoo, TechCrunch, Bloomberg, etc. - it should be able to aggregate answers from those sites.

Once you have high-quality research, it's easier to repurpose it into an engaging blog or a LinkedIn post by adding a unique perspective - high-quality research is the first step.

Key Goals of the AI Researcher:

  1. Generated documents should be accurate
  2. Research should be fast
  3. Users should have a simple, easy-to-use interface to get the desired results.
  4. Finally, it should be scalable

Usability goals:

  1. While the report is being generated, keep the user updated on its progress, including the sites being scanned for the answer.
  2. The generated document should be in Markdown format and easily editable in a Notion-like interface - e.g. expand text, shorten text, or rewrite a section using AI itself.
  3. Integrations to publish the article on LinkedIn or a blog easily.

Let's dive into the tech stack and structure

The key components are the Google SERP API, a web scraper, LangChain agents, and OpenAI. Everything was written in Node.js, as the core backend is in Node.js.

At a high level, the process can be broken down into 3 steps:

  1. Find relevant sources of information using Google Search
  2. Scrape the content in the links
  3. Assemble all the information into the report using OpenAI

Google Search API

We used the Google Search API to perform multiple searches, aggregated the URLs that Google returned, and marked them for scraping.
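As a minimal sketch of this step - assuming SerpAPI as the SERP provider (the post doesn't name one) and Node 18+ for the built-in fetch - gathering and de-duplicating URLs across the main query and its sub-queries could look like this:

```javascript
// Hypothetical sketch: search Google via SerpAPI and collect result URLs.
async function searchGoogle(query, apiKey) {
  const params = new URLSearchParams({ engine: "google", q: query, api_key: apiKey });
  const res = await fetch(`https://serpapi.com/search.json?${params}`);
  if (!res.ok) throw new Error(`SERP request failed: ${res.status}`);
  const data = await res.json();
  // Collect the organic result URLs and mark them for scraping.
  return (data.organic_results ?? []).map((r) => r.link);
}

// Aggregate URLs across the main query and the AI-generated sub-queries,
// de-duplicating before scraping.
async function collectUrls(queries, apiKey) {
  const results = await Promise.all(queries.map((q) => searchGoogle(q, apiKey)));
  return [...new Set(results.flat())];
}
```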

Cheerio

We used Cheerio to scrape the documents, stripping out all the unnecessary information. We eliminated JavaScript and CSS, focused just on the HTML tags, and returned the result in Markdown format.
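Here's a minimal sketch of that cleanup step. The post doesn't say how the HTML-to-Markdown conversion was done; the turndown library is assumed here for illustration:

```javascript
import * as cheerio from "cheerio";
import TurndownService from "turndown";

function htmlToMarkdown(html) {
  const $ = cheerio.load(html);
  // Strip JavaScript, CSS, and other non-content tags so only the
  // meaningful HTML structure remains.
  $("script, style, noscript, iframe, svg").remove();
  const body = $("body").html() ?? "";
  // Convert the cleaned HTML into LLM-ready Markdown.
  return new TurndownService().turndown(body);
}
```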

LangChain to find relevant documents

The LLM-ready Markdown data from the scrapes was converted into embeddings and finally sent to OpenAI for assembling the report. A key step was identifying the most relevant documents and sections from the scraped URLs, as many were not relevant to answering the query. We used the LLMContextRetriever to find the most relevant documents.
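A rough sketch of this step using LangChain's Node.js packages follows; the LLMContextRetriever mentioned above is approximated here with a plain vector-store similarity retriever (import paths vary by LangChain version), so treat this as an illustration rather than the exact implementation:

```javascript
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

// Split the scraped Markdown into chunks, embed them, and retrieve only
// the chunks relevant to the query - many scraped pages won't be relevant.
async function findRelevantChunks(markdownPages, query) {
  const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000, chunkOverlap: 100 });
  const docs = await splitter.createDocuments(markdownPages);
  const store = await MemoryVectorStore.fromDocuments(docs, new OpenAIEmbeddings());
  // Keep only the top-k most similar sections for report assembly.
  const retriever = store.asRetriever(8);
  return retriever.getRelevantDocuments(query);
}
```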

OpenAI

We used function calling to call tools, aggregate all the information and generate a nice detailed report.
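A minimal sketch of what this looks like with the official OpenAI Node SDK; the web_research tool name, schema, and model choice are illustrative assumptions, and a real loop would execute each tool call and feed the results back before the final report is written:

```javascript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Hypothetical tool definition the model can call to gather information.
const tools = [
  {
    type: "function",
    function: {
      name: "web_research",
      description: "Search Google and scrape the resulting pages for a sub-query",
      parameters: {
        type: "object",
        properties: { query: { type: "string" } },
        required: ["query"],
      },
    },
  },
];

async function generateReport(topic) {
  const response = await openai.chat.completions.create({
    model: "gpt-4o-mini", // assumption: any function-calling-capable model works
    messages: [
      {
        role: "system",
        content:
          "You are a researcher. Use the tools to gather facts, then write a detailed report in Markdown.",
      },
      { role: "user", content: topic },
    ],
    tools,
  });
  // The message either contains tool calls to execute or the final report.
  return response.choices[0].message;
}
```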

Challenges we ran into

In the initial versions, the reports came with significant hallucinations.

The key challenges were related to getting the scraped pages ready for the LLM by stripping out the JavaScript tags and outputting Markdown.

Another challenge we had to address was making the queries run in parallel so that the time the user has to wait for the report is minimized. Timeouts and retries were causing long delays in report generation, and we spent quite a bit of time filtering out certain document types so that we didn't waste time extracting information that was slow to extract.
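As an illustration of the parallelization pattern (not our exact code), each scrape can be raced against a hard timeout and retried once, with stragglers dropped rather than allowed to block the whole report:

```javascript
// Race a promise against a timeout so one slow site can't stall the report.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Run all scrapes concurrently with a bounded retry, dropping failures.
async function scrapeAll(urls, scrapeOne, { timeoutMs = 10000, retries = 1 } = {}) {
  const attempt = async (url, left) => {
    try {
      return await withTimeout(scrapeOne(url), timeoutMs);
    } catch {
      if (left > 0) return attempt(url, left - 1);
      return null; // drop slow or unscrapable pages instead of blocking
    }
  };
  const results = await Promise.all(urls.map((u) => attempt(u, retries)));
  return results.filter(Boolean);
}
```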

Side note: managing JavaScript worker threads was also causing timeouts and inconsistencies that were time-consuming to debug.

We tried using Beautiful Soup as a scraper, but Cheerio turned out to be easier and more accurate at extracting Markdown data. Hence, we decided to implement the entire scraping module in Cheerio.

Future Plans

A key feature we are planning to add is the ability to provide a Table of Contents to the researcher so it can research individual subtopics and create a detailed report of up to 5K words that can be used for creating high-quality blogs, newsletters, etc. For this we're using LangChain and increasing the number of AI agents to handle the complexity.

We are also working on providing an easy way to include your POV and customize the research content into a high-quality, SEO-ready article.

I am attaching the link to the tool here: https://apps.meerkats.ai/playground/crux-ai

Please do try out the Crux AI Researcher and let me know your thoughts. Next week we'll talk about our Prompt Chain Builder!

Dave Holmes-Kinsella

Builder | Analytics & Data Leader: Strategy, Architecture, Build, Launch | From pre-A to post-IPO | 2 Exits | Former Synctera, Facebook

8 months ago

I love the idea of the table of contents. Much as in an academic paper, it would be great to see attributions and sources.
