A sad story of tf-idf
Vinay Mehendi, PhD
Technographics | Competitive Intelligence | Creating the world's most data-centric sales pipelines
The tf-idf approach/model/formula is the simplest of the natural language processing models available. It is text mining 101. This model helps you identify the most important words in a book, or in a data lake of text data. Some might say that this model is the foundation of natural language processing. We used it for many projects back when "unstructured data" was still a buzzword.
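For readers who have never seen it written down, here is a minimal sketch of tf-idf in plain Python. It assumes one common variant (term frequency normalized by document length, multiplied by log(N / document frequency)); other variants exist, and the function name `tf_idf`, the whitespace tokenizer, and the toy documents are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document (illustrative variant)."""
    # Naive tokenizer: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            # tf = share of the document taken by the term,
            # idf = log(N / df), which is 0 for terms found in every document.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Toy corpus, invented for this example.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang a song",
]
scores = tf_idf(docs)
```

A word such as "the" appears in every document, so its score is zero, while a word found in only one document ("mat") scores higher than a word shared by two ("cat"). That is the whole idea: frequent in this text, rare elsewhere.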
Things have changed since then. The term tf-idf has died. The model is still available, just like any other piece of code, but nobody talks about it.
The award for the most famous word of the year goes to one of these:
To be honest, I had not heard of these words before ChatGPT.
Hence, we embarked on a mission to see the trends of tf-idf and LLMs in job descriptions on career websites and individual websites. We analyzed more than 3 million jobs from the last 6 months.
We picked technology keywords (and not technologies) to understand the market behavior.
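A keyword trend of this kind can be sketched roughly as below. This is not our production pipeline; the function name `keyword_share_by_month`, the month labels, and the sample job texts are hypothetical, and the matching rule (case-insensitive whole-word match with an optional plural "s") is an assumption made for the example.

```python
import re
from collections import defaultdict


def keyword_share_by_month(jobs, keywords):
    """jobs: iterable of (month, job_text). Returns {month: {keyword: share}}."""
    # Whole-word, case-insensitive match; the optional "s" catches simple plurals.
    patterns = {
        kw: re.compile(r"\b" + re.escape(kw) + r"s?\b", re.IGNORECASE)
        for kw in keywords
    }
    totals = defaultdict(int)
    hits = defaultdict(lambda: defaultdict(int))

    for month, text in jobs:
        totals[month] += 1
        for kw, pattern in patterns.items():
            # Count each job at most once per keyword.
            if pattern.search(text):
                hits[month][kw] += 1

    return {
        month: {kw: hits[month][kw] / totals[month] for kw in keywords}
        for month in sorted(totals)
    }


# Invented sample data, for illustration only.
jobs = [
    ("2023-01", "Experience with tf-idf and text mining"),
    ("2023-01", "Java backend developer"),
    ("2023-06", "Build LLM applications"),
    ("2023-06", "Fine-tune large language models and LLMs"),
]
keywords = ["tf-idf", "LLM", "large language model"]
shares = keyword_share_by_month(jobs, keywords)
```

Reporting the share of jobs per month, rather than raw counts, keeps the trend comparable even when the number of crawled jobs changes from month to month.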
Observation # 1
The number of AI skills required by companies has gone up significantly in the last 6 months.
Observation # 2
LLM and Large language models have become the talk of the town. Companies are investing in these skills.
LLM is the word of the year.
What's next at AI Pulse?
We will cover many more AI topics in the coming days, including open source and GPU consumption.
ASK
Mission 200 Million Domains
You might be aware that we are on a mission to increase the company count to 200 million. We added 30K companies today. That takes us to 25.5 million companies now.
We will bring another topic for you tomorrow.
Until then,
Be safe, sleep well.
-Vinay
Senior Consultant, AI Transformation Consulting | SIBM Pune '23
2 months ago
Nostalgic moment! Great memory of using the Tf-Idf model during my stint at Oceanfrog. Haven't heard anyone mention tf-idf in any conversation around AI. Srishti.