Data is Dead?

The data defence is broken. Data accumulation was once considered a moat that slowed competitors from catching up, but the development of LLMs has diminished the value of data hoarding. For instance, OpenAI, which took years to build ChatGPT, now faces a competitor in Google’s Bard, which was reportedly trained on ChatGPT’s conversational data. Odd, right?

Similarly, Alpaca was built by fine-tuning Meta’s LLaMA on data generated with OpenAI’s text-davinci-003. Vicuna, another LLaMA-based model with 13 billion parameters, was likewise fine-tuned from a LLaMA base using conversations gathered from ShareGPT.
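For readers curious about the mechanics, here is a minimal sketch of that instruction fine-tuning recipe in Python, assuming a LLaMA-style base model on the Hugging Face Hub. The model id, toy data and hyperparameters are illustrative, not the exact setup Alpaca or Vicuna used; a single toy pair stands in for Alpaca’s 52K distilled instruction-response pairs.

```python
# Sketch: supervised fine-tuning of a causal LM on distilled instruction data.
# Assumptions: model id, toy dataset and hyperparameters are illustrative.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base_model = "huggyllama/llama-7b"  # assumption: any causal LM hub id works here
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Toy stand-in for instruction-response pairs distilled from a stronger model.
pairs = [
    {"instruction": "Explain what a data moat is.",
     "response": "A data moat is a competitive advantage built on proprietary data."},
]

def to_features(example):
    # Alpaca-style prompt template wrapping instruction and response.
    text = (f"### Instruction:\n{example['instruction']}\n\n"
            f"### Response:\n{example['response']}")
    return tokenizer(text, truncation=True, max_length=512)

dataset = Dataset.from_list(pairs).map(
    to_features, remove_columns=["instruction", "response"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="alpaca-style-ft",
                           per_device_train_batch_size=1,
                           num_train_epochs=3),
    train_dataset=dataset,
    # mlm=False makes the collator build next-token-prediction labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```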

While OpenAI spent years and millions of dollars training the GPT models behind GPT-3.5 and GPT-4, newer platforms are using OpenAI’s data to train their own models.

Andrew Ng, co-founder and former head of Google Brain, questioned the defensibility of the data moat. He observed that the first-mover advantage means little for LLMs, as developers and researchers who want to build specialised or GPT-anything applications can skip several tedious and costly steps in the process – there’s no need to collect or label datasets. So why spend resources building a model trained on a large, labelled dataset of one’s own? And what is the real significance of a data moat now?

However, all this has further deepened the debate around the ethics of using others’ data to build models. While it may seem logical for some to cut costs by building on pre-existing foundation models, it’s comparable to serving a recipe you never created. In other words, it’s like stealing intellectual property instead of putting in the effort to develop your own.

Read the full story here.


TCS Witnesses Slump

Tata Consultancy Services (TCS) yesterday announced its Q4 results, with revenue growth of 10.7%. Like its peers, the company saw growth slow, down 3.6% compared with the same quarter of the previous year (Q4 2022). TCS has expressed concerns over the future of its banking, financial services and insurance segment in the North American market, citing clients’ urgent need to conserve cash. It also acknowledged that the Silicon Valley Bank collapse has hurt client sentiment in America, affecting the business of Indian IT companies.

Read the full story here.


Vicuna vs Alpaca

Large language models (LLMs) are now taking different forms. Once considered expensive and difficult to train, they can now be fine-tuned at very low cost. Vicuna and Alpaca are both built on LLaMA using OpenAI’s data. Fine-tuning Vicuna’s 7B and 13B variants cost $140 and $300, respectively, while Alpaca’s 7B model required $500 for data and $100 for training. In a GPT-4-judged evaluation, Alpaca scored 7/10 and Vicuna-13B a 10/10 on ‘writing’.
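The ‘writing’ scores above come from using GPT-4 itself as the judge. Below is a minimal sketch of that evaluation pattern, assuming access to the OpenAI API (openai>=1.0); the prompt wording and the 1–10 scale are illustrative, not the exact rubric Vicuna’s authors used.

```python
# Sketch: "GPT-4 as judge" pairwise evaluation of two model answers.
# Assumptions: prompt wording and scoring scale are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def judge(question: str, answer_a: str, answer_b: str) -> str:
    """Ask GPT-4 to score two candidate answers for writing quality."""
    prompt = (
        f"Question: {question}\n\n"
        f"Assistant A: {answer_a}\n\n"
        f"Assistant B: {answer_b}\n\n"
        "Rate each answer's writing quality on a scale of 1-10 "
        "and briefly justify the scores."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(judge("Summarise the plot of Hamlet in two sentences.",
            "Hamlet is a Danish play about a prince.",
            "Prince Hamlet feigns madness while plotting revenge on his uncle."))
```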

Read the full story here.


India Builds Suite of GPTs

While the world rushes to build one super-effective LLM-based platform after another, Indian companies are making attempts of their own. The past few months have seen many home-grown GPTs, including BharatGPT, Chatsonic and GitaGPT. Besides, established players in the chatbot segment like Gupshup, Haptik and Verloop.ai have also integrated the latest technology to remain relevant in a fast-changing market.

Read the full story here.
