Hello Dolly: Databricks Open Sources ChatGPT Alternative!

There isn’t a dull moment at Databricks. From the creators of Apache Spark, Delta, MLflow and the pioneers of Lakehouse - “Dolly” is here.

ChatGPT took us all by pleasant surprise. We know what Data and AI can do, and we are in a generational transition in the adoption of AI and the digital transformation of our enterprises and our lives.

Why is this a big deal?

  • We’ve cracked the code on what is “minimally” required to model human-like interaction. You don’t need millions of machines and petabytes of data.

We show that anyone can take a dated off-the-shelf open source large language model (LLM) and give it magical ChatGPT-like instruction-following ability by training it in 30 minutes on one machine, using high-quality training data. Surprisingly, instruction-following does not seem to require the latest or largest models: our model has only 6 billion parameters, compared to 175 billion for GPT-3. We open source the code for our model (Dolly) and show how it can be re-created on Databricks.

  • Dolly is open source. This is important: democratizing Data and AI is in our DNA. There is no secret sauce, and now anyone can build their use cases on it with transparency into the training dataset and cost.

In contrast, ChatGPT, although impressive, is a general-purpose model that is expensive and proprietary.

ChatGPT, a proprietary instruction-following model, was released in November 2022 and took the world by storm. The model was trained on trillions of words from the web, requiring massive numbers of GPUs to develop. This quickly led to Google and other companies releasing their own proprietary instruction-following models.

Watch our CEO Ali Ghodsi talk about it on Bloomberg
