GenAI Weekly — Edition 7
Your Weekly Dose of Gen AI: News, Trends, and Breakthroughs
Stay at the forefront of the Gen AI revolution with Gen AI Weekly! Each week, we curate the most noteworthy news, insights, and breakthroughs in the field, equipping you with the knowledge you need to stay ahead of the curve.
Cohere launches Command R+, a powerful enterprise LLM that beats GPT-4 Turbo
Cohere, a leading provider of enterprise-grade AI solutions, today announced the launch of Command R+, its most advanced and scalable large language model (LLM) designed specifically for real-world business applications. The new model builds upon the strengths of its predecessor, Command R, while offering enhanced performance, multilingual support and advanced retrieval augmented generation (RAG) capabilities.
Command R+ is optimized for enterprise use cases, providing best-in-class RAG with citations to reduce inaccuracies, multilingual coverage in 10 key business languages, and a powerful Tool Use API for automating complex workflows. The model outperforms similar offerings in RAG, multilingual capabilities, and tool use, while maintaining Cohere’s commitment to data privacy and security.
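The tool-use pattern mentioned above generally follows a simple loop: the model proposes a structured tool call, the application executes it, and the result flows back into the final answer. Here is a minimal sketch of that loop — the model and the business tool are stubbed stand-ins for illustration, not Cohere's actual API:

```python
# Minimal tool-use loop. The "model" here is a stub standing in for an
# LLM that picks a registered tool; real APIs (Cohere, OpenAI, etc.)
# return structured tool calls in a broadly similar shape.

def lookup_order_status(order_id: str) -> str:
    """Hypothetical business tool the model can invoke."""
    return {"A-1001": "shipped", "A-1002": "processing"}.get(order_id, "unknown")

TOOLS = {"lookup_order_status": lookup_order_status}

def stub_model(message: str) -> dict:
    """Stand-in for the LLM: emits a tool call for order questions."""
    if "order" in message.lower():
        return {"tool": "lookup_order_status", "args": {"order_id": "A-1001"}}
    return {"answer": "I can only help with order questions."}

def run_agent(message: str) -> str:
    step = stub_model(message)
    if "tool" in step:
        result = TOOLS[step["tool"]](**step["args"])
        # In a real system the tool result is fed back to the model for a
        # grounded final answer; here we format it directly.
        return f"Order status: {result}"
    return step["answer"]

print(run_agent("Where is my order?"))  # Order status: shipped
```

The point of the pattern is that the application, not the model, executes the tool — which is what makes workflow automation auditable.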
Reducing inaccuracies, multilingual support and an API for complex workflows: businesses need this and more. Moving from general-purpose LLMs to models tuned for business use cases should make them easier to sell into enterprises, given how consumer-oriented LLMs have been so far.
The 18 most interesting startups from YC’s Demo Day show we’re in an AI bubble
Springtime means rain, the return of flowers and, of course, Y Combinator’s first demo day of the year. During the well-known accelerator’s first of two pitch days from the Winter 2024 cohort, a covey of TechCrunch staff tuned in, took notes, traded jokes and slowly whittled away at the dozens of presenting companies to come up with a list of early favorites.
AI was, not shockingly, the biggest theme, with 86 out of 247 companies calling themselves an AI startup, but we’re reaching bubble territory given that 187 mention AI in their pitches.
From AI-generated music and grant applications to neat new fintech applications and even some health tech work, there was something for everyone. We’re back at it Thursday for the second day of pitches. Until then, if you didn’t get to watch live, here’s a rundown of some of the best from day one.
A bubble can only be known in hindsight. Until then, let’s get the popcorn out.
In the age of increasing context sizes, do vector databases have a future?
In the past few months it feels like two schools of thought have emerged in the online discourse: gazillion-token context windows will fix everything and make language models more accurate and efficient; and retrieval augmented generation (or RAG) will fix everything and make language models more accurate and efficient.
There are merits on both sides, and the reality as usual is probably somewhere in the middle. But the RAG case—the path of least resistance for most enterprises, for a bunch of reasons we’ll get into in a bit—requires making all the information enterprises want available to language models in a different format. Specifically, that data, like documents or Slack messages, has to be converted to a unified vector format with an embedding model and stored in a convenient place for retrieval.
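The pipeline just described — embed documents, store the vectors, retrieve the nearest ones at query time — can be sketched in a few lines. The hashing "embedding" below is a deliberately toy stand-in for a real embedding model, and the in-memory list stands in for a real vector database; both are here only to make the retrieval mechanics concrete:

```python
import hashlib
import math

def toy_embed(text: str, dim: int = 256) -> list[float]:
    """Toy bag-of-words hashing embedding. A real system would call an
    embedding model here instead; this just makes the demo self-contained."""
    vec = [0.0] * dim
    for word in text.lower().split():
        word = word.strip(".,?!")
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))  # vectors are pre-normalized

# The "vector database": just a list of (text, vector) pairs here.
docs = [
    "Quarterly revenue grew 12 percent year over year",
    "The deploy pipeline runs tests before shipping to production",
    "Vacation requests go through the HR portal",
]
index = [(d, toy_embed(d)) for d in docs]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents nearest to the query in embedding space."""
    qv = toy_embed(query)
    ranked = sorted(index, key=lambda p: cosine(qv, p[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

print(retrieve("where do vacation requests go"))
```

The retrieved snippets would then be pasted into the prompt, which is the "G" in RAG — and it is exactly this index-and-retrieve step that dedicated vector databases industrialize.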
That’s led to the growing importance of vector databases. And the growing need for the format to feed info-hungry prompts has blossomed into a whole ecosystem of both startups and larger companies bolting vector search onto their key products. And it’s led to one of the most competitive and fascinating races in the story arc of AI.
“While we do expect the length of context windows to continue to increase, that won’t nullify the need for RAG,” Brittany Walker, general partner at CRV, told me. “We believe RAG and long-context windows complement each other rather than compete with each other. RAG is efficient and performant, and retrieval helps the LLM focus on the right information. Long-context windows enable the LLM to process more context for one particular query.”
128k and even 1M token context windows did make a lot of folks worry about the future of the vector database. But it looks like all is good.
Opera browser adds support for local LLMs
Browser innovator Opera announced today that it’s adding experimental support for 150 local LLM (Large Language Model) variants from approximately 50 families of models to its Opera One browser in the developer stream. This step marks the first time local LLMs can be easily accessed and managed from a major browser through a built-in feature.
The local AI models are a complementary addition to Opera’s online Aria AI service. Among the supported local LLMs are:
Not sure how useful this is in practice. Also, with several hosted LLMs already free to use, whether users will opt for local models that are a fraction of the size and capability purely out of privacy concerns is something we’ll have to wait and see.
Can GPT Optimize My Taxes?
TL;DR We made a GPT interface to an open-source US tax scenarios library, and it is at times pretty good, but asks a lot of the user.
In 2024, if we let the LLM be the UX, then building the app becomes less work, and different work. In some sense, the user just brings their own scenario, and the app doesn’t need to anticipate it. These were the steps:
If you have a paid ChatGPT+ account you can try out Tax Driver right here. The privacy policy is linked from the GPT, and also here; the upshot is that the backing web service is set up as a pure calculator, and logs only which endpoint was called, and when it was called by OpenAI’s servers. Note that independently of ChatGPT+, you can also play with tenforty directly using the included Colab notebook.
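The "pure calculator" backend described here — stateless endpoints that never persist user inputs, logging only which endpoint was called and when — is a nice privacy pattern for GPT Actions. A sketch of the idea follows; the endpoint name, handler, and tax brackets are all illustrative stand-ins, not tenforty's actual API or real US tax tables:

```python
import time

def log_call(endpoint: str) -> None:
    """Log only *which* endpoint was called and *when* --
    never the tax inputs themselves."""
    print(f"{time.strftime('%Y-%m-%dT%H:%M:%S')} called {endpoint}")

def estimate_tax(taxable_income: float) -> float:
    """Pure function: output depends only on its input, nothing is stored.
    Bracket cutoffs and rates below are MADE UP for illustration."""
    brackets = [(0, 0.10), (11_000, 0.12), (44_725, 0.22)]
    tax = 0.0
    for i, (cutoff, rate) in enumerate(brackets):
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if taxable_income > cutoff:
            tax += (min(taxable_income, upper) - cutoff) * rate
    return round(tax, 2)

def handle_request(payload: dict) -> dict:
    """Stand-in for the web endpoint a GPT Action would call."""
    log_call("/estimate")  # endpoint + timestamp only; payload is not logged
    return {"estimated_tax": estimate_tax(payload["taxable_income"])}

print(handle_request({"taxable_income": 50_000}))
```

Because the handler is a pure calculator, there is nothing sensitive to leak even if the logs are compromised — the LLM holds the user's scenario, and the service only does arithmetic.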
General purpose agents are not yet here, I guess.
For the extra curious