Baidu's Chatbot Becomes 'ChatGPT + SearchGPT', Ant Group Introduces Personal Assistant, and MiniMax’s Video Generator Takes on Sora
Weekly China AI News from September 2, 2024 to September 8, 2024
Hi, this is Tony! Welcome to this week’s issue of Recode China AI, a newsletter for China’s trending AI news and papers.
Three things to know
(By the way, I was invited to write a guest post about Chinese AI startups for Michael Spencer’s AI Supremacy. I hope you enjoy it.)
Baidu Upgrades Chatbot App With New Search and Memory Features
What’s New: Baidu last week upgraded its mobile chatbot, ERNIE Bot App, now rebranded as Wenxiaoyan (文小言). Marketed as a “New Search” smart assistant, the chatbot combines multiple functions that let users search for anything from music to map navigation, chat with the camera on, have the app remember their preferences, and schedule daily news digests written by AI.
How It Works: The name “Wenxiaoyan” adds a playful twist to ERNIE Bot’s Chinese name, “Wenxin Yiyan” (文心一言). It also plays on the phrase “Ask Xiaoyan” (问小言), a nickname users had already coined for ERNIE Bot.
Wenxiaoyan introduces a range of new features that set it apart from other AI chatbots:
According to a Baidu executive, Wenxiaoyan has already surpassed ten million monthly active users, with 70% of its user base being young people.
Why It Matters: The launch of Wenxiaoyan reflects the rising interest in AI-driven search engines, which have the potential to reshape search behavior, with platforms like Perplexity and SearchGPT gaining traction. In response, traditional search giants like Google and Baidu are increasingly incorporating AI-generated answers into their search results. Google’s generative AI feature, AI Overviews, is reducing errors and expanding its user base, while Baidu recently reported that 18% of its search results now include an AI-generated summary. Wenxiaoyan represents Baidu’s most ambitious step yet in combining AI and search.
Ant Group Unveils an AI Assistant to Order Coffee and Pay Bills
What’s New: Ant Group launched a new AI mobile app, Zhixiaobao (支小宝), at the 2024 INCLUSION · Conference on the Bund in Shanghai. The app, marketed as an “AI life assistant,” integrates seamlessly with the company’s digital life platform, Alipay, which already hosts over 4 million mini-programs and 8,000 digital services. Zhixiaobao is now available for download on iOS and Android devices.
Why It Matters: For nearly two decades, Alipay has been a super app in China catering to all aspects of life, from paying bills and taxes to booking travel and even managing marriage certificates.
Now with Zhixiaobao, Ant Group is betting on AI to redefine people’s everyday experiences. Whether it’s ordering food, booking a ride, or finding local entertainment options, users can simply tell Zhixiaobao what they want, and it gets the job done. This is a major shift from users having to follow tedious step-by-step guides or click through endless mini-programs. However, when it comes to more complex tasks, such as purchasing a product from Taobao, the app falls short.
How It Works: The app has three sections: Moment, Chat, and Agents.
At the INCLUSION Conference, Ant Group also introduced three additional AI-driven products: an AI agent development platform, an AI healthcare manager, and Ant Bridge, an open platform leveraging AI models and financial to help insurance companies provide personalized customer responses in real time.
MiniMax Joins the Text-to-Video Race with New Model
What’s New: MiniMax, a Chinese AI startup, launched a new video generation model, abab-video-1, accessible via its chatbot Hailuo AI. Users can now try text-to-video generation on its website for free.
Although a latecomer to the trend that began with OpenAI’s Sora earlier this year, MiniMax is confident that its model stands out. CEO and Co-Founder Yan Junjie claims it “might be the best video generator in China.”
How It Works: MiniMax’s text-to-video tool is simple to use. Users enter a prompt, and the model generates a video within 5 minutes.
According to MiniMax, the model excels in high compression, text-to-video alignment, and diverse style generation. It can produce 6-second videos at a resolution of 1280x720 and 25 fps, with a look that approaches cinematic quality. The company attributes this to solving complex issues related to token compression and to optimizing the model’s training to handle high-dynamic content.
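To put those specs in perspective, here is a quick back-of-the-envelope calculation of how much raw pixel data a single clip represents. It is only a rough sketch: the 3 bytes per pixel (uncompressed 8-bit RGB) is my assumption for illustration, and it says nothing about how MiniMax actually encodes or tokenizes video.

```python
# Rough size of one clip at the specs MiniMax quotes (1280x720, 25 fps, 6 seconds).
WIDTH, HEIGHT = 1280, 720
FPS = 25
SECONDS = 6
BYTES_PER_PIXEL = 3  # assumption: uncompressed 8-bit RGB, not MiniMax's real format

frames = FPS * SECONDS                      # number of frames in one clip
pixels_per_frame = WIDTH * HEIGHT           # pixels in a single 720p frame
raw_bytes = frames * pixels_per_frame * BYTES_PER_PIXEL

print(f"{frames} frames, {pixels_per_frame:,} pixels per frame")
print(f"{raw_bytes / 1e6:.2f} MB of raw pixel data per clip")
```

That works out to 150 frames and roughly 415 MB of raw pixels for a 6-second clip, which helps explain why video models care so much about compressing frames into far fewer tokens before generation.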
The model uses the DiT architecture, similar to Sora, but the company has not disclosed further technical details. However, the CEO mentioned an innovative architecture known as Linear Attention.
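For readers curious about what "linear attention" generally refers to, below is a minimal, generic sketch of the idea as it appears in the research literature. MiniMax has not published its design, so nothing here should be read as the company's implementation. The function names and the elu(x) + 1 feature map are my own choices for illustration. The point is that replacing softmax with a feature map lets you reorder the matrix products, so the cost grows linearly with sequence length instead of quadratically.

```python
import math
import random

def feature_map(x):
    """elu(x) + 1: a positive feature map (an illustrative choice, not MiniMax's)."""
    return x + 1.0 if x > 0 else math.exp(x)

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def apply_map(m):
    return [[feature_map(x) for x in row] for row in m]

def quadratic_attention(q, k, v):
    """Kernel attention the expensive way: build the full N x N score matrix."""
    fq, fk = apply_map(q), apply_map(k)
    scores = matmul(fq, transpose(fk))          # N x N
    out = matmul(scores, v)                     # N x d_v
    return [[val / sum(s) for val in row] for row, s in zip(out, scores)]

def linear_attention(q, k, v):
    """Same result, reordered as phi(Q) (phi(K)^T V); no N x N matrix is formed."""
    fq, fk = apply_map(q), apply_map(k)
    kv = matmul(transpose(fk), v)               # d x d_v summary of keys and values
    k_sum = [sum(col) for col in zip(*fk)]      # length-d normalizer
    out = matmul(fq, kv)                        # N x d_v
    norms = [sum(a * b for a, b in zip(row, k_sum)) for row in fq]
    return [[val / z for val in row] for row, z in zip(out, norms)]

if __name__ == "__main__":
    random.seed(0)
    n, d, d_v = 8, 4, 3                         # toy sizes, chosen arbitrarily
    rand = lambda rows, cols: [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]
    q, k, v = rand(n, d), rand(n, d), rand(n, d_v)
    slow, fast = quadratic_attention(q, k, v), linear_attention(q, k, v)
    gap = max(abs(a - b) for r1, r2 in zip(slow, fast) for a, b in zip(r1, r2))
    print(f"largest difference between the two orderings: {gap:.2e}")
```

Both functions compute the same (non-causal) output; only the order of multiplication differs. For long sequences such as video tokens, avoiding the N x N score matrix is where the savings would come from.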
I tested Hailuo AI with a few prompts and reviewed some video clips generated and shared on X. My first impression is that the model has a great understanding of text prompts, regardless of their complexity, and accurately reflects them in the generated videos. Although it currently only produces 6-second videos, it manages to include most elements specified in the prompt. Another remarkable feature is the range of styles the model can generate, from cinematic effects to anime.
However, while Hailuo AI showcased some fascinating user-generated examples, I was unable to replicate their results using the same prompts. In fact, my generated videos were significantly inferior to their samples. I guess the free version may not be the most advanced model?
Weekly News Roundup
Trending Research