Video Editing with Jianying AI (Capcut China) - Build with AI (3)
Kitty (Sijia) Shen
Product Manager @ NHS | Strategic Marketing @ Imperial | MCIT (CS) @ Upenn | Ex-Apple, THG | UX, AI, customer-centric
Project Overview
Project Background
I often describe myself as a product manager by day and a content creator by night. Over the last few years, I've dedicated much of my free time to making videos, and I'm proud to say I've gained over 50,000 subscribers on platforms like RED and Bilibili. Creating a video involves a lot of steps - from brainstorming ideas to researching, writing scripts, filming, and editing, not to mention optimizing for SEO. Out of all these, I've found video editing to take up the most time. After getting the hang of the basic editing techniques, it started to feel like I was just doing the same thing over and over again.
That's when I noticed something exciting: the video editing tool I use, Jianying (known as CapCut outside of Mainland China and created by TikTok's parent company, ByteDance), started adding new AI features. So, I decided to dive in and use these AI tools for my next video, aiming to figure out a few things:
With this project, I wasn't just looking to make my video editing easier. I also wanted to understand how AI can change the game from a product management perspective and learn more about the AI concepts behind these new features.
Through this journey, I'm looking to share:
Let's explore together how these AI advancements are reshaping the way we create videos, and what that means for content creators and product managers alike.
User Interface of Jianying
How I used Jianying
I tried four AI features in Jianying:
Feature 1: AI-generated caption effect
Jianying offers an AI-generated caption effect that lets users bring text to life simply by describing the effect they want.
Users are presented with a default example, "melting chocolate," to illustrate what the feature can do. Seeing a working prompt makes the feature easier to understand and try, which should boost its activation rate.
The feature is currently free. This is likely designed to increase usage and familiarize users with its benefits, setting the stage for a potential premium offering later.
Jianying enhances the user experience further by offering an "inspiration library." This resource is filled with examples of caption effects, neatly organized into categories like trendy, vlog, tech, fashion, food, nature, and festival. This categorization helps users easily find inspiration and apply the most suitable effects to their videos, tailored to their content's theme.
Jianying enables users to tweak the AI-generated caption effects by altering the font style. However, the options are currently limited to 9 fonts, significantly fewer than the general selection offered within the app.
I experimented by creating 2 new caption effects on my own, and the results were very satisfying.
Similar to Midjourney, Jianying users can view the complete timeline of all the generated effects, allowing them to easily revert to previous versions if desired.
After generating an effect, Jianying users have the option to:
The feedback mechanism is specifically designed to report negative content, such as illegal, politically sensitive, biased, fake, or sexual information. This feature aligns with the principles of responsible AI, ensuring that the platform remains a safe and respectful environment for all users.
Feature review:
While the AI-generated caption effect feature in Jianying is undoubtedly well-designed and fun to explore, its practicality for me is somewhat limited. My usage of caption effects has been minimal, indicating I might not be the primary audience for this innovation. Observing current trends, it appears that creators often opt for simplicity over elaborate customizations for captions, focusing their efforts elsewhere in their content. The most applicable scenario for such advanced effects seems to be in advertising, where emphasizing a product's features, like comparing a sheet's softness to a cloud, could benefit from a corresponding visual effect. However, it's worth noting that Jianying, much like its international counterpart Capcut, isn't primarily aimed at professional video producers, which may limit the utility of these advanced effects in commercial settings.
Feature 2: AI-generated stickers
Expanding on the theme of creative AI tools, Jianying now allows users to craft their own stickers through a simple prompting system. This approach mirrors the AI-generated caption effect, offering a default prompt and an inspiration library alongside adjustable parameters.
The key distinction with the AI-generated sticker feature, compared to the caption effect, is that users are presented with 4 available options to choose from, echoing the selection process found in Midjourney.
Feature review:
Like the AI-generated caption effect, the AI-generated sticker feature is creatively designed, but personally I find its applicability limited. Stickers have not emerged as a significant trend on video-centric platforms like TikTok and YouTube, where the focus tends to be more on content than on embellishments. Instagram sees relatively higher use of stickers, yet it remains to be seen whether users would prefer custom stickers over the extensive collection already available on Instagram itself. The future of this feature likely hinges on how widely users adopt it.
Feature 3: AI special effects
Jianying offers AI special effects for premium users, which add advanced visual enhancements to video clips. Like the previous two features, it comes with a default effect, a variety of options, and an inspiration library for users to explore. A notable limitation is that it only supports video clips shorter than 10 seconds and, even for VIP users, is capped at 10 uses. These constraints suggest that the feature is not suited to lengthy videos or cinematic-style content. Instead, its ideal application appears to be short, engaging TikTok videos that highlight visual transformations or enhancements, particularly of people's appearances.
With Anime Effect II, users can choose from four generated versions and apply the selected one across the entire video clip for an anime-inspired transformation. Here's the original video clip for reference:
An intriguing aspect of this AI feature is its pricing model: AI points. VIP users are allotted 1200 points monthly, which expire at the end of each month. Generating a special effect like the one described consumes 480 points. Currently, there seems to be no option to buy extra points. This model clearly incentivizes users to maintain their Jianying subscription, driving conversions. However, at 480 points per use, subscribers can generate this special effect only twice per month, leaving 240 points unused.
Translation:
AI Points are now live!
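The points arithmetic above can be checked with a short sketch. It assumes, as described in this article, that all 1200 monthly points go toward one 480-point effect and that nothing rolls over; the function name is my own and not part of Jianying.

```python
MONTHLY_POINTS = 1200   # VIP allotment per month, as described in this article
COST_PER_EFFECT = 480   # points consumed by one special-effect generation


def effects_per_month(monthly_points: int, cost_per_effect: int) -> tuple[int, int]:
    """Return (full generations affordable, points left unused)."""
    if cost_per_effect <= 0:
        raise ValueError("cost_per_effect must be positive")
    # divmod gives the whole number of uses and the remainder in one step.
    return divmod(monthly_points, cost_per_effect)


uses, leftover = effects_per_month(MONTHLY_POINTS, COST_PER_EFFECT)
print(f"{uses} generations per month, {leftover} points expire unused")
```

With these numbers, a fifth of the monthly allotment cannot be spent on this effect at all, which is worth keeping in mind when judging the value of the subscription.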
Feature 4: Digital human
The concept of the digital human has significantly evolved, and Capcut (Jianying) includes a feature that allows users to choose from a variety of digital personas. These options vary in gender, appearance, pose (standing or walking), and style. I selected a character named Naigai, depicted as an elegant lady, to test if she could narrate a paragraph in my video about a short message I received.
Feature review:
Testing the digital human feature with Naigai revealed a significant issue: her inability to adapt her tone to match the sentiment of the text. Despite narrating negative aspects, Naigai maintained a cheerful demeanor, creating a discordant and odd effect in the video. This suggests a limitation in the digital human's ability to interpret text sentiment. A more suitable approach might be to default to neutral facial expressions when the technology cannot accurately gauge the emotional context of the text.
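To make the suggested fallback concrete, here is a minimal sketch of the rule. Everything in it is hypothetical: the `SentimentEstimate` type, the expression names, and the 0.7 threshold are my own assumptions for illustration, and I have no knowledge of how Jianying actually drives its avatars.

```python
from dataclasses import dataclass


@dataclass
class SentimentEstimate:
    label: str         # "positive", "negative", or "neutral"
    confidence: float  # 0.0 to 1.0, reported by some upstream sentiment model


# Hypothetical mapping from sentiment label to avatar expression.
EXPRESSIONS = {"positive": "smile", "negative": "concerned", "neutral": "neutral"}


def choose_expression(estimate: SentimentEstimate, min_confidence: float = 0.7) -> str:
    """Pick an avatar expression, falling back to neutral when unsure."""
    if estimate.confidence < min_confidence:
        return "neutral"  # not confident enough to risk a mismatched face
    # Unknown labels also fall back to neutral.
    return EXPRESSIONS.get(estimate.label, "neutral")


print(choose_expression(SentimentEstimate("negative", 0.9)))
print(choose_expression(SentimentEstimate("positive", 0.4)))
```

The design choice is that a neutral face on emotional text reads as flat, while a cheerful face on negative text reads as wrong, so the rule prefers the first kind of error.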
In summary, the AI features in Jianying are well-designed but face two issues. First, the AI technology is still developing and lags behind platforms like HeyGen 5.0, where digital humans move more realistically. Second, there's a mismatch between Jianying's image as a simple editing tool and the complexity of these AI features, which makes their appeal to casual users unclear. However, Jianying's mobile version (CapCut) offers AI features more suited to everyday users, like turning text into videos and creating product ads. These features better match the platform's user-friendly focus, and I'll explore them in a comparison with CapCut's capabilities soon.
Technology behind Jianying (desktop) AI
The Digital Human feature in Jianying (desktop) notably caught my attention. While it wasn't a perfect fit for my project, its potential applications are vast and varied.
The blend of rapidly evolving computer graphics and AI advancements is revolutionizing how we interact with digital platforms, bringing human-like avatars to the forefront. These digital humans are being integrated into various sectors, serving as sales assistants, corporate trainers, and even social media influencers. Their deployment marks a significant shift in the business world, offering cost efficiency, customization, and scalability that human employees can't match. Digital employees work tirelessly, adhere strictly to company policies, and require no monetary compensation or breaks.
The impact of digital humans on revenue generation is already evident. For instance, Soul Machines, a leading autonomous animation software company, has successfully deployed over 50 digital avatars across the globe. Mark Sagar, the co-founder, shared a compelling case where a cosmetics brand utilized a digital sales assistant to boost customer engagement and product recommendations. This led to a remarkable increase in sales conversion rates, with website visitors four and a half times more likely to make a purchase compared to the period before the digital assistant's integration.
For those considering the implementation of digital humans, the Harvard Business Review offers a flowchart to assess whether such an innovation would enhance interaction outcomes.
Feature Recommendation
Pain point
Music match: Content creators have two main objectives when using Jianying: to edit videos as efficiently as possible and to attract as much traffic to the video as possible. A key part of this is adding appropriate and engaging background music. Good background music can contribute to longer watch duration, a key metric for a video's performance. However, editors normally need to spend a lot of time looking for music clips in a big library, and sometimes they cannot find a good match. So it’d be good if the video editing tool could identify the sentiment of the video and automatically add background music to it.
Potential solution
To address the pain point described, the video editing tool can introduce an "Auto-Music Match" feature. Here’s how it could work and be measured:
Feature Description: Auto-Music Match
The Auto-Music Match feature would analyze the video content in real time to detect its sentiment and themes. Using AI algorithms, it would then automatically suggest a selection of background tracks that match the detected sentiment. Content creators could preview and apply these suggestions with a single click, significantly reducing the time spent on music selection.
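The suggestion step could look roughly like the sketch below. It only covers the matching part and assumes that sentiment and themes have already been detected by some upstream model; the `Track` type, the mood and theme tags, and the scoring rule are all hypothetical and only meant to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Track:
    title: str
    mood: str                                   # e.g. "upbeat", "calm", "sad"
    themes: frozenset = field(default_factory=frozenset)


def suggest_tracks(library, detected_mood, detected_themes, limit=3):
    """Rank tracks: mood match first, then number of shared themes."""
    themes = set(detected_themes)

    def score(track):
        # Tuples compare element by element, so a mood match always wins.
        return (track.mood == detected_mood, len(track.themes & themes))

    # Drop tracks that match neither the mood nor any theme.
    candidates = [t for t in library if score(t) > (False, 0)]
    return sorted(candidates, key=score, reverse=True)[:limit]


library = [
    Track("Morning Run", "upbeat", frozenset({"vlog", "travel"})),
    Track("Rainy Window", "sad", frozenset({"vlog"})),
    Track("Street Food", "upbeat", frozenset({"food"})),
    Track("Deep Focus", "calm", frozenset({"tech"})),
]
picks = suggest_tracks(library, "upbeat", {"food"})
print([t.title for t in picks])
```

Returning a short ranked list rather than a single track keeps the creator in control, which fits the "preview and apply" flow described above.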
Key Components:
Product Metrics to Measure Feature Performance
Resources