How 4X Speedup on Generative Video Model (FILM) Created Huge Cost Savings for Wombo

Generative AI is the hottest workload on the planet, but it’s also the most compute intensive, and therefore expensive to run. This puts startups building generative AI businesses in a tricky position. Not only must they deliver killer product experiences that grab attention and market share – they need to make the economics work too. To lower compute costs, generative AI models need to run faster and more efficiently on a more diverse set of hardware.

WOMBO: an OctoML customer story

WOMBO makes popular mobile apps for content creation using generative AI. Their apps use ML models like stable diffusion to help people create fun videos and images to share online.

[Image: think face-swapping, mixed with lip syncing]

Nearly 75 million people across more than 180 countries downloaded the app, making WOMBO one of the fastest-growing consumer apps in history. Like any generative AI startup, user growth translates to higher compute costs. With a fleet of GPUs nearing capacity, any model efficiency gains were a top priority.

One production model is FILM, which predicts and generates intermediate frames between two existing frames in a video sequence. For premium WOMBO users, FILM generates a video clip showing their “transformation” into a celebrity or historical figure. The more frames you have in between the images, the better the final video, but the more costly and time-consuming it becomes to generate. Optimizing the model across different hardware helps WOMBO balance user experience (faster, higher-quality video) against cost.
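FILM’s learned interpolation is far beyond a blog snippet, but the naive baseline it improves on, a per-pixel cross-fade between two frames, is easy to sketch. The function name and toy frames below are illustrative, not part of the FILM codebase:

```python
def blend_frames(frame_a, frame_b, t=0.5):
    """Naive frame interpolation: per-pixel linear cross-fade at time t.

    This is the crude baseline FILM improves on. A learned model like
    FILM estimates motion between the two frames, so moving objects stay
    sharp in the intermediate frame instead of ghosting as they do here.
    """
    return [
        [(1 - t) * pa + t * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]

# Two tiny 2x2 grayscale "frames"
dark = [[0.0, 0.0], [0.0, 0.0]]
bright = [[100.0, 100.0], [100.0, 100.0]]
mid = blend_frames(dark, bright)  # midpoint frame: every pixel is 50.0
```

Generating more intermediate frames (t = 0.25, 0.5, 0.75, …) smooths the transition, which is exactly why higher-quality clips cost more to produce.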

OctoML ran a series of experiments to optimize FILM on two different GPUs: NVIDIA A100 and A10G. We used the OctoML platform to compare a baseline version of FILM (TensorFlow) with several other optimized configurations.

Results snapshot:

  • Cut model serving costs by 98% compared to baseline
  • 3.9x speedup on FILM model over baseline configurations
  • Reduced image-to-image interpolation (AKA transformation) time from 10.1 seconds to 2.6 seconds

[Image: 98% cost savings from baseline configuration]

Better speed makes for a nice user experience, but these efficiency gains also slashed the compute cost per 1,000 image interpolations from $11.95 to $0.24. Supposing 10,000 clips are created each day, that’s the difference between annual model serving costs of $43,617.50 and $876. For WOMBO, FILM traffic doesn’t represent the majority of overall usage, but even so, these cost savings are significant.
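The annual figures above follow from straightforward arithmetic. A quick sketch, assuming one interpolation per clip and 365 days of traffic (the function name is ours, not OctoML’s):

```python
def annual_serving_cost(cost_per_1k, clips_per_day, days=365):
    """Annual model-serving cost, assuming one interpolation per clip."""
    return cost_per_1k * (clips_per_day / 1000) * days

baseline = annual_serving_cost(11.95, 10_000)   # about $43,617.50/yr
optimized = annual_serving_cost(0.24, 10_000)   # about $876/yr
```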

With so many media, entertainment, and gaming applications, it’s easy to see how lowering model compute costs can make FILM more accessible to more creators. The more efficiently it runs, the more you can do for the same cost or less.

Here’s an example:

Let’s say a documentarian has access to 1,000 hours of archival video, and wants to use FILM to restore and enhance the missing footage. Working with the standard model configurations, running on NVIDIA A100 in AWS, this could cost upwards of $66,195.40 (assuming 24fps).

Combining OctoML model optimizations and the ability to run on the lower cost A10G in AWS, this cost comes down to $1,382.40.

Check out the OctoML blog for the full results of our work with WOMBO on FILM.

If you want to achieve better speeds and lower costs for your AI workloads, be one of the first to try the new OctoML Compute Service. We’re building an efficient compute layer that’s as easy to use as OpenAI, but flexible enough to run any model.

Sign up for early access here.
