
WebRTC Architectures

Balaji Lakshmanan

Entrepreneur | Multimedia Maverick | AI Workflow Architect

Published: March 10, 2021

Introduction

With the advent of WebRTC, real-time communication on the web made huge strides. WebRTC is a peer-to-peer technology. The industry has deployed it in several different architectures, creating a thriving ecosystem of companies that brought audio and video to the forefront. Each architecture has its own pros and cons, and the choice usually depends on individual priorities. In this article we discuss the popular architectures and the Vast Conference Platform's place in this ecosystem.

WebRTC

What is WebRTC anyway?

WebRTC is a technology that provides a set of browser-based programming interfaces for voice, video and data. Its main components are:

Media Stream API – Provides access to media streams from various capture sources such as camera or microphone.
RTCPeerConnection – Provides methods to connect to a remote peer, maintain and monitor the connection, and close it once it is no longer needed.
RTCDataChannel – Provides a network channel which can be used for bidirectional peer-to-peer transfers of arbitrary data.

Problem Domain

WebRTC operates in the problem domain of multi-party audio/video communication. We define its dimensions as follows:

a) Computing power of client devices.

b) Bandwidth usage of client devices.

c) Scale – Number of people who can participate in a single session as well as number of sessions.

With these dimensions in mind, we review five popular architectures and evaluate each one against them to reach a verdict.

Architectures

Peer-to-Peer Mesh

In a peer-to-peer mesh, each client opens a connection to every other client to upload and download media streams. For a group of n clients communicating with each other, each client has (n-1) uploads and (n-1) downloads.
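The per-client counts above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name `mesh_load` is made up for illustration, and the session-wide connection count n(n-1)/2 is not stated in the article but follows from every pair of clients sharing one connection.

```python
def mesh_load(n: int) -> dict:
    """Stream and connection counts for a full mesh of n clients."""
    if n < 1:
        raise ValueError("n must be at least 1")
    per_client = n - 1  # one upload and one download per remote peer
    return {
        "uploads_per_client": per_client,
        "downloads_per_client": per_client,
        # Derived, not from the article: one connection per pair of clients.
        "peer_connections_total": n * (n - 1) // 2,
    }


for n in (2, 4, 10):
    print(n, mesh_load(n))
```

The total number of connections grows quadratically with n, which is the root of the scaling problem discussed next.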

Advantages

Simple.
Each client can choose a different encoding scheme tailored to each remote peer. This is useful if a remote client has a hardware-based decoder.
Communication can be end-to-end encrypted.

Disadvantages

Does not scale for large sessions.

Problem Dimensions

Computing needs grow steeply with the number of participants, because there is one encoder/decoder for each connection.
Bandwidth usage grows steeply with the number of participants, because each client has a separate uplink and downlink per remote client.
Scale – Does not scale to larger sessions because of computing, bandwidth and connection management needs.

Verdict: Fails problem dimensions a, b and c.

Selective Forwarding Unit (SFU)

In an SFU architecture, every client uploads a single stream to the SFU, which in turn forwards it to all the remote clients. In an n-client scenario, this immediately cuts the number of uploads per client to 1, while the number of downloads stays at (n-1).

Advantages

Single uplink connection.
Communication can be end-to-end encrypted as long as the SFU does not inspect packets (signaling or data).

Disadvantages

A client cannot choose a different encoding scheme tailored to each remote peer.

SFU deployments rely on temporal scalability in the codec as a partial solution (it does not address spatial scalability or quality).
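To make temporal scalability concrete, here is a minimal sketch of how a forwarding unit could thin out a stream without re-encoding it. It assumes each frame carries a temporal layer id and that the encoder uses the repeating pattern 0, 2, 1, 2, a common three-layer arrangement; the pattern and the names `PATTERN` and `forward` are assumptions for illustration, not taken from the article.

```python
# Assumed three-layer temporal pattern: layer 0 is the base layer,
# layers 1 and 2 add frames that raise the frame rate.
PATTERN = (0, 2, 1, 2)


def forward(frames, max_temporal_id):
    """Keep only frames whose temporal layer id is at or below the cap."""
    return [frame for frame in frames if frame[1] <= max_temporal_id]


# Eight frames as (frame_index, temporal_layer_id) pairs.
frames = [(i, PATTERN[i % len(PATTERN)]) for i in range(8)]

for cap in (0, 1, 2):
    kept = forward(frames, cap)
    print(f"cap={cap}: forwards {len(kept)} of {len(frames)} frames")
```

Dropping higher layers lowers the frame rate for a constrained receiver, but every receiver still gets the same resolution, which is why this is only a partial solution.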

Problem Dimensions

Computing needs are still very high on the receiving end for decoding, though this solves the multiple-encoder issue.
The bandwidth issue is solved for the uplink. This matters because ISP connections are typically asymmetric, with uplink speed much lower than downlink speed (for example, 1.5 Mbps up versus 30 Mbps down).
Scale – This system scales to larger group sessions, though running multiple decoders eventually strains the client's computing resources for rendering.
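The effect of the asymmetric link can be worked through with the article's example figures. The sketch below assumes every stream costs a flat 500 kbps, a number chosen only for illustration; the function names are likewise hypothetical. It estimates the largest session a single client's link can sustain in a mesh and behind an SFU.

```python
def max_participants_mesh(uplink_kbps: int, downlink_kbps: int, stream_kbps: int) -> int:
    """Mesh: a client sends and receives n-1 streams, so the slower link is the limit."""
    return min(uplink_kbps, downlink_kbps) // stream_kbps + 1


def max_participants_sfu(uplink_kbps: int, downlink_kbps: int, stream_kbps: int) -> int:
    """SFU: a client sends one stream and receives n-1 streams."""
    if uplink_kbps < stream_kbps:
        return 1  # the client cannot even publish its own stream
    return downlink_kbps // stream_kbps + 1


UPLINK_KBPS = 1_500      # 1.5 Mbps up, from the example above
DOWNLINK_KBPS = 30_000   # 30 Mbps down, from the example above
STREAM_KBPS = 500        # assumed per-stream bitrate

print("mesh:", max_participants_mesh(UPLINK_KBPS, DOWNLINK_KBPS, STREAM_KBPS))
print("sfu: ", max_participants_sfu(UPLINK_KBPS, DOWNLINK_KBPS, STREAM_KBPS))
```

Under these assumptions bandwidth alone limits the mesh to a handful of participants, while the SFU limit moves to the much larger downlink. Decoding cost, which this estimate ignores, then becomes the practical ceiling.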

Verdict: Does not solve a, partially solves b and c.

Simulcast

Simulcast is a variation of the SFU architecture. It assumes that some clients have above-average computing power and bandwidth, which allows a single client to run multiple encoders, each with its own bandwidth profile, for the same upload stream.
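A minimal sketch of how the multiple profiles might be used: the sender uploads every profile, and the forwarding unit picks one per receiver based on that receiver's available bandwidth. The three profiles and their bitrates in `LAYERS_KBPS` are assumptions for illustration, as is the selection rule (highest profile that fits, otherwise the lowest).

```python
# Hypothetical simulcast profiles and bitrates in kbps.
LAYERS_KBPS = {"low": 150, "medium": 500, "high": 1500}


def pick_layer(available_kbps: int, layers: dict = LAYERS_KBPS) -> str:
    """Return the highest-bitrate profile that fits, else the lowest one."""
    fitting = [name for name, kbps in layers.items() if kbps <= available_kbps]
    if not fitting:
        return min(layers, key=layers.get)
    return max(fitting, key=layers.get)


def sender_uplink_kbps(layers: dict = LAYERS_KBPS) -> int:
    """The sender uploads every profile at once, so the costs add up."""
    return sum(layers.values())


for receiver_kbps in (100, 600, 5000):
    print(receiver_kbps, "->", pick_layer(receiver_kbps))
print("sender uplink:", sender_uplink_kbps(), "kbps")
```

The trade-off is visible in the last line: receivers get a better fit, but the sender pays for all profiles on its uplink.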

Advantages

Multiple uplink streams, but not one per remote client as in peer-to-peer mesh.
A client can choose different encoding schemes tailored to types of remote peers.
Communication can be end-to-end encrypted.

Disadvantages

Higher computing and bandwidth load on the client than with a plain SFU.

Problem Dimensions

Computing needs are higher than with SFU, because each client does more work to encode multiple profiles.
Uplink bandwidth needs are higher than with SFU, because multiple profiles are uploaded.
Scale – Same as SFU, but caters better to individual clients' computing and bandwidth profiles thanks to the multiple uploaded profiles.

Verdict: Does not solve problem dimension a, partially solves b and c.

Multipoint Control Unit (MCU)

The MCU architecture shifts the computing needs to the server side. In an MCU architecture, every client uploads one stream and downloads one stream. As the number of participants n grows in a group session, each client device stays at one connection for uplink and downlink.
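The per-client stream counts given for mesh, SFU and MCU can be put side by side. This is a minimal sketch; the function name `client_streams` and the architecture labels are made up for illustration, while the counts themselves come from the descriptions in this article.

```python
def client_streams(architecture: str, n: int) -> tuple:
    """Return (uploads, downloads) per client for a session of n clients."""
    if n < 2:
        raise ValueError("a session needs at least 2 clients")
    table = {
        "mesh": (n - 1, n - 1),  # one stream each way per remote peer
        "sfu": (1, n - 1),       # single upload, one download per remote peer
        "mcu": (1, 1),           # single upload, single mixed download
    }
    return table[architecture]


print(f"{'n':>3} {'mesh':>10} {'sfu':>10} {'mcu':>10}")
for n in (2, 5, 20):
    row = [client_streams(arch, n) for arch in ("mesh", "sfu", "mcu")]
    print(f"{n:>3} {str(row[0]):>10} {str(row[1]):>10} {str(row[2]):>10}")
```

Only the MCU row stays constant as n grows, which is what makes client bandwidth and computing predictable in this architecture.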

Advantages

Single uplink and downlink connection independent of session size.
Predictable bandwidth usage independent of session size.
Transcoding can be done on the server side to cater to varying client profiles.

Disadvantages

End-to-end encryption breaks because the MCU decodes and re-encodes the stream.

Problem Dimensions

Client computing needs are predictably low: a single encoder and decoder.
Client bandwidth needs are predictably low: a single uplink and downlink, with bandwidth that is capped and doesn't increase with the number of clients in the session.
Scale – Although client devices pay no premium, server-side computing needs increase as the session grows. MCU architectures have to solve the video mixing scaling problem.

Verdict: Solves problem dimensions a and b. Partially solves c.

Hybrid SFU-MCU

A hybrid SFU-MCU puts an SFU in front of the MCU. This retains the flexibility to serve individual clients that prefer direct feeds rather than the mixed feed from the MCU.
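One way to picture the hybrid is as a per-client choice between two feeds. The sketch below is an assumption about how such a choice could be made, not a description of any particular product: a client gets direct feeds only if it asks for them and its downlink can carry the (n-1) streams; otherwise it gets the mixed feed. The name `choose_feed` and the return labels are hypothetical.

```python
def choose_feed(prefers_direct: bool, downlink_kbps: int, stream_kbps: int, n: int) -> str:
    """Pick the feed type for one client in a session of n clients."""
    needed_kbps = (n - 1) * stream_kbps  # cost of receiving every remote stream
    if prefers_direct and needed_kbps <= downlink_kbps:
        return "sfu-direct"
    return "mcu-mixed"


# A well-connected client that wants individual feeds.
print(choose_feed(True, 30_000, 500, 10))
# The same preference on a constrained link falls back to the mixed feed.
print(choose_feed(True, 2_000, 500, 10))
```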

Advantages

Retains the benefits of the MCU architecture and enhances scalability.
Lets an end client subscribe to individual feeds from the SFU instead of the MCU feed.

Disadvantages

End-to-end encryption breaks because the MCU decodes and re-encodes the stream.

Problem Dimensions

Client computing needs are predictably low: a single encoder and decoder.
Client bandwidth needs are predictably low: a single uplink and downlink, with bandwidth that is capped and doesn't increase with the number of clients in the session.
Scale – Although client devices pay no premium, server-side computing needs increase as the session grows. MCU architectures have to solve the video mixing scaling problem.

Verdict: Solves problem dimensions a and b. Partially solves c.

Introducing - Vast Conference Platform

All five architectures above fail to completely solve the scalability problem dimension (c). This is where the Vast Conference Platform seeks to add value.

Let’s take a deeper dive into problem dimension c (scale).

How do you scale the SFU and MCU blocks on the server side while solving the video mixing scaling problem?
How can you scale the number of participants per session as sessions grow?
How can you scale the number of sessions per platform as the quantity of sessions grows?

Vast Conference Platform Solution

Solving SFU scalability

Binary transfer via WebSocket to the edge, with HTTP fallback.
Patent-pending SMART overlay network architecture with serverless computing.
Onboard each client at the closest edge, and use a publish-subscribe architecture on the core network (on-net) for transport to the remote client's edge.
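The last point combines two generic ideas, closest-edge onboarding and publish-subscribe transport between edges. The toy sketch below illustrates those ideas only; it is not the Vast Conference Platform's implementation, and every name in it (`closest_edge`, `CoreBus`, the edge and session labels, the round-trip times) is hypothetical.

```python
def closest_edge(rtt_ms: dict) -> str:
    """Pick the edge with the lowest measured round-trip time."""
    return min(rtt_ms, key=rtt_ms.get)


class CoreBus:
    """In-memory stand-in for a publish-subscribe core network."""

    def __init__(self):
        self._subscribers = {}  # topic -> set of edge names

    def subscribe(self, edge: str, topic: str) -> None:
        self._subscribers.setdefault(topic, set()).add(edge)

    def publish(self, source_edge: str, topic: str) -> list:
        """Return the edges a packet on this topic is forwarded to."""
        targets = self._subscribers.get(topic, set()) - {source_edge}
        return sorted(targets)  # the source edge already holds the packet


edge = closest_edge({"edge-us": 80, "edge-eu": 20, "edge-ap": 150})
bus = CoreBus()
bus.subscribe("edge-eu", "session-1")
bus.subscribe("edge-ap", "session-1")
print(edge, "->", bus.publish(edge, "session-1"))
```

In this toy model media crosses the core only toward edges that have subscribers for the session, so core traffic follows where participants are rather than how many edges exist.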

Solving MCU scalability

Audio and video bring different architectures into play for mixing.

Patented platform for the audio MCU. It addresses scalability in both the number of participants per session and the number of sessions.
Video mixing uses distributed MCU nodes and creates profiles and pages (resolution) to handle scalability. This allows a single session to scale across multiple nodes, and the number of sessions to scale with the number of computing nodes.
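The article does not define "pages" precisely. One plausible reading, used here purely as an assumption, is that participants are grouped into fixed-size pages and each page is mixed on its own node, so one session can span several nodes. The sketch below illustrates that reading with hypothetical names (`paginate`, `assign_pages`, `node-a`, `node-b`) and a simple round-robin assignment.

```python
def paginate(participants: list, page_size: int) -> list:
    """Split participants into consecutive pages of at most page_size."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return [participants[i:i + page_size] for i in range(0, len(participants), page_size)]


def assign_pages(pages: list, nodes: list) -> dict:
    """Distribute pages across mixing nodes in round-robin order."""
    assignment = {node: [] for node in nodes}
    for index, page in enumerate(pages):
        assignment[nodes[index % len(nodes)]].append(page)
    return assignment


pages = paginate(list(range(10)), page_size=4)
print(assign_pages(pages, ["node-a", "node-b"]))
```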

Verdict: Solves problem dimensions a, b, c.

Vast Conference's solutions involve distributed computing with scalability at its core.

We solve sophisticated computing problems so that we can pass the cost and value benefits on to our customers.

Questions? Get in touch
