Scaling & Performance Guide

WhiteLabelZoom is built on a WebRTC SFU architecture designed for horizontal scaling. Start with a single server for small teams and scale to thousands of concurrent users by adding media servers behind a load balancer.

Architecture Overview

WhiteLabelZoom uses a Selective Forwarding Unit (SFU) architecture. Unlike peer-to-peer mesh networks that collapse under load, an SFU receives each participant's media stream once and selectively forwards it to other participants. This drastically reduces the bandwidth and CPU required on each client device.
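The bandwidth saving is easy to quantify. A rough sketch (stream bitrate of 2 Mbps is an assumed midpoint, not a platform constant):

```python
def mesh_upload_streams(n: int) -> int:
    # In a full mesh, every client uploads a copy of its stream
    # to each of the other n - 1 participants.
    return n - 1

def sfu_upload_streams(n: int) -> int:
    # With an SFU, every client uploads exactly one stream;
    # the server handles fan-out to the other participants.
    return 1

# For a 10-person meeting at ~2 Mbps per stream:
n, mbps = 10, 2
print(mesh_upload_streams(n) * mbps)  # 18 Mbps of client upload in a mesh
print(sfu_upload_streams(n) * mbps)   # 2 Mbps of client upload with an SFU
```

Client upload cost grows linearly with meeting size in a mesh but stays constant with an SFU, which is why mesh topologies break down beyond a handful of participants.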

The SFU architecture provides a key advantage: horizontal scalability. Each media server handles a set of independent meeting rooms. When you need more capacity, add another media server. There is no shared state between servers for active media sessions, making scaling straightforward.

Capacity Planning

Choose a server size based on your expected concurrent user count. These are baseline recommendations — actual capacity depends on video resolution, participant count per meeting, and recording settings.

| Tier | Server Specs | Capacity | Use Case |
| --- | --- | --- | --- |
| Small | 4 CPU, 8 GB RAM | Up to 50 concurrent users | Small teams, individual departments, or development environments. |
| Medium | 8 CPU, 16 GB RAM | Up to 200 concurrent users | Mid-size organizations with multiple simultaneous meetings. |
| Large | 16 CPU, 32 GB RAM | Up to 500 concurrent users | Large organizations, webinars, and high-traffic deployments. |
| Enterprise | Multi-server cluster | 1,000+ concurrent users | Enterprise-scale deployments with horizontal scaling across multiple media servers. |
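As a quick sanity check when planning, the tiers above can be expressed as a lookup. This is illustrative only — the names and capacities mirror the table, but real capacity depends on resolution, room sizes, and recording:

```python
# Capacities taken from the tier table above; treat as baselines, not limits.
TIERS = [
    ("Small", 50),
    ("Medium", 200),
    ("Large", 500),
]

def recommend_tier(concurrent_users: int) -> str:
    for name, capacity in TIERS:
        if concurrent_users <= capacity:
            return name
    return "Enterprise"  # multi-server cluster for 1,000+ users

print(recommend_tier(120))   # Medium
print(recommend_tier(1500))  # Enterprise
```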

Horizontal Scaling

Scaling WhiteLabelZoom horizontally requires no coordination between servers. Each media server operates independently, handling its own set of meeting rooms. When your current server approaches capacity, add another media server behind the load balancer.

New meetings are assigned to the least-loaded server automatically. Existing meetings are not migrated — they continue on their assigned server until they end. This architecture avoids the complexity of live session migration and ensures zero disruption to active meetings when new servers are added.
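The placement logic reduces to a small control-plane loop. A hypothetical sketch (class and server names are illustrative, not the platform's actual API):

```python
# Hypothetical control-plane sketch: new rooms go to the least-loaded
# media server; existing rooms stay pinned to their server until they end.
class MediaServer:
    def __init__(self, name: str):
        self.name = name
        self.rooms = set()

def assign_room(servers: list, room_id: str) -> str:
    # Pick the server currently hosting the fewest rooms.
    target = min(servers, key=lambda s: len(s.rooms))
    target.rooms.add(room_id)
    return target.name

servers = [MediaServer("media-1"), MediaServer("media-2")]
print(assign_room(servers, "room-a"))  # media-1
print(assign_room(servers, "room-b"))  # media-2
```

Because placement happens only at meeting creation, adding a server immediately makes it the preferred target for new rooms without touching rooms already in progress.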

For enterprise deployments supporting 1,000+ concurrent users, deploy a cluster of media servers across multiple availability zones or data centers for both capacity and redundancy.

Load Balancing

Place a reverse proxy in front of your media servers to distribute incoming connections and provide a single entry point for your platform.

Nginx / HAProxy

Use Nginx or HAProxy as your load balancer. Both support WebSocket upgrades required for signaling and can distribute HTTP traffic across multiple application servers.
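A minimal Nginx sketch of this setup might look like the following — upstream names, ports, and the `/signaling` path are placeholders, not WhiteLabelZoom defaults:

```nginx
# Sketch only — hostnames, ports, and paths are placeholders.
upstream media_servers {
    least_conn;                  # route new connections to the least-busy server
    server media-1.internal:8080;
    server media-2.internal:8080;
}

server {
    listen 443 ssl;

    location /signaling {
        proxy_pass http://media_servers;
        # WebSocket upgrade headers required for the signaling channel
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```

Without the `Upgrade` and `Connection` headers, Nginx will silently downgrade the WebSocket handshake to plain HTTP and signaling will fail.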

Sticky Sessions

Enable sticky sessions (session affinity) so that participants in the same meeting are routed to the same media server. This is essential for WebRTC connections that must persist for the duration of a meeting.
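One common way to implement meeting-level affinity — not necessarily how WhiteLabelZoom does it internally — is to hash the meeting ID, so every participant in the same meeting deterministically maps to the same server:

```python
import hashlib

# Sketch of meeting-affinity routing; server names are placeholders.
SERVERS = ["media-1", "media-2", "media-3"]

def server_for_meeting(meeting_id: str) -> str:
    # Hashing the meeting ID gives a stable, deterministic mapping:
    # everyone joining the same meeting lands on the same server.
    digest = hashlib.sha256(meeting_id.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]

# Every participant who joins "standup-42" is routed identically.
assert server_for_meeting("standup-42") == server_for_meeting("standup-42")
```

The trade-off: pure hash placement ignores current load, which is why load-aware assignment at meeting creation plus affinity for subsequent joins is the more common combination.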

Health Checks

Configure health check endpoints to detect unhealthy media servers and remove them from rotation automatically. The load balancer routes new meetings only to servers that pass health checks.
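The logic behind such an endpoint can be sketched as a pure function — the thresholds and endpoint name here are illustrative assumptions, not platform defaults:

```python
def health_status(cpu_percent: float, active_connections: int,
                  max_connections: int = 1000) -> tuple:
    """Return an (HTTP status, body) pair for a hypothetical /healthz endpoint.

    Thresholds are illustrative: 90% CPU and a 1,000-connection cap.
    """
    if cpu_percent > 90 or active_connections >= max_connections:
        # 503 tells the load balancer to stop sending new meetings here;
        # existing connections continue undisturbed.
        return 503, "overloaded"
    return 200, "ok"

print(health_status(45.0, 300))  # (200, 'ok')
print(health_status(95.0, 300))  # (503, 'overloaded')
```

Returning 503 rather than dropping connections lets an overloaded server drain gracefully: it stops receiving new meetings while the ones it hosts run to completion.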

Monitoring

Track these key metrics to understand your platform's health and plan capacity additions before users experience degradation. We recommend Prometheus + Grafana for metrics collection and visualization.

CPU Usage

Monitor CPU utilization on media servers. Video transcoding and stream forwarding are CPU-intensive. Keep sustained usage below 80%.

Bandwidth

Track inbound and outbound network throughput. Each video stream consumes 1-4 Mbps depending on resolution. Ensure sufficient headroom for peak loads.
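Server egress is the number to watch: in an SFU, each participant's stream is forwarded to every other participant, so a single room generates n × (n − 1) outbound stream copies. A rough estimator (2 Mbps is an assumed midpoint of the 1-4 Mbps range above):

```python
def room_egress_mbps(participants: int, stream_mbps: float = 2.0) -> float:
    # Each participant's stream is forwarded to the other n - 1 receivers,
    # so the SFU sends n * (n - 1) stream copies for one room.
    n = participants
    return n * (n - 1) * stream_mbps

print(room_egress_mbps(10))  # 180.0 Mbps of server egress for one 10-person room
```

Egress grows quadratically with room size, which is why large meetings and webinars dominate bandwidth planning even when total user counts are modest.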

Active Connections

Count active WebRTC peer connections per media server. This is the primary indicator of per-server load and determines when to add capacity.

Meeting Count

Track the number of active meeting rooms. Each room runs independently, so this metric drives load distribution across servers.

Performance Optimization

Optimize video delivery and reduce resource consumption with these built-in capabilities.

Simulcast

Participants send multiple video quality layers (high, medium, low). The SFU forwards the appropriate layer to each receiver based on their bandwidth and viewport size, reducing unnecessary data transfer.
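The forwarding decision can be sketched as a simple selection function — the layer bitrates and thresholds below are illustrative assumptions, not the platform's actual values:

```python
# Illustrative simulcast layers; bitrate thresholds are assumptions.
LAYERS = [
    ("high", 2.5),    # requires ~2.5 Mbps downlink
    ("medium", 1.0),
    ("low", 0.3),
]

def pick_layer(downlink_mbps: float, viewport_large: bool) -> str:
    for name, required in LAYERS:
        # A small thumbnail never needs the high layer, even on a fast link.
        if name == "high" and not viewport_large:
            continue
        if downlink_mbps >= required:
            return name
    return "low"  # fall back to the lowest layer

print(pick_layer(5.0, viewport_large=True))   # high
print(pick_layer(5.0, viewport_large=False))  # medium
print(pick_layer(0.5, viewport_large=True))   # low
```

Note that the viewport check saves bandwidth independently of network quality: a participant rendered as a small tile gets the medium layer at most, no matter how fast their connection is.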

Adaptive Bitrate

The platform automatically adjusts video bitrate based on real-time network conditions. Participants on slower connections receive lower-bitrate streams without affecting the experience for others.
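Adaptive bitrate schemes typically follow an additive-increase / multiplicative-decrease pattern, in the spirit of WebRTC congestion control. A simplified sketch with illustrative constants (the loss threshold, step sizes, and bounds are assumptions):

```python
def adapt_bitrate(current_kbps: float, packet_loss: float,
                  min_kbps: float = 150, max_kbps: float = 4000) -> float:
    # Illustrative AIMD sketch: constants are assumptions, not platform values.
    if packet_loss > 0.02:          # loss above 2%: back off sharply
        target = current_kbps * 0.85
    else:                           # clean network: probe upward gently
        target = current_kbps + 50
    return max(min_kbps, min(max_kbps, target))

print(adapt_bitrate(1000, packet_loss=0.0))   # 1050
print(adapt_bitrate(1000, packet_loss=0.10))  # 850.0
```

The asymmetry is deliberate: backing off quickly protects a congested path, while probing upward slowly avoids oscillating between good and bad quality.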

TURN Server Placement

Deploy TURN servers geographically close to your users to minimize relay latency. For global deployments, place TURN servers in each major region where participants connect from.

CDN for Static Assets

Serve your frontend assets — JavaScript bundles, CSS, images, and fonts — through a CDN like CloudFront, Cloudflare, or Fastly. This reduces load on your application servers and delivers faster page loads to users worldwide by serving assets from edge locations closest to each user.

Database Optimization

For large deployments, enable connection pooling to reduce database overhead from frequent connections. Add read replicas to offload read-heavy queries — meeting history, user lookups, recording metadata — from the primary database. This keeps write performance consistent as your platform scales.
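The idea behind connection pooling can be shown in a few lines. This sketch uses `sqlite3` as a stand-in database and a hand-rolled pool for clarity; a real deployment would pool Postgres or MySQL connections through a battle-tested library rather than this class:

```python
import queue
import sqlite3

# Minimal pooling sketch: connections are opened once and reused,
# instead of paying connection setup cost on every query.
class ConnectionPool:
    def __init__(self, size: int):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(sqlite3.connect(":memory:"))

    def acquire(self) -> sqlite3.Connection:
        return self._pool.get()   # blocks when all connections are in use

    def release(self, conn: sqlite3.Connection) -> None:
        self._pool.put(conn)

pool = ConnectionPool(size=4)
conn = pool.acquire()
print(conn.execute("SELECT 1").fetchone())  # (1,)
pool.release(conn)
```

Capping the pool size also protects the database: a traffic spike queues inside the application instead of exhausting the database's connection limit.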
