Scaling WebSocket Connections: Lessons Learned
Practical insights from scaling WebSocket infrastructure to handle 10K+ concurrent connections
Tags: WebSocket, Scaling, Redis, Architecture
The Challenge
When our XR platform started growing, we hit the classic WebSocket scaling wall at around 5,000 concurrent connections. Here’s how we solved it.
Problem 1: Connection Limits
Each server process is capped by its open-file-descriptor limit, and every WebSocket connection holds one descriptor open. We hit this ceiling at ~4,000 connections per instance.
Solution: Increased ulimit settings and implemented horizontal scaling with a load balancer.
# /etc/security/limits.conf
* soft nofile 65535
* hard nofile 65535
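Raising the limit in limits.conf only helps if the server process actually inherits it, so it's worth verifying at startup. A minimal sketch in Python (illustrative, not part of our server code; the 65,535 threshold is the value from the config above):

```python
import resource

REQUIRED_FDS = 65535  # assumed threshold, matching the limits.conf above

def check_fd_limit(required: int = REQUIRED_FDS) -> bool:
    """Return True if the soft open-file limit meets `required`,
    raising the soft limit toward the hard limit first if needed."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft >= required:
        return True
    # The hard limit may be RLIM_INFINITY; treat that as "no cap".
    cap = required if hard == resource.RLIM_INFINITY else min(required, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (cap, hard))
    return cap >= required
```

Failing fast here is cheaper than discovering the limit at 4,000 live connections.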
Problem 2: Sticky Sessions
A WebSocket is a long-lived connection tied to in-memory state on the server that accepted it, so a client's reconnects need to land on the same instance. Round-robin load balancing doesn't guarantee this.
Solution: IP hash-based routing in our load balancer:
upstream websocket_servers {
    ip_hash;
    server ws1.example.com:8080;
    server ws2.example.com:8080;
    server ws3.example.com:8080;
}
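The routing behavior ip_hash gives us can be modeled in a few lines: hash the client address deterministically so the same client always maps to the same upstream. This is a simplified Python sketch, not nginx's actual algorithm (which for IPv4 hashes only the first three octets); the server list mirrors the config above:

```python
import hashlib

# Upstream list mirroring the nginx config above.
SERVERS = ["ws1.example.com:8080", "ws2.example.com:8080", "ws3.example.com:8080"]

def pick_server(client_ip: str, servers=SERVERS) -> str:
    """Deterministically map a client IP to one upstream server.

    Hashing the IP (rather than rotating round-robin) means repeat
    connections from the same address always reach the same instance.
    """
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]
```

The trade-off is uneven load when many clients share a NAT address, which is why we still monitor per-instance connection counts.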
Problem 3: Cross-Server Communication
Users on different servers needed to communicate with each other.
Solution: Redis Pub/Sub for message broadcasting:
// Publish a message to every server subscribed to the room's channel
await redis.PublishAsync("room:123", message);

// Each server subscribes to the room pattern and forwards
// matching messages to the clients connected to it locally
subscriber.Subscribe("room:*", (channel, message) => {
    BroadcastToLocalClients(channel, message);
});
Results
After implementing these solutions:
- Scaled from 5K to 15K concurrent connections
- 99.9% uptime over 6 months
- Average latency reduced from 150ms to 45ms
Key Takeaways
- Plan for horizontal scaling from the start
- Use Redis Pub/Sub for cross-server messaging
- Monitor connection counts and memory usage closely
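On the last point, even a crude gauge helps: alerting when an instance approaches its connection ceiling buys time to add capacity before the file-descriptor limit bites. A minimal sketch (the 80% threshold is an assumption, not a number from our production config):

```python
class ConnectionMonitor:
    """Track live connections and flag when nearing the per-instance ceiling."""
    def __init__(self, max_connections: int):
        self.max_connections = max_connections
        self.active = 0

    def on_connect(self):
        self.active += 1

    def on_disconnect(self):
        self.active = max(0, self.active - 1)

    def utilization(self) -> float:
        return self.active / self.max_connections

    def should_alert(self, threshold: float = 0.8) -> bool:
        # Fire well before the hard limit so scaling out isn't a fire drill.
        return self.utilization() >= threshold
```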