Scalability
Horizontal vs vertical scaling, load balancing, and capacity planning
Scalability is the ability of a system to handle growing amounts of load by adding resources. There are two main approaches: vertical scaling (scale up) and horizontal scaling (scale out).
Scaling Strategies
Vertical Scaling (Scale Up)
- Add more power to an existing machine (CPU, RAM, SSD)
- Simpler: usually no application code changes are needed
- Limited by the hardware ceiling of a single machine
- The machine remains a single point of failure
- Example: Upgrading a database server from 16 GB to 128 GB of RAM
Horizontal Scaling (Scale Out)
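- Add more machines to the pool
- Requires a load balancer to distribute traffic
- No fixed hardware ceiling, although shared state (such as a database) can still become a bottleneck
- Works best with stateless application design; otherwise sticky sessions are needed
- More complex: data consistency across machines must be handled
- Example: Going from 1 web server to 10 behind a load balancer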
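The stateless-design point above can be illustrated with a small sketch. Everything here is hypothetical: `StatefulServer`, `StatelessServer`, and `login_then_check` are illustration-only names, and a plain dict stands in for an external session store such as a cache or database. It shows one way a follow-up request can land on a different server and what happens to the session in each design.

```python
import itertools


class StatefulServer:
    """Keeps sessions in local memory, so only this instance knows them."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}

    def login(self, user):
        self.sessions[user] = True

    def is_logged_in(self, user):
        return self.sessions.get(user, False)


class StatelessServer:
    """Keeps no local state; sessions live in a store shared by all instances."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # assumption: stand-in for an external shared store

    def login(self, user):
        self.store[user] = True

    def is_logged_in(self, user):
        return self.store.get(user, False)


def login_then_check(pool, user):
    """Send the login to one server and the follow-up request to the next one."""
    rotation = itertools.cycle(pool)  # round robin over the pool
    next(rotation).login(user)
    return next(rotation).is_logged_in(user)


stateful_pool = [StatefulServer("web1"), StatefulServer("web2")]
shared_store = {}
stateless_pool = [
    StatelessServer("web1", shared_store),
    StatelessServer("web2", shared_store),
]

print(login_then_check(stateful_pool, "alice"))   # False: web2 never saw the login
print(login_then_check(stateless_pool, "alice"))  # True: both read the shared store
```

With state held outside the web servers, any instance can serve any request, which is what lets the pool grow or shrink freely.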
Load Balancing Strategies
Round Robin
→ Requests distributed sequentially to each server
→ Simple but doesn't account for server load
Weighted Round Robin
→ Servers with higher capacity get more requests
→ Better for heterogeneous server clusters
Least Connections
→ New request goes to the server with the fewest active connections
→ Well suited to long-lived connections (WebSockets)
IP Hash
→ Client IP determines which server handles the request
→ Provides session stickiness as long as the server pool and the client's IP stay the same
Consistent Hashing
→ Minimizes redistribution when servers are added/removed
→ Used by CDNs and distributed caches
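The strategies listed above can be sketched in a few lines each. This is a minimal illustration under stated assumptions, not how any particular load balancer implements them: all function and class names are hypothetical, the weighted rotation is the naive "repeat each server `weight` times" variant, and the hash ring uses 100 virtual nodes per server with SHA-256 as an arbitrary stable hash.

```python
import bisect
import hashlib
import itertools


def stable_hash(key: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


def round_robin(servers):
    """Yield servers in order, forever."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Naive weighted rotation: each server appears `weight` times per cycle."""
    expanded = [s for s, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_connections(active):
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def ip_hash(client_ip, servers):
    """Map a client IP to a fixed server while the server list is unchanged."""
    return servers[stable_hash(client_ip) % len(servers)]


class ConsistentHashRing:
    """Hash ring with virtual nodes; a key belongs to the next point clockwise."""

    def __init__(self, servers, vnodes=100):
        self._ring = sorted(
            (stable_hash(f"{s}#{i}"), s) for s in servers for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key):
        # Wrap around to the first point when the key hashes past the last one.
        idx = bisect.bisect(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]


servers = ["s1", "s2", "s3"]
rr = round_robin(servers)
print([next(rr) for _ in range(4)])                      # ['s1', 's2', 's3', 's1']
print(least_connections({"s1": 12, "s2": 3, "s3": 7}))   # s2

# Compare how many of 1000 keys change servers when a fourth server is added.
keys = [f"user-{i}" for i in range(1000)]
grown = servers + ["s4"]
moved_mod = sum(ip_hash(k, servers) != ip_hash(k, grown) for k in keys)
before, after = ConsistentHashRing(servers), ConsistentHashRing(grown)
moved_ring = sum(before.lookup(k) != after.lookup(k) for k in keys)
print(f"modulo hashing moved {moved_mod} keys, hash ring moved {moved_ring}")
```

On the ring, a key changes servers only if the new server's points now sit between the key and its old owner, so every moved key lands on the new server. With plain modulo hashing, most keys are remapped whenever the pool size changes.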
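# Sketch of an nginx load-balancing config (these blocks belong inside the http context)
upstream backend {
    # Balancing method. Round robin is the default when no method is named;
    # least_conn sends each request to the server with the fewest active connections.
    least_conn;

    # weight skews the distribution toward higher-capacity servers (default weight=1).
    # max_fails/fail_timeout are passive health checks: after 3 failed attempts
    # within 30s, the server is skipped for the next 30s.
    server backend1.example.com weight=5 max_fails=3 fail_timeout=30s;
    server backend2.example.com weight=3;
    server backend3.example.com;
}

server {
    listen 80;

    location / {
        # Forward requests to the upstream group defined above
        proxy_pass http://backend;
    }
}

💬 When should you scale vertically vs horizontally?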
Scale vertically when you need a quick improvement, the application is stateful, or the database needs more RAM or CPU. Scale horizontally when you need high availability (no single point of failure), expect growth beyond the limits of a single machine, or can design stateless services. Most real systems use both: scale up the database and scale out the application tier.