Scaling Strategy

The platform scales automatically with demand. The Horizontal Pod Autoscaler (HPA) adjusts the number of pods based on CPU utilization, memory utilization, and request rate.

Scaling Approach

Horizontal scaling (increasing the pod count) is the primary method. Vertical scaling (increasing the CPU and memory given to each pod) is reserved for special cases.

HPA Configuration

Metric        Target  Min Replicas  Max Replicas
CPU           70%     2             10
Memory        80%     2             10
Requests/sec  1000    2             20
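
As a rough illustration of how these targets turn into replica counts, the sketch below applies the replica formula described in the Kubernetes HPA documentation, desired = ceil(current replicas × current value / target value), and clamps the result to the Min and Max columns. The function name and dictionary layout are assumptions made for this example, and the tolerance band and stabilization windows that a real HPA applies are left out.

```python
import math

# Targets and replica bounds copied from the HPA Configuration table.
# The dictionary layout itself is an assumption made for this sketch.
HPA_TARGETS = {
    "cpu": {"target": 70.0, "min": 2, "max": 10},               # percent
    "memory": {"target": 80.0, "min": 2, "max": 10},            # percent
    "requests_per_sec": {"target": 1000.0, "min": 2, "max": 20},
}


def desired_replicas(metric: str, current_replicas: int, current_value: float) -> int:
    """Return the replica count suggested by one metric (hypothetical helper)."""
    cfg = HPA_TARGETS[metric]
    # Core HPA ratio: scale the current count by how far the metric is from target.
    raw = math.ceil(current_replicas * current_value / cfg["target"])
    # Clamp to the configured minimum and maximum.
    return max(cfg["min"], min(cfg["max"], raw))


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 70% target.
    print(desired_replicas("cpu", 4, 90.0))
```

When several metrics are configured, HPA evaluates each one and uses the largest resulting replica count.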

Scaling Scenarios

Normal Load (< 100 users)

Component  Replicas  CPU    Memory
API        2         250m   512Mi
Agent      2         500m   1Gi
DB         1         1000m  2Gi

High Load (100-500 users)

Component  Replicas  CPU    Memory
API        5         500m   1Gi
Agent      4         1000m  2Gi
DB         2         2000m  4Gi

Peak Load (500+ users)

Component  Replicas  CPU    Memory
API        10        1000m  2Gi
Agent      8         2000m  4Gi
DB         3 (HA)    4000m  8Gi
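
The three scenarios can be read as a lookup from concurrent users to a sizing profile. The sketch below encodes the tables as data and adds a helper that totals the CPU for a tier. The names `Sizing`, `TIERS`, `tier_for`, and `total_cpu_millicores` are hypothetical, and because the ranges "100-500" and "500+" overlap at 500, the code assumes that exactly 500 users counts as peak load.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sizing:
    replicas: int
    cpu: str      # Kubernetes CPU quantity, e.g. "250m"
    memory: str   # Kubernetes memory quantity, e.g. "512Mi"


# Values copied from the Normal, High, and Peak Load tables.
TIERS = {
    "normal": {
        "API": Sizing(2, "250m", "512Mi"),
        "Agent": Sizing(2, "500m", "1Gi"),
        "DB": Sizing(1, "1000m", "2Gi"),
    },
    "high": {
        "API": Sizing(5, "500m", "1Gi"),
        "Agent": Sizing(4, "1000m", "2Gi"),
        "DB": Sizing(2, "2000m", "4Gi"),
    },
    "peak": {
        "API": Sizing(10, "1000m", "2Gi"),
        "Agent": Sizing(8, "2000m", "4Gi"),
        "DB": Sizing(3, "4000m", "8Gi"),
    },
}


def tier_for(users: int) -> str:
    """Map a concurrent-user count to a tier name."""
    if users < 100:
        return "normal"
    if users < 500:  # assumption: 500 itself is treated as peak
        return "high"
    return "peak"


def total_cpu_millicores(tier: str) -> int:
    """Sum replicas x per-pod CPU across all components of a tier."""
    return sum(s.replicas * int(s.cpu.rstrip("m")) for s in TIERS[tier].values())


if __name__ == "__main__":
    tier = tier_for(250)
    print(tier, total_cpu_millicores(tier))
```

Totalling a tier this way gives a quick estimate of the cluster capacity each scenario needs before autoscaling is taken into account.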

Automatic Scaling

HPA scales up automatically when CPU utilization exceeds 70%. Scale-down begins once utilization drops below 30%.
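
The two thresholds form a hysteresis band: between 30% and 70% the replica count is left alone, which avoids flapping when load hovers near a single cutoff. The sketch below models only that decision. The function and constant names are hypothetical, and this section does not state how the 30% floor is configured in the cluster, so treat the code as a model of the described policy and not as the HPA implementation.

```python
SCALE_UP_CPU = 70.0    # percent, from the text above
SCALE_DOWN_CPU = 30.0  # percent, from the text above


def scaling_action(cpu_percent: float) -> str:
    """Return "scale_up", "scale_down", or "hold" for an average CPU reading."""
    if cpu_percent > SCALE_UP_CPU:
        return "scale_up"
    if cpu_percent < SCALE_DOWN_CPU:
        return "scale_down"
    # Inside the band (30% to 70% inclusive): keep the current replica count.
    return "hold"


if __name__ == "__main__":
    for reading in (85.0, 50.0, 12.5):
        print(reading, scaling_action(reading))
```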

Database Scaling

Strategy            Description
Connection Pooling  Pool database connections through PgBouncer
Read Replicas       Distribute read load across replicas
Partitioning        Partition large tables
Archiving           Archive old data
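
To make the read-replica strategy concrete, the sketch below routes read statements round-robin across replicas and sends everything else to the primary. The class name, the host names, and the "starts with SELECT" heuristic are all assumptions made for illustration. A production router would also need to handle cases such as `SELECT ... FOR UPDATE`, transactions, and replication lag.

```python
import itertools


class QueryRouter:
    """Pick a database host for each statement (hypothetical helper)."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # cycle() yields the replicas in order, forever; None means "no replicas".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target_for(self, sql: str) -> str:
        # Simplistic heuristic: only plain SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


if __name__ == "__main__":
    # Host names are placeholders, not real endpoints.
    router = QueryRouter("db-primary", ["db-replica-1", "db-replica-2"])
    print(router.target_for("SELECT * FROM agents"))
    print(router.target_for("INSERT INTO agents (name) VALUES ('a')"))
```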

Performance Targets

Metric         Target
Response Time  p99 < 200ms
Throughput     1000+ TPS
Availability   99.9%
Error Rate     < 0.1%
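
A minimal sketch of how a measurement window could be checked against these targets follows. It computes p99 with the nearest-rank method and compares throughput and error rate with the table values. The function names and input shape are assumptions, and availability is omitted because it requires uptime data over a longer period than a single window.

```python
def p99(latencies_ms: list[float]) -> float:
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    if not latencies_ms:
        raise ValueError("latencies_ms must not be empty")
    ordered = sorted(latencies_ms)
    # ceil(0.99 * n) using integer arithmetic to avoid float rounding.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]


def meets_targets(
    latencies_ms: list[float],
    total_requests: int,
    failed_requests: int,
    window_seconds: float,
) -> dict[str, bool]:
    """Compare one measurement window with the Performance Targets table."""
    if total_requests <= 0 or window_seconds <= 0:
        raise ValueError("total_requests and window_seconds must be positive")
    throughput = total_requests / window_seconds
    error_rate = failed_requests / total_requests
    return {
        "response_time": p99(latencies_ms) < 200,  # p99 < 200ms
        "throughput": throughput >= 1000,          # 1000+ TPS
        "error_rate": error_rate < 0.001,          # < 0.1%
    }


if __name__ == "__main__":
    sample = [float(ms) for ms in range(1, 101)]  # synthetic latencies, 1..100 ms
    print(meets_targets(sample, total_requests=60_000, failed_requests=30, window_seconds=60))
```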