Section: IT & Technology · Cloud Computing
Difficulty: Medium

Auto Scaling

Automatically adjusting cloud compute resources up or down in response to demand or predefined rules.

Also: elasticity · auto-scale

Definition

Auto scaling is a cloud platform's ability to automatically adjust the compute resources allocated to an application based on real-time demand, predefined metrics, or scheduled rules. Horizontal auto scaling (scaling out or in) adds or removes server instances, while vertical auto scaling (scaling up or down) changes the size of existing instances. Auto scaling ensures that applications have enough resources to handle traffic spikes while minimizing costs during low-traffic periods. Common implementations include AWS Auto Scaling, Azure Virtual Machine Scale Sets (VMSS), and the Kubernetes Horizontal Pod Autoscaler (HPA).
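
To make the metric-driven case concrete, here is a minimal Python sketch of a proportional horizontal scaling rule. It follows the formula documented for the Kubernetes HPA (desired replicas = ceil(current replicas × current metric / target metric)) but leaves out the tolerance band and stabilization windows that real autoscalers apply. The function name, parameter names, and default bounds are illustrative assumptions, not any platform's API.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=100):
    """Proportional rule: scale the replica count by observed/target metric.

    Hypothetical helper for illustration only; the bounds are assumed defaults.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Round up so the fleet is never sized below what the metric calls for.
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured minimum and maximum fleet size.
    return max(min_replicas, min(max_replicas, raw))
```

For instance, with 5 replicas averaging 90% CPU against a 60% target, the rule asks for 8 replicas; when the average later drops to 30% across 8 replicas, it asks for 4.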

Example

A news website's auto scaling group grows from 5 to 50 servers within minutes when a breaking story causes a 10x traffic spike, then scales back down automatically afterward.
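
The scenario above can be sketched as a small Python simulation. The per-server capacity of 200 requests per second and the traffic samples are made-up numbers, chosen only so that the baseline needs 5 servers and a 10x spike needs 50; a real auto scaling group would also apply cooldowns and instance warm-up time, which this sketch ignores.

```python
import math

CAPACITY_PER_SERVER = 200          # assumed requests/second one server can handle
MIN_SERVERS, MAX_SERVERS = 5, 50   # assumed bounds of the auto scaling group

def servers_needed(requests_per_second):
    """Size the fleet to the current load, clamped to the group's bounds."""
    needed = math.ceil(requests_per_second / CAPACITY_PER_SERVER)
    return max(MIN_SERVERS, min(MAX_SERVERS, needed))

# Hypothetical traffic samples: baseline, a 10x spike, then a gradual return.
traffic = [1_000, 1_000, 10_000, 10_000, 4_000, 1_000]
fleet = [servers_needed(rps) for rps in traffic]
print(fleet)  # [5, 5, 50, 50, 20, 5]
```

The fleet grows tenfold while the spike lasts and shrinks back to the minimum once traffic returns to baseline.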

Synonyms

  • elastic scaling
  • dynamic scaling
  • automatic capacity adjustment

Antonyms / Opposites

  • fixed capacity
  • manual scaling
  • static provisioning

Related terms

  • load-balancer
  • cloud-computing
  • kubernetes
  • horizontal-scaling
