Container Auto-Scaling Configuration.
Scaling containers on Cycle is a straightforward process in two parts:
- Container Auto-Scaling
Scaled container instances will have an icon next to them showing they've been created from a scaling event. In the example below, container instances are being stressed through API endpoints that drive up either RAM or CPU usage, with thresholds set on both resources.
Container auto-scaling is a straightforward set of fields that outline what should happen during a scaling event.
Auto-scale Group.
This field is set to join a container to a configured auto-scale group.
Window.
The window field represents the baseline time-frame for making scaling decisions.
- minimum - 2m
- maximum - 30m
Example: 5m
A 5-minute window means that each time scaling is evaluated, it is evaluated against the average of the previous 5 minutes.
This also means it's unlikely that any container will scale between its start time and the configured window time.
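The windowed average described above can be sketched as follows. This is only an illustration of the idea, not the platform's implementation: the class name `WindowedAverage`, the sampling approach, and the use of plain timestamps in seconds are all assumptions.

```python
from collections import deque


class WindowedAverage:
    """Hypothetical helper: mean of the samples seen in the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp_seconds, value) pairs, oldest first

    def add(self, timestamp, value):
        self.samples.append((timestamp, value))
        # Discard samples that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self.samples and self.samples[0][0] <= cutoff:
            self.samples.popleft()

    def average(self):
        # No samples yet means there is nothing to evaluate against.
        if not self.samples:
            return None
        return sum(value for _, value in self.samples) / len(self.samples)
```

With a 5m window (300 seconds), a sample recorded at second 0 no longer counts once a sample arrives at second 300.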
Network vs Resource Auto-Scaling
For network-based auto-scaling, the minimum window is 10 minutes.
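A minimal sketch of checking a window value against these limits is shown below. The function name `validate_window` is hypothetical, only the `Nm` minute format used in the examples is handled, and applying the 30m maximum to network-based scaling is an assumption, since only the minimum is stated for that case.

```python
def validate_window(window, network_based=False):
    """Return the window in minutes, or raise ValueError if it is out of range.

    Assumes values such as "5m"; other duration formats are not handled here.
    """
    if not window.endswith("m") or not window[:-1].isdigit():
        raise ValueError(f"expected a value like '5m', got {window!r}")
    minutes = int(window[:-1])
    minimum = 10 if network_based else 2  # network-based scaling needs at least 10m
    maximum = 30  # assumed to apply to both kinds of scaling
    if not minimum <= minutes <= maximum:
        raise ValueError(f"window must be between {minimum}m and {maximum}m")
    return minutes
```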
Instances.
The instances section covers the minimums, maximums, change vectors, and TTLs.
| Field | Description |
|---|---|
| Min | The minimum number of instances this container should maintain at any given time. |
| Max | The maximum number of instances this container can have. This is a hard ceiling that doesn't lift with additional infrastructure, and it is in addition to the initial instances. Example: if the container is created with 2 instances and max is set to 10, the total max is 12. |
| Delta | The number of containers that should be created as a result of each scaling event. |
| Max Instances Per Server | The maximum number of instances that can be placed on any given qualifying server. This is a hard limit. |
| Minimum Time to Live | The minimum amount of time a scaled instance can exist before being subject to a scale down event. |
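The arithmetic behind Max and Delta can be sketched as below. The function names are hypothetical, and clamping a scaling event to the remaining room under Max is an assumption about how the ceiling is enforced.

```python
def total_ceiling(initial_instances, max_scaled):
    """Max is counted on top of the initial instances."""
    return initial_instances + max_scaled


def plan_scale_up(scaled, max_scaled, delta):
    """Number of instances to add for one scaling event.

    `scaled` is how many auto-scaled instances already exist. The result is
    clamped so the scaled count never exceeds `max_scaled` (an assumption).
    """
    room = max_scaled - scaled
    return max(0, min(delta, room))
```

For the table's example, `total_ceiling(2, 10)` gives 12.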
Thresholds.
Thresholds are the settings from which the scaling event decision is derived.
Example CPU Utilization 50%
If any instance of the container is using, on average over the window, more than 50% of the CPU allotted to it, a scaling event will occur.
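The decision rule in this example can be sketched as a single check across instances. The function name and the shape of the input (a mapping from instance ID to its windowed average, in percent) are assumptions made for illustration.

```python
def should_scale(instance_averages, threshold_percent):
    """True if any instance's windowed average exceeds the threshold."""
    return any(avg > threshold_percent for avg in instance_averages.values())
```

For example, `should_scale({"a": 42.0, "b": 63.5}, 50)` is true because instance `b` averages above 50%, even though instance `a` does not.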
CPU Threshold
CPU threshold is measured by utilization. Utilization is calculated as current CPU usage / current CPU limit. This value is shown as a percentage for each instance on the individual instance's dashboard.
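As a rough illustration, utilization as a percentage is usage divided by limit, multiplied by 100. The function name and the choice of units (both values in the same unit, such as cores) are assumptions.

```python
def cpu_utilization_percent(cpu_usage, cpu_limit):
    """Share of the CPU limit currently in use, as a percentage."""
    if cpu_limit <= 0:
        raise ValueError("cpu_limit must be positive")
    return cpu_usage / cpu_limit * 100
```

An instance using 0.5 cores against a 2-core limit is at 25% utilization.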
RAM Threshold
RAM threshold is measured by actual usage of RAM. The threshold is set as an amount of RAM used, but the value compared against it when deciding whether a scale event is needed is still the average usage over the window.
Network Requests
The network requests threshold is triggered when the total number of requests received by the instance exceeds the configured value, averaged over the window.
Network Throughput
The network throughput threshold is based on how much traffic is received (measured in bits per second) over a given veth interface. A private checkbox selects the interface: when checked, traffic is measured over eth-priv; when unchecked, over eth-pub.
Please note that all traffic coming from the load balancer to the instance will come to the container over eth-priv and all egress traffic from the container goes over eth-pub.
Network Connections
The network connections threshold is triggered when the total number of unique connections to the instance exceeds the configured value, averaged over the window.
Custom
For users who prefer maximum flexibility, the custom threshold is available. With this threshold, the user supplies a webhook URL and the platform makes GET requests to that endpoint. The cadence of the requests is half the current auto-scaling window.
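The cadence calculation is simple enough to show directly. The function name is hypothetical.

```python
from datetime import timedelta


def webhook_interval(window):
    """Time between webhook requests: half the auto-scaling window."""
    return window / 2
```

A 5m window works out to a request every 2 minutes 30 seconds.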
The platform expects a response from the request in the format:
{
  "instances": 5
}

Where 5 can be any integer and is used to determine how many instances the container should have.
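A minimal sketch of a web service that answers the GET request in this format, using only the Python standard library, might look like the following. The `desired_instances` function is a hypothetical placeholder for the user's own logic, and the 200 status, `Content-Type` header, and port are assumptions rather than documented requirements.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def desired_instances():
    """Hypothetical placeholder: replace with your own scaling logic."""
    return 5


def response_body(instances):
    """Encode the response in the expected {"instances": N} shape."""
    return json.dumps({"instances": int(instances)}).encode("utf-8")


class ScaleHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = response_body(desired_instances())
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the webhook service until interrupted."""
    HTTPServer(("", port), ScaleHandler).serve_forever()
```

Calling `serve()` starts the service; the webhook URL configured on the threshold would then point at wherever this is reachable.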
Minimum TTL
With great power over the instances comes some responsibility. If the user responds to a webhook request with 10, and then 5 minutes later with 2, the containers that were created still have to live through their minimum TTL before being removed.
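The effect of the minimum TTL on a scale-down can be sketched as a filter over instance ages. The function name and the input shape (a mapping from instance ID to creation time) are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def removable_instances(created_at, min_ttl, now):
    """IDs of scaled instances that have outlived the minimum TTL."""
    return [iid for iid, created in created_at.items() if now - created >= min_ttl]
```

If instances were created at 12:00 with a 10-minute minimum TTL, a response of 2 at 12:05 finds nothing removable yet; the same instances become eligible at 12:10.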
Custom Webhook Request Body
When constructing the web service that will handle the webhook, the user can expect the following query data to be sent along with the request:
- hub_id
- cluster
- environment_id
- container_id
No telemetry data is sent with the webhook.
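Reading these fields on the receiving side can be sketched with the standard library's query-string parser. The function name, the example path, and the example values are hypothetical; only the four field names come from the list above.

```python
from urllib.parse import parse_qs, urlparse

EXPECTED_KEYS = ("hub_id", "cluster", "environment_id", "container_id")


def parse_webhook_query(url_or_path):
    """Extract the expected query fields; missing fields come back as None."""
    query = parse_qs(urlparse(url_or_path).query)
    return {key: query.get(key, [None])[0] for key in EXPECTED_KEYS}
```

For example, `parse_webhook_query("/scale?hub_id=h1&cluster=c1&environment_id=e1&container_id=k1")` returns a dictionary with those four values. Since no telemetry is included, any usage data the service needs has to come from elsewhere.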