Essentially, we divide Autoscalers into two types.
- Vertical - adds computing resources (RAM and CPU only) on the fly, thereby changing the instance type. Upscaling usually takes place without restarting the instance, whereas downscaling requires a restart. During the restart the application may be unavailable, but downscaling can be scheduled for specific hours only. Technically, we can also disable the automatic reclaiming of resources altogether, if you prefer to decide yourself when the instance returns to its original configuration.
- Horizontal - clones the instance (creates new ones) when a single instance cannot be given more CPU / RAM (for example, you can specify that the source instance may not consume more than 16 GB of memory). As a result, a service running on a single instance is first scaled to two instances, then three, and so on, up to the configured limit. Most importantly, the horizontal Autoscaler works in conjunction with load balancing (distributing incoming traffic across the instances). So if our service runs on two or more instances, restarting or modifying any one of them does not affect the availability of the service. What's more, you can configure the load balancer to check every 10 seconds whether our application is up and running on every instance; if any instance responds incorrectly, the load balancer stops directing traffic to it.
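The health-check behaviour described above can be illustrated with a minimal sketch. It is not the platform's actual load balancer: the `Instance` and `LoadBalancer` classes, the `probe` callable and the round-robin policy are all assumptions made for illustration. The only detail taken from the text is that instances failing their periodic check stop receiving traffic.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Instance:
    name: str
    healthy: bool = True  # result of the most recent health check


class LoadBalancer:
    """Hypothetical round-robin balancer that skips unhealthy instances."""

    def __init__(self, instances: Iterable[Instance],
                 probe: Callable[[Instance], bool]) -> None:
        self.instances = list(instances)
        self.probe = probe  # assumed callable: returns True if the app answers correctly
        self._next = 0      # round-robin cursor

    def run_health_checks(self) -> None:
        # Meant to be called periodically (the text mentions every 10 seconds).
        for inst in self.instances:
            inst.healthy = self.probe(inst)

    def route(self) -> Instance:
        # Try each instance at most once, starting at the cursor.
        n = len(self.instances)
        for offset in range(n):
            inst = self.instances[(self._next + offset) % n]
            if inst.healthy:
                self._next = (self._next + offset + 1) % n
                return inst
        raise RuntimeError("no healthy instances available")


# Usage: app-2 answers incorrectly, so it receives no traffic.
failing = {"app-2"}
lb = LoadBalancer(
    [Instance("app-1"), Instance("app-2"), Instance("app-3")],
    probe=lambda inst: inst.name not in failing,
)
lb.run_health_checks()
targets = [lb.route().name for _ in range(4)]
print(targets)  # ['app-1', 'app-3', 'app-1', 'app-3']
```

With at least two healthy instances behind such a balancer, one of them can be restarted or resized while the others keep serving requests.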
The vertical Autoscaler reacts within at most 40 minutes of an instance boot / restart / start and works in 20-minute intervals: every 20 minutes it can add resources to the instance, based on memory and CPU usage reports (early response). The horizontal Autoscaler reacts every 20 minutes.
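To make the interplay of the two mechanisms more concrete, here is a simplified simulation of one decision per 20-minute check. It is only a sketch under stated assumptions: the 20-minute interval and the 16 GB per-instance cap come from the text, while the 80% utilisation threshold, the 4 GB step, the limit of 4 instances and the choice to merge both Autoscalers into a single `decide` function are hypothetical. Downscaling and the 40-minute post-boot window are not modelled.

```python
from dataclasses import dataclass
from typing import List, Tuple

CHECK_INTERVAL_MIN = 20        # interval stated in the text
MAX_RAM_GB_PER_INSTANCE = 16   # example cap from the text
MAX_INSTANCES = 4              # hypothetical instance limit
RAM_STEP_GB = 4                # hypothetical size of one vertical step
SCALE_UP_THRESHOLD = 0.8       # hypothetical utilisation threshold


@dataclass
class ServiceState:
    ram_gb: int
    instances: int = 1


def decide(state: ServiceState, ram_utilisation: float) -> str:
    """Apply at most one scaling action for a single usage report."""
    if ram_utilisation < SCALE_UP_THRESHOLD:
        return "none"
    # Vertical first: grow the instance while it stays under the cap.
    if state.ram_gb + RAM_STEP_GB <= MAX_RAM_GB_PER_INSTANCE:
        state.ram_gb += RAM_STEP_GB
        return "vertical"
    # Otherwise horizontal: clone the instance, up to the set limit.
    if state.instances < MAX_INSTANCES:
        state.instances += 1
        return "horizontal"
    return "at-limit"


def simulate(reports: List[float], state: ServiceState) -> List[Tuple[int, str]]:
    """Return (minute, action) pairs, one per 20-minute usage report."""
    return [
        (tick * CHECK_INTERVAL_MIN, decide(state, utilisation))
        for tick, utilisation in enumerate(reports, start=1)
    ]


state = ServiceState(ram_gb=8)
timeline = simulate([0.9, 0.95, 0.5, 0.9], state)
print(timeline)
# [(20, 'vertical'), (40, 'vertical'), (60, 'none'), (80, 'horizontal')]
```

In this run the instance grows from 8 GB to 16 GB in two vertical steps, and only once the per-instance cap is reached does the next high-usage report add a second instance.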
In summary, combining both mechanisms ensures that your service remains available at all times, regardless of whether resources are being added or removed.