A Virtual Machine Scale Set (VMSS) lets us easily create and manage a group of virtual machines as a single resource. The number of VM instances can automatically increase or decrease in response to demand or on a defined schedule. Scale sets provide high availability for your applications and let you centrally manage, configure, and update a set of VMs. By default, the individual virtual machines deployed as part of a scale set are not assigned their own public IP addresses.
Important points of Virtual Machine Scale Set (VMSS)
A Virtual Machine Scale Set lets us easily create and manage multiple virtual machines.
All VM instances are created from the same base OS image and configuration.
Scale sets are used to run multiple instances of your application.
It helps you to easily manage multiple VMs without extra configuration tasks.
It also minimizes the number of unnecessary VM instances that run your application when demand is low.
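Scale sets support up to 1,000 VM instances for standard marketplace images and for custom images published through Azure Compute Gallery. A scale set created from a managed image supports up to 600 VM instances.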
Autoscale creates or removes resources automatically based on the load on your services, so you can design an architecture that adds or removes capacity without manual intervention. Autoscaling is configured as either time-based or metric-based. In both cases we define rules and the actions that are triggered when the condition in a rule is matched.
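The idea of a rule with a condition and an action can be sketched in a few lines of Python. This is only an illustration of the concept, not the Azure autoscale API: the `ScaleRule` class, the `apply_rules` function, the metric name `cpu_percent`, and the thresholds are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleRule:
    """Hypothetical autoscale rule: a condition on a metric plus an action."""
    metric: str        # name of the metric to watch (illustrative)
    operator: str      # "gt" (greater than) or "lt" (less than)
    threshold: float   # value the metric is compared against
    change: int        # instances to add (positive) or remove (negative)


def apply_rules(rules, metrics, current_instances):
    """Return the instance count after applying the first matching rule."""
    for rule in rules:
        value = metrics[rule.metric]
        if rule.operator == "gt":
            matched = value > rule.threshold
        else:
            matched = value < rule.threshold
        if matched:
            return current_instances + rule.change
    return current_instances  # no condition matched, so nothing changes


# Example thresholds are assumptions, not recommended values.
RULES = [
    ScaleRule("cpu_percent", "gt", 70.0, +1),  # add an instance under load
    ScaleRule("cpu_percent", "lt", 30.0, -1),  # remove an instance when idle
]

print(apply_rules(RULES, {"cpu_percent": 85.0}, 3))  # 4
```

Keeping a gap between the two thresholds (70 and 30 here) avoids adding and removing instances repeatedly when the load hovers around a single value.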
Scaling can be categorised into two types:
Horizontal Scaling: Adding or removing instances of a resource, also known as scaling out and scaling in. Running several instances also helps keep the service highly available.
Vertical Scaling: Adding more capacity, such as CPU or RAM, to the existing systems to meet the requirement, also known as scaling up and scaling down.
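The two types of scaling can be contrasted with a small model. This is a sketch for illustration only: the `Fleet` class and the `scale_out` and `scale_up` helpers are hypothetical names, and the numbers are arbitrary.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Fleet:
    """Hypothetical group of identical servers."""
    instances: int
    vcpus_per_instance: int
    memory_gib_per_instance: int

    @property
    def total_vcpus(self):
        return self.instances * self.vcpus_per_instance


def scale_out(fleet, extra_instances):
    """Horizontal scaling: more servers, each the same size as before."""
    return replace(fleet, instances=fleet.instances + extra_instances)


def scale_up(fleet, vcpus, memory_gib):
    """Vertical scaling: the same servers, each with more resources."""
    return replace(
        fleet,
        vcpus_per_instance=vcpus,
        memory_gib_per_instance=memory_gib,
    )


fleet = Fleet(instances=2, vcpus_per_instance=4, memory_gib_per_instance=16)
print(scale_out(fleet, 2).total_vcpus)     # 16, from 4 servers with 4 vCPUs
print(scale_up(fleet, 8, 32).total_vcpus)  # 16, from 2 servers with 8 vCPUs
```

Both calls reach the same total capacity, but only the first changes the number of servers.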
Difference between Horizontal and Vertical Scaling
Definition: Horizontal scaling adds new servers to the existing system. Vertical scaling adds new resources to the existing system.
Upgrades: Horizontal scaling is easier to upgrade. Vertical scaling is harder to upgrade and may involve downtime.
Implementation: Horizontal scaling is difficult to implement. Vertical scaling is easy to implement.
Cost: Horizontal scaling is expensive, as we add a new server that contains a lot of resources. Vertical scaling is cheaper, as we just need to add new resources.
Time: Horizontal scaling takes more time to carry out. Vertical scaling takes less time.
Metrics for Autoscaling
Compute metrics: The available metrics depend on the installed operating system.
Web Apps metrics: These include CPU and memory percentage, disk and HTTP queue length, and bytes received/sent.
Storage/Service Bus metrics: You can scale by Storage queue length, which is the number of messages in the storage queue. Storage queue length is a special metric, and the threshold is the number of messages per instance.
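Because the Storage queue length threshold is expressed per instance, the comparison involves a division by the current instance count. The arithmetic can be sketched as follows; the function names and the numbers are assumptions for illustration and do not come from any Azure SDK.

```python
import math


def messages_per_instance(queue_length, instances):
    """Queue length divided across the instances currently running."""
    return queue_length / instances


def should_scale_out(queue_length, instances, threshold_per_instance):
    """True when each instance has more messages than the threshold."""
    return messages_per_instance(queue_length, instances) > threshold_per_instance


def instances_needed(queue_length, threshold_per_instance):
    """Smallest instance count that keeps each instance at or under the threshold."""
    return max(1, math.ceil(queue_length / threshold_per_instance))


# 500 queued messages on 2 instances is 250 per instance,
# which is above a threshold of 100 messages per instance.
print(should_scale_out(500, 2, 100))  # True
print(instances_needed(500, 100))     # 5
```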
You can create custom autoscaling metrics based on your situation. Rule types include:
Minimum Instances: The minimum number of instances you want to deploy in your scale set.
Maximum Instances: The maximum number of instances you want to deploy while scaling out.
Metric-based: Measures application load and adds or removes VMs based on that load.
Time-based: Scales on a schedule, for example adding an instance at 10 am PT every Monday.
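How these rule types might interact can be sketched as below. This is a simplified model, not the behaviour of the Azure autoscale engine: it assumes the schedule takes priority during the scheduled hour, that the result is always kept between the minimum and the maximum, and that `now` is already expressed in PT. All function names are hypothetical.

```python
from datetime import datetime


def clamp(desired, minimum, maximum):
    """Keep the instance count within the configured bounds."""
    return max(minimum, min(desired, maximum))


def in_schedule(now, weekday=0, hour=10):
    """True during the scheduled hour (weekday 0 is Monday)."""
    return now.weekday() == weekday and now.hour == hour


def target_instances(now, metric_desired, minimum, maximum, scheduled_count):
    """Pick the scheduled count during the schedule, else the metric-based one."""
    desired = scheduled_count if in_schedule(now) else metric_desired
    return clamp(desired, minimum, maximum)


monday_10am = datetime(2024, 1, 1, 10, 15)  # 1 January 2024 was a Monday
print(target_instances(monday_10am, metric_desired=3,
                       minimum=2, maximum=10, scheduled_count=6))  # 6
```

Outside the scheduled hour the same call falls back to the metric-based count, still limited by the minimum and maximum.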
A scale set has a “scale set model” that defines the attributes of virtual machine instances (size, number of data disks, etc.). As the number of instances in the scale set changes, new instances are added based on the scale set model. Scale sets offer two orchestration modes:
Uniform: Optimized for large-scale stateless workloads with identical instances.
Flexible: Achieves high availability at scale with identical or multiple virtual machine types.
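The role of the scale set model can be illustrated with a toy example in which every new instance copies its attributes from the model. The `ScaleSetModel` and `ScaleSet` classes and the instance naming are hypothetical, and the VM size is only an example value; none of this is the Azure SDK.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ScaleSetModel:
    """Hypothetical model: the attributes every new instance receives."""
    vm_size: str
    data_disk_count: int


@dataclass
class ScaleSet:
    model: ScaleSetModel
    instances: list = field(default_factory=list)

    def resize(self, capacity):
        """Add instances built from the model, or drop the newest ones."""
        while len(self.instances) < capacity:
            self.instances.append({
                "name": f"vm-{len(self.instances)}",
                "vm_size": self.model.vm_size,
                "data_disk_count": self.model.data_disk_count,
            })
        del self.instances[capacity:]


scale_set = ScaleSet(ScaleSetModel(vm_size="Standard_D2s_v3", data_disk_count=2))
scale_set.resize(3)
print([vm["name"] for vm in scale_set.instances])  # ['vm-0', 'vm-1', 'vm-2']
```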