Scaling Applications

Applications experience varying levels of load and traffic. Sometimes requests become so numerous that an application struggles to respond, and users start seeing errors. Scaling prevents these disruptions: you either give the application more resources or run more instances of it and distribute the load among them.

Traffic growth may also be temporary and occur only during specific hours of the day. In such cases, scaling manually is time-consuming and error-prone, while leaving extra resources allocated during low-load periods incurs unnecessary cost. Automatic scaling in ArvanCloud Container addresses both problems.

Generally, there are two types of scalability in infrastructure:

  • Vertical: Vertical scaling allows you to adjust application resources (RAM / CPU / Ephemeral Storage).
  • Horizontal: Horizontal scaling enables you to adjust the number of application instances.

Vertical Scaling

If you need to increase or decrease resources such as RAM, CPU, or Ephemeral Storage for an application, you can click on the application name in the user panel and go to the "Settings" section.

On this page, under "Vertical Scaling", specify the required resources and click the apply button at the bottom.

Note that this process does not cause any interruptions in the application's operation.

Horizontal Scaling

Horizontal Scaling in ArvanCloud Container, which is equivalent to HPA (Horizontal Pod Autoscaler) in Kubernetes, lets you increase the number of application pods so that incoming load is balanced across them. You can create multiple instances of your application with just a few clicks, either manually or automatically.

Please note that this feature is only applicable to stateless applications. If your application has Persistent Storage enabled, you cannot use manual or automatic scaling.

To change the Horizontal Scaling settings, click on the application name and go to the "Scaling" section.

Manual Scaling

With manual scaling, you create a specified number of identical application instances. Load balancing is enabled on them automatically, so the load is distributed roughly equally among the pods. The number of instances can be set and changed at any time. Manual scaling is recommended for applications whose load is typically consistent.
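To illustrate what "roughly equal load distribution" means, here is a minimal Python sketch. It assumes a simple round-robin policy; the balancing algorithm actually used by ArvanCloud Container is not stated here, and the function name and pod names are hypothetical.

```python
from collections import Counter
from itertools import cycle


def distribute_round_robin(request_ids, pod_names):
    """Assign each request to a pod in turn (assumed round-robin policy)."""
    if not pod_names:
        # Zero pods means there is nothing left to serve requests.
        raise ValueError("at least one pod is required")
    pods = cycle(pod_names)
    return {request_id: next(pods) for request_id in request_ids}


# Ten requests spread over three identical (hypothetical) pods.
assignment = distribute_round_robin(range(10), ["pod-a", "pod-b", "pod-c"])
per_pod = Counter(assignment.values())
print(dict(per_pod))
```

With ten requests and three pods, no pod receives more than one request above any other, which is the sense in which the distribution is "roughly equal".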

Note that reducing the number of pods to zero will deactivate the application and make it unavailable.

Automatic Scaling

With automatic scaling, identical pods are created automatically as the application's CPU usage increases, and the load is distributed among them. In effect, you specify the maximum number of pods that may run during peak traffic. For example, if CPU consumption reaches the specified level (and the maximum number of pods defined is more than 2), an additional pod is added. This process continues until the pod limit is reached.
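Because this feature is described as equivalent to HPA in Kubernetes, the replica calculation can be sketched with the formula documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current pods × current CPU / target CPU). The Python below is only an illustration under that assumption. The function name, parameter names, and the 10% tolerance (the Kubernetes default) are not taken from ArvanCloud's documentation.

```python
import math


def desired_pod_count(current_pods, current_cpu_percent, target_cpu_percent,
                      min_pods=1, max_pods=10, tolerance=0.1):
    """Estimate the pod count using the Kubernetes HPA formula (assumed here).

    current_cpu_percent is the average CPU usage across the running pods;
    target_cpu_percent is the CPU threshold configured for automatic scaling.
    """
    ratio = current_cpu_percent / target_cpu_percent
    if abs(ratio - 1.0) <= tolerance:
        # Usage is close enough to the target: keep the current count.
        desired = current_pods
    else:
        desired = math.ceil(current_pods * ratio)
    # Never go below the minimum or above the configured pod limit.
    return max(min_pods, min(max_pods, desired))


# Two pods averaging 90% CPU with a 60% threshold and a limit of five pods.
print(desired_pod_count(2, 90, 60, max_pods=5))
```

In this sketch, two pods at 90% average CPU against a 60% threshold scale out to three pods, and the result is capped at the configured limit no matter how high the CPU usage climbs.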

First, specify the CPU usage threshold. This value determines the CPU usage level at which additional pods are created and the load is balanced across them. Then set the number of pods, which limits how many instances of the application can be created.

After you click apply, identical instances of the application are created automatically whenever CPU load reaches the specified threshold, and the load is distributed among them to prevent performance problems caused by resource shortages.