You’ve built a web application, started getting regular customers, and now have a full-fledged product. Then the app takes off: thousands of visitors arrive at once, and at some point they can no longer use it.

You tested your app and it worked fine. So what happened?

This is not a bug but a problem of scalability. Your cloud architecture is not designed to scale with increasing load.

Many startups focus more on features than on scalability, yet building applications that are both resilient and scalable is an essential part of any application architecture.

What is a Scalable Application?

Scalability is the ability of a system to deliver reasonable performance under growing demand (larger data sets, higher request rates, or a combination of size and velocity). A scalable app should work as well with 1 user as with 1 million users and handle spikes in traffic automatically. By adding and removing resources only when needed, scalable apps consume only the resources necessary to meet demand.

When talking about scalability in cloud computing, there are 2 ways to go about it – horizontal or vertical.

Vertical scaling (Scaling Up):

Scaling up, or vertical scaling, means maximizing the resources of a single unit so that it can handle increasing load. In hardware terms, this means adding processing power and memory to the physical machine running the server. In software terms, scaling up may include optimizing algorithms and application code.

Horizontal scaling (Scaling Out):

Scaling out, or horizontal scaling, means adding more units to the app’s cloud architecture: several units of smaller capacity instead of a single unit of larger capacity. Requests are then spread across multiple units, which reduces the load on any single machine.
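
To make the scale-out idea concrete, here is a minimal capacity sketch. The function name `nodes_needed`, the per-node throughput figure, and the 25% headroom are all assumptions chosen for illustration; real numbers would come from load testing your own app.

```python
import math

def nodes_needed(peak_rps: float, rps_per_node: float, headroom: float = 0.25) -> int:
    """Estimate how many equal-sized nodes are needed to serve peak_rps.

    headroom is the spare capacity kept in reserve (0.25 means 25% extra).
    All inputs are hypothetical figures you would measure yourself.
    """
    if rps_per_node <= 0:
        raise ValueError("rps_per_node must be positive")
    required = peak_rps * (1 + headroom) / rps_per_node
    # Always keep at least one node running, even with no traffic.
    return max(1, math.ceil(required))

# Example: 1,000 requests/second at peak, each small node handles about 300.
print(nodes_needed(1000, 300))
```

Scaling up would instead mean raising `rps_per_node` by moving to a bigger machine; scaling out keeps the node size fixed and changes the count.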

Scalable Web Architecture:

Whether you are looking to scale horizontally or vertically, a scalable web architecture is the foundation you need. It enables web applications to adapt to user demand and offer the best performance.

All the components in a monolithic architecture are tightly coupled. So, a single failure in any of the several layers can cause the entire application to fail. On the other hand, scalable web architecture requires a loosely coupled structure of different modules interacting through lightweight mechanisms. 

A typical example of a scalable web architecture is one using the MERN stack. It can offer vital benefits like scalability, high availability, loosely coupled (fault-tolerant) services, and distributed computing.

Scalability- A scalable web architecture like MERN can enable horizontal scalability. The distributed system can quickly expand or contract resource pools as per scaling needs by adding or removing nodes.

High Availability- Higher uptime and lower downtime are essential to any business. Especially for SaaS applications, an hour’s downtime can lead to massive revenue loss. A scalable web architecture needs consideration for redundancy of critical components and rapid disaster recovery even in a partial system failure to ensure higher availability. 

Fault-tolerant- There is no single point of failure, and the system should be able to run efficiently even if a component fails. 
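
The effect of redundancy on availability can be estimated with simple arithmetic. The sketch below assumes that copies of a component fail independently of each other, which is an idealization; the function names and the 99% figure are hypothetical.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of a component deployed as `copies` redundant instances.

    Assumes independent failures: the group is down only when every copy
    is down at the same time.
    """
    if not 0.0 <= single <= 1.0 or copies < 1:
        raise ValueError("single must be in [0, 1] and copies >= 1")
    return 1 - (1 - single) ** copies

def downtime_hours_per_year(availability: float) -> float:
    """Expected downtime in hours over a 365-day year."""
    return (1 - availability) * 365 * 24

one = combined_availability(0.99, 1)
two = combined_availability(0.99, 2)
print(round(downtime_hours_per_year(one), 1), "hours/year with one instance")
print(round(downtime_hours_per_year(two), 1), "hours/year with two instances")
```

Under these assumptions, a second redundant instance cuts expected downtime by roughly two orders of magnitude, which is why removing single points of failure matters so much.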

A typical scalable web architecture will have four key layers:

  1. Web servers
  2. Database servers
  3. Load balancers
  4. Shared file servers

Each of these layers is scaled independently, with the database layer being the toughest to scale. Master-slave replication is a common way to scale databases efficiently. The master node is a powerful machine that handles both reads and writes, whereas a slave node only serves reads. A load balancer then distributes read traffic across the slave nodes, while writes go to the master.
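
A minimal sketch of the read/write split that master-slave replication implies: writes always go to the master, and reads rotate across the slaves. The class name, node names, and the "starts with SELECT" rule are assumptions for illustration; production setups usually rely on a database proxy or driver support rather than string inspection.

```python
import itertools

class ReplicatedRouter:
    """Pick a database node for each statement (illustrative only)."""

    def __init__(self, master: str, replicas: list[str]):
        self.master = master
        # With no replicas configured, reads fall back to the master.
        self._read_cycle = itertools.cycle(replicas or [master])

    def route(self, sql: str) -> str:
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            # Round-robin reads across the read-only nodes.
            return next(self._read_cycle)
        # INSERT, UPDATE, DELETE and anything unknown go to the master.
        return self.master

router = ReplicatedRouter("db-master", ["db-replica-1", "db-replica-2"])
for statement in ["SELECT * FROM users", "INSERT INTO users VALUES (1)", "SELECT 1"]:
    print(statement, "->", router.route(statement))
```

One design caveat: replicas can lag behind the master, so reads that must see a just-written value may still need to be sent to the master.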

For optimal performance, the caching hierarchy is important. The application’s local cache can handle common tasks like bootstrapping, while application server caches can handle large, complex queries.
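
The caching hierarchy can be sketched as a two-level lookup: check the local cache first, then a shared cache, and only then the slow source. Here both levels are plain dictionaries, and `TwoLevelCache` and `slow_query` are hypothetical names; in practice the shared level would be a separate cache server.

```python
class TwoLevelCache:
    """Local cache in front of a shared cache in front of a slow loader."""

    def __init__(self, load_from_source):
        self.local = {}    # per-process cache, fastest to reach
        self.shared = {}   # stand-in for a cache shared by all app servers
        self.load_from_source = load_from_source

    def get(self, key):
        if key in self.local:
            return self.local[key]
        if key in self.shared:
            value = self.shared[key]
        else:
            # Miss at both levels: run the expensive query once.
            value = self.load_from_source(key)
            self.shared[key] = value
        self.local[key] = value
        return value

calls = []

def slow_query(key):
    calls.append(key)          # record each trip to the "database"
    return key.upper()

cache = TwoLevelCache(slow_query)
cache.get("report")
cache.get("report")
print(len(calls))  # the source was queried only once
```

This sketch leaves out expiry and invalidation, which any real cache needs so that stale values are eventually refreshed.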

Further, job queues are a great way to keep CPU-intensive operations, such as complex data processing, under control. Finally, a Content Delivery Network (CDN) can be leveraged for static public content, and the system design should be fault-tolerant.
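
As a rough illustration of a job queue, the sketch below uses Python's standard `queue` and `threading` modules with a fixed pool of workers, so heavy jobs wait in line instead of all running at once. The `run_jobs` helper is hypothetical, and a real deployment would more likely use a dedicated queue service with workers on separate machines.

```python
import queue
import threading

def run_jobs(jobs, handler, workers=2):
    """Process (job_id, payload) pairs with a fixed number of worker threads."""
    pending = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = pending.get()
            if item is None:           # sentinel: no more work for this worker
                pending.task_done()
                break
            job_id, payload = item
            try:
                outcome = handler(payload)
            except Exception as exc:   # keep the worker alive on a bad job
                outcome = exc
            with lock:
                results[job_id] = outcome
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for job in jobs:
        pending.put(job)
    for _ in threads:
        pending.put(None)              # one sentinel per worker
    pending.join()
    for t in threads:
        t.join()
    return results

print(run_jobs([(1, 3), (2, 4)], lambda n: n * n))
```

The number of workers caps how much CPU the background work can consume, which is the control the queue provides.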

Similarly, when it comes to cloud-based scalable architectures, Amazon Web Services and Google Cloud Platform are popular choices. Beyond application scalability, they also offer several service integrations that enhance functionality.

As mentioned earlier, a scalable application needs different layers to scale independently rather than as a stack of tightly coupled components.

Characteristics of the Scalable Application:

What does scalability look like? There are some areas where an app needs to excel to be considered scalable.


Performance:

First and foremost, the application must operate well under stress with low latency. The speed of a website affects usage and user satisfaction, as well as search engine rankings, a factor that directly correlates to revenue and retention. As a result, creating a scalable web application architecture that is optimized for fast responses and low latency is key.

Availability and Reliability:

These are closely related and equally necessary. Scalable apps rarely, if ever, go down under stress. They need to reliably produce data upon request and not lose stored information.


Manageability:

The manageability of the cloud architecture equates to the scalability of operations: maintenance and updates. Things to consider for manageability are the ease of diagnosing and understanding problems when they occur, the ease of making updates or modifications, and how simple the system is to operate (i.e., does it routinely operate without failure or exceptions?).


Cost:

Highly scalable applications don’t have to be unreasonably expensive to build, maintain, or scale. Planning for scalability during development allows the app to expand as demand increases without causing undue expenses.

Steps to build a scalable application based on increasing users from 1 to 1 million:

  • Initial Setup of Cloud Architecture: The start can be as simple as deploying an application in a box.
  • Create multiple hosts and choose the database: Choose the database compatible with your tech stack. It’s advisable to start with SQL if users are increasing and generating data.
  • Store database on cloud to ease the operations: When users increase to 100, database deployment is the first thing that needs to be done. There are two general options for deploying a database: the first is to use a managed database service, and the second is to host your own database software on the cloud.
  • Create multiple availability zones to improve availability: With the current architecture, you may face availability issues: if the host for your web app fails, the app may go down. So you need another web instance in another Availability Zone, where you also place the slave database for RDS.
  • Move static content to object-based storage for better performance: To improve performance and efficiency, you’ll need to add more read replicas to RDS. This will take load off the write master database. Furthermore, you can reduce the load on web servers by moving static content to object storage and serving it through a service such as Amazon CloudFront.
  • Set up Auto Scaling to meet varying demand automatically: Auto Scaling enables “just-in-time provisioning,” allowing you to scale infrastructure dynamically as load demands. It can launch or terminate EC2 instances automatically based on spikes in traffic, so you pay only for the resources needed to handle the load.
  • Use Service Oriented Architecture (SOA) for better flexibility: To serve more than 1 million users, you need to use Service Oriented Architecture (SOA) when designing large-scale web applications. In SOA, each component is separated from its tier and turned into its own service. The individual services can then be scaled independently, since the web and application tiers have different resource requirements and different services.
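
The Auto Scaling step above boils down to a decision rule that turns a load metric into an instance count. The sketch below uses a simple proportional rule with hypothetical defaults (a 60% CPU target and a range of 1 to 20 instances). It is not the AWS Auto Scaling API, only an illustration of the kind of calculation such a service performs.

```python
import math

def desired_instances(current: int, cpu_percent: float,
                      target_percent: float = 60.0,
                      minimum: int = 1, maximum: int = 20) -> int:
    """Suggest an instance count that brings average CPU near the target.

    Proportional rule: if CPU is 1.5x the target, run 1.5x the instances.
    The result is clamped to the [minimum, maximum] range.
    """
    if current < 1 or target_percent <= 0:
        raise ValueError("current must be >= 1 and target_percent > 0")
    wanted = math.ceil(current * cpu_percent / target_percent)
    return max(minimum, min(maximum, wanted))

print(desired_instances(4, 90))   # traffic spike: scale out
print(desired_instances(4, 30))   # quiet period: scale in
```

The minimum and maximum bounds matter as much as the rule itself: the floor protects availability, and the ceiling protects the bill.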


The decision about how to approach scaling should be made upfront because you never know when you are going to get popular! Also, crashing (or even just slow) pages leave your users unhappy and your app with a bad reputation. It ultimately affects your revenue.

Learning how to build scalable websites takes time, a lot of practice, and a good amount of money. For companies that need this and are struggling to get it done, hiring certified technical architects from V2STech is your best bet for such a project.

Source article: This article first appeared on blog.
