Cloud applications are usually composed of a set of components (microservices) that may run on different virtual and/or physical machines.
To achieve the desired level of performance, availability, scalability, and robustness in this kind of system, it is necessary to describe and maintain a complex set of infrastructure configurations.
Another approach is to use a Distributed Virtualization System (DVS), which provides a transparent mechanism that each component can use to communicate with the others regardless of their location, thus avoiding the potential problems and complexity added by their distributed execution. This communication mechanism already offers useful features for developing distributed applications, such as replication support for high-availability and performance requirements.
When a cluster of backend servers runs the same set of services for many clients, it needs to present a single entry point to them. In general, an application proxy is used to meet this requirement, with auto-scaling and load-balancing features added. Auto-scaling is the mechanism that dynamically monitors the load of the cluster nodes, creating new server instances when the load exceeds an upper CPU-usage threshold and removing server instances when the load falls below a lower CPU-usage threshold. Load balancing is a related mechanism that distributes the load among server instances so that no instance becomes saturated while others sit idle. Both mechanisms help provide better performance and availability for critical services.
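The two mechanisms above can be sketched as follows. This is a minimal illustration under assumptions, not the DVS implementation described in this article: the threshold values, the instance naming scheme, and the ClusterProxy class are all hypothetical, and real proxies would measure CPU load from the nodes rather than receive it as a parameter.

```python
# Illustrative sketch (assumptions, not the DVS implementation):
# a threshold-based auto-scaler combined with round-robin load balancing.
from itertools import cycle

HIGH_CPU = 0.80      # hypothetical upper threshold: add an instance above this
LOW_CPU = 0.20       # hypothetical lower threshold: remove an instance below this
MIN_INSTANCES = 1    # never scale below one running instance

class ClusterProxy:
    def __init__(self, instances):
        self.instances = list(instances)   # names of running server instances
        self._rr = cycle(self.instances)   # round-robin iterator over instances

    def autoscale(self, avg_cpu):
        """Add or remove one instance based on the average CPU load."""
        if avg_cpu > HIGH_CPU:
            self.instances.append(f"srv{len(self.instances)}")
        elif avg_cpu < LOW_CPU and len(self.instances) > MIN_INSTANCES:
            self.instances.pop()
        self._rr = cycle(self.instances)   # rebuild round-robin order
        return list(self.instances)

    def route(self):
        """Return the next instance in round-robin order for a request."""
        return next(self._rr)
```

For example, a cluster starting with one instance grows to two when the observed load exceeds the upper threshold, requests then alternate between the two instances, and the cluster shrinks back when the load drops below the lower threshold.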
This article describes the design, implementation, and testing of a service proxy with auto-scaling and load-balancing features in a DVS.