
Optimizing Your Web Viewer for Large-Scale Deployments

by | Jun 24, 2020

Many of Snowbound’s customers are enterprises that serve thousands or even tens of thousands of clients. Many variables can affect the performance they achieve, including:

  • Size and type of documents
  • Number of simultaneous clients
  • Server configuration
  • Interaction with other applications
  • How the systems are provisioned and tuned

This article discusses tuning techniques for configurations that include multiple servers, a load balancer, VMware, and the Tomcat application server.


VirtualViewer is highly performant, but as with any web application, there is a point of scale at which a single application server instance will not be enough to satisfy concurrent user workloads. In this example, we will explore the performance gains and challenges involved with the use of a load balanced configuration for VirtualViewer.

Load balancing an application like VirtualViewer is a complex task. There are many options to consider when deploying to production, and no single solution is best for everyone. Variables such as the types of documents viewed and the specifications of the application server systems in the cluster will affect which strategy is best for you.

Using NGINX, as we do in this example, is a simple way to see significant performance gains when requests come in concurrently. While it is not a perfect solution, testing shows a clear benefit for a minimal investment of deployment time.

Test Environment


1x Load Balancer

  • VMWare Virtual Machine
  • 2 vCPU
  • 2GB of RAM

4x Application Servers

  • VMWare Virtual Machine
  • 2 vCPU
  • 8GB of RAM
  • Tomcat 8 w/ VirtualViewer Java installed

VirtualViewer Configuration

The configuration of VirtualViewer was not changed in any way after installation of the 4.8.2 evaluation version.

Tomcat has been tuned to adjust the JVM maximum heap size up to 2GB, as the default 64MB is not sufficient for larger images. This is done by inserting the following into /usr/share/tomcat8/bin/
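As a hedged sketch of that change (the setenv.sh filename and exact option are assumptions, since the path above is truncated and the original snippet is not shown here), the heap can be raised via Tomcat's optional environment script:

```shell
# Hypothetical setenv.sh in Tomcat's bin directory (filename assumed).
# CATALINA_OPTS is read by Tomcat's startup scripts; -Xmx2048m raises
# the JVM maximum heap to 2GB as described above.
export CATALINA_OPTS="$CATALINA_OPTS -Xmx2048m"
```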


NOTE: This configuration and tuning was sufficient for the purposes of this document; however, a proper understanding and tuning of these parameters is a must before deploying to production environments.

After successful installation on one server, the working system was cloned three times to make a total of four identical application servers.

NGINX Load Balancer Configuration

To configure NGINX on the load balancer virtual machine, the following is inserted into /etc/nginx/conf.d/virtualviewerhtml5.conf:

upstream virtualviewerhtml5 {
    # Placeholder addresses -- replace with your four application servers
    server app-server-1:8080;
    server app-server-2:8080;
    server app-server-3:8080;
    server app-server-4:8080;
}

server {
    listen 8080;

    location / {
        proxy_pass http://virtualviewerhtml5;
    }
}

In the upstream context, we list the servers and ports that NGINX should pass requests to. Since we have four servers, we list all four here. By default, NGINX uses a round-robin discipline to distribute incoming connections among the backend servers defined in the upstream block.

This is not the ideal method of load balancing for all situations, as not all types of requests take the same time or computing resources to service. NGINX also supports the ip_hash and least_conn disciplines, which can be read about in the NGINX documentation:
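For illustration, switching the discipline requires only one directive at the top of the upstream block (the server addresses below are placeholders, not taken from this deployment):

```nginx
upstream virtualviewerhtml5 {
    least_conn;                # route each request to the server with the fewest active connections
    server app-server-1:8080;  # placeholder addresses
    server app-server-2:8080;
    server app-server-3:8080;
    server app-server-4:8080;
}
```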



Test Methodology

JMeter was used to send requests to VirtualViewer at varying levels of concurrency. A separate physical system with 4 CPUs and low-latency gigabit connectivity to the application server cluster was used to run the tests.

In all tests, the getImage API call was used against the VirtualViewerHTML5APIUserGuide.pdf document, which is in the Sample-Files directory included with VirtualViewer. The Time per Request is the time required for the client to receive the complete response from the server for a single call to the API. This document is a 54-page PDF file with images, making it a somewhat substantial file to request concurrently.

In each test run, a set of concurrent requests ranging from 2 at a time to 32 at a time was repeated 250 times. The time required to complete all 250 loops of the various concurrent request barrages is the Total Time.
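The barrage structure above can be sketched in a few lines of Python (this is an illustration of the measurement scheme, not the actual JMeter test plan; the `run_barrages` helper and the stand-in workload are hypothetical):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_barrages(fetch, concurrency, loops):
    """Fire `loops` rounds of `concurrency` simultaneous calls to `fetch`.

    Returns (total_time, avg_barrage_time). The elapsed time of one
    barrage approximates the Time per Request when all requests in it
    truly run concurrently; total_time corresponds to the Total Time.
    """
    barrage_times = []
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for _ in range(loops):
            t0 = time.perf_counter()
            # One barrage: all `concurrency` requests dispatched at once.
            list(pool.map(lambda _i: fetch(), range(concurrency)))
            barrage_times.append(time.perf_counter() - t0)
    total = time.perf_counter() - start
    return total, sum(barrage_times) / len(barrage_times)

# Stand-in workload; in practice `fetch` would issue an HTTP GET to the
# getImage endpoint (the URL depends on the deployment, so it is omitted).
total, avg = run_barrages(lambda: time.sleep(0.01), concurrency=4, loops=5)
```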


Results

Sending requests to a single server directly (bypassing the load balancer):

Sending requests to the load balancer with all 4 servers serving requests:


Comparing the Total Time of the different tests, we can see that the single server maintains stable results up to 4 concurrent requests; beyond that, the Total Time begins to increase significantly. The load-balanced cluster, by contrast, maintains stable results up to the 16-concurrent-request test.

At a workload of 32 concurrent requests, the average response time per request is roughly 4 times faster on our load-balanced cluster. Since we have 4 application servers, this confirms that the system is distributing load as intended.

Conversely, at lower concurrency, we can see that the load balancer adds significant latency. Some latency is to be expected: in situations where a single server could handle the load on its own, the load balancer is nothing more than a “middle-man” slowing the process down by being in the way.

The latency added in the lower-concurrency tests seems excessive; however, the response time per request was still perfectly acceptable, so no additional testing of this observation was conducted. Bypassing the load balancer, we do not see that latency, and at the 4-concurrent-request point the two configurations converge on average response time. Maybe we found a bug in NGINX?