In the fifth of our 5 Steps to Ensure Performance, Technical Solutions Director Joel Deutscher discusses the importance of examining and testing the high-risk elements of your application delivery, to ensure they will work when you need them to.
Step 5 – Ensure Operational Readiness
In years past, performance testing focused purely on the application back end. We wrestled red tape to get our load injectors as close to the web servers as possible, like high-frequency traders trying to get as close to the exchange as they can. We fought for infrastructure that matched the production environment almost identically across our web, app and DB tiers, yet we rarely worried about what sat in front of them.
This approach is outdated and can leave your application failing in production even after it has passed traditional performance testing. We saw a high-profile example of this in the Australian Bureau of Statistics' Census.
So how should we approach performance testing? The answer is simple: we need to consider every element of the solution we have any control over, including the infrastructure we run on, the load balancers and the firewalls. This is especially important when utilising public cloud infrastructure.
“Wait a minute,” I hear you say. “Wasn’t the cloud supposed to remove our reliance on infrastructure? Can’t we just throw our application in AWS or Azure and let them worry about it?” Not surprisingly, this is actually a pretty common misconception. While there are some great features within the top cloud providers that can take away many of the headaches associated with scaling your applications, they come with their own set of problems.
Starting with the outermost component in many environments, the load balancer, we already have a number of issues to be aware of. First, Amazon's Elastic Load Balancer (ELB) scales with load, and should be pre-warmed if you are expecting a sudden spike in traffic so that response times don't suffer while it scales out. Further down the line, an auto-scaling group seems like the perfect solution for dealing with large amounts of traffic, though the parameters governing when to scale up and when to scale down often require more performance tuning and configuration than the application itself.
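To make that tuning problem concrete, here is a minimal Python sketch of a threshold-based scaling policy. All thresholds and load figures are hypothetical illustrations, not AWS defaults: the point is that when the scale-up and scale-down thresholds sit too close together, the fleet oscillates, or "flaps".

```python
def scaling_decision(cpu_percent, instances, scale_up_at=70, scale_down_at=30,
                     min_instances=2, max_instances=10):
    """Return the instance count after one evaluation period."""
    if cpu_percent > scale_up_at and instances < max_instances:
        return instances + 1
    if cpu_percent < scale_down_at and instances > min_instances:
        return instances - 1
    return instances

def simulate(total_load, instances, periods, **policy):
    """Run the policy for a fixed total load; per-instance CPU = load / fleet size."""
    history = []
    for _ in range(periods):
        cpu = total_load / instances
        instances = scaling_decision(cpu, instances, **policy)
        history.append(instances)
    return history

# Thresholds set too close together: the fleet flaps between 4 and 5 instances.
print(simulate(200, 4, 6, scale_up_at=45, scale_down_at=42))  # → [5, 4, 5, 4, 5, 4]
# Wider hysteresis (the defaults above): the fleet stays stable.
print(simulate(200, 4, 6))  # → [4, 4, 4, 4, 4, 4]
```

Each scale event costs money and instance warm-up time, which is exactly why these parameters deserve the same performance testing as the application code.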
We often separate operational resilience or disaster recovery testing from performance, though really, this is where the most risk lies. What happens if we fail in our busiest period? How much load can our application support when part of it goes down? These are all the types of questions that need to be asked to ensure you can deliver not only a scalable application, but one that doesn’t degrade terribly when something goes wrong.
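One way to start answering "how much load can we support when part of it goes down?" is a simple N+1 headroom check. The sketch below uses hypothetical throughput figures; plug in numbers measured from your own load tests.

```python
def meets_n_plus_1(peak_rps, instances, per_instance_rps, failed=1):
    """Can the fleet still carry peak traffic with `failed` instances down?"""
    surviving = max(instances - failed, 0) * per_instance_rps
    return surviving >= peak_rps

# Four instances at 500 req/s each leave headroom at a 1,400 req/s peak...
print(meets_n_plus_1(1400, 4, 500))  # → True
# ...but a fleet sized exactly to peak has no failure headroom at all.
print(meets_n_plus_1(2000, 4, 500))  # → False
```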
A structured operational resilience strategy ensures that the high-risk elements of your application delivery are considered and tested so they work when you need them to. If losing a particular component would otherwise mean a major system failure, that component needs to fail gracefully. I think we have all been frustrated by the virtual queue when trying to purchase concert tickets, though it is definitely a better experience than a browser error page.
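As a sketch of what failing gracefully can look like (the handler and messages are illustrative, not from any particular framework), a front end can answer with a "virtual queue" response and a Retry-After hint instead of surfacing a raw error:

```python
def handle_request(backend_available: bool):
    """Return (status, headers, body) for one incoming request."""
    if backend_available:
        return 200, {}, "Here are your tickets!"
    # 503 with Retry-After tells well-behaved clients when to try again,
    # and the body gives humans a queue page rather than a browser error.
    return 503, {"Retry-After": "30"}, "You are in the queue - please hold on."
```

The status and retry interval are assumptions for the example; the design point is that the degraded path is a deliberate, tested response, not an accident.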
Once you start to consider performance across your entire application and infrastructure, you will be better prepared for the situations a traditional, blinkered view of performance testing can miss, and you can ensure your customers' experience is that of a well-performing application, not a browser error page.