Deployment strategies for testing microservices

According to the development and pre-production tests, your microservices perform as they should. However, a comprehensive testing strategy includes tests at every step in the application’s life cycle, so it is equally important to have a strong deployment strategy. Development teams that test in production can release their software with more confidence.

A deployment strategy determines how you choose to test and release your microservices software. A thorough approach to deployment contributes to a smooth rollout. The result is a reliable application that end-users can benefit from.

This article covers five deployment strategies that leave little room for problems when releasing new features or changes in your microservices application.

Why do you need to test in production?

A staging environment is a necessary but insufficient step toward building a robust microservices infrastructure. Although staging results are indicative, they are not a substitute for real user traffic and behavior. When real users interact with the software in production, the services may not perform as well as they did in staging.

Users are unpredictable. They may make requests you have not accounted for when drawing up communication paths. Or, your microservices architecture may be unable to bear the actual production traffic. Testing with real end-users shows you how your software actually performs. Only when you are aware of what works and what doesn’t can you improve it or add new functionality. A deployment strategy helps you release new services or changes in a reliable and structured manner.

The disadvantage of working with real users is that they will be impacted in the case of any errors. However, specific deployment strategies help you identify the issues and minimize their impact. For end-users, this generates trust and increases satisfaction.

Testing in production is essential for continuous delivery as it reduces downtime risks and provides faster, more secure deployments. Let us see how.

Strategies to consider

Depending on your goals, you can implement the following testing strategies in the deployment stage. They can be performed individually or in various combinations, as each approach varies in its contribution to deployment.

Blue-green deployment

A blue-green deployment gives developers confidence before a full release of new services or features. It uses two separate but identical production environments that run simultaneously. The latest version available to end-users runs in one environment, for example green, and receives all live traffic. The other environment, blue, is idle from the users’ point of view. You push the code that adds new functionality to the blue environment.

At this stage, you must test the services in the blue environment to ensure that they are running smoothly. You can automate this process by using basic smoke tests. When the microservices pass these tests, you can instruct the load balancer to gradually redirect user traffic to this new environment.

If the new functionality poses no problems, the router moves all of the traffic from the green environment to the blue one, which contains the changes. In case of any errors, you can immediately direct the router to point traffic back to the old environment. In both cases, because the two environments are identical and the switch is only a routing change, there is little to no downtime. The end user’s experience is largely unaffected, and the developers can quickly move to fix any bugs.
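The cut-over and rollback described above can be sketched in a few lines. This is a minimal illustration, not a real load balancer integration: `BlueGreenRouter`, `promote`, and `rollback` are hypothetical names, and each smoke test is assumed to be a function that takes an environment name and returns `True` on success.

```python
from typing import Callable, List

SmokeTest = Callable[[str], bool]  # assumed shape: environment name -> passed?


class BlueGreenRouter:
    """Hypothetical stand-in for a router that sends all traffic to one environment."""

    def __init__(self, live: str, idle: str) -> None:
        self.live = live  # environment currently serving users
        self.idle = idle  # environment holding the new version

    def promote(self, smoke_tests: List[SmokeTest]) -> bool:
        """Swap live and idle only if every smoke test passes against idle."""
        if not all(test(self.idle) for test in smoke_tests):
            return False  # traffic stays where it is
        self.live, self.idle = self.idle, self.live
        return True

    def rollback(self) -> None:
        """Point traffic back at the previous environment, which is still running."""
        self.live, self.idle = self.idle, self.live


router = BlueGreenRouter(live="green", idle="blue")
if router.promote([lambda env: True]):  # placeholder smoke test
    print("live environment:", router.live)
```

Because the old environment keeps running after the swap, a rollback is the same routing change in reverse.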

During and after the process of migrating from the old version to the new, the old environment serves as a backup. This makes it possible to execute a rollback without any downtime. After releasing the new version, you can decide to kill the old instances. You should do this after some time has passed and you are confident that the new environment will continue running without errors. You can also clone both blue and green environments to repeat the process with other feature deployments in the future.

Maintaining two separate, identical environments comes at an infrastructure cost. In return, you need not design your microservices application to run multiple versions of the same process at once, which keeps the services themselves lightweight and efficient. A blue-green deployment also minimizes the risks of pushing a new feature or service into production.

Fundamentally, a blue-green deployment is concerned with straightforward switching between two environments. Keep in mind the following considerations to facilitate the transition:

  • Both production environments must be nearly identical to guarantee that switching from one to the other is a seamless experience for the end-user.
  • The two environments can be deployed in several ways. They may run on different physical machines, virtual machines, or containers. They can also be deployed in separate “zones” with different IP addresses on the same machine. Ideally, however, they must be isolated and use different resources to prevent an outage in one from spilling over into the other. Avoid linking them in any manner, or be especially careful if you do so.
  • Any running sessions or transactions between the users and the microservices must be copied to the new environment. New functionality is of little use if it does not build upon existing requests and responses. You can also temporarily feed transactions to both environments to guarantee smooth rollbacks.
  • Relatedly, your database schema is the foundation of your application. Regardless of whether the new features require that it be changed, it must be successfully migrated to the new production environment before you deploy the application. Failure to do this will result in issues that will be difficult to resolve at a later stage.
  • Finally, your application will need special attention if it uses a combination of monolithic and microservices-based architecture. A blue-green deployment is not designed to test traditional services and can result in significant downtime.

Canary deployment

Canary deployment is similar to a blue-green deployment in that it protects against release-related risks to a large extent.

The phrase “canary deployment” originates from a traditional mining technique. Miners used to carry caged canaries into mines to detect the presence of harmful gases. The birds, which are more sensitive to these gases than humans, fell ill or died before the miners were affected, which warned the miners that poisonous gases were present.

A canary deployment works on a similar principle. Instead of creating two separate environments, a canary deployment operates within the same microservice or infrastructure. The developers roll out a new service or application version with changes only to a fraction of the end-users.
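One common way to pick the fraction of users is to hash a stable identifier into a bucket. The sketch below assumes each request carries a string user ID; `bucket_for` and `route` are hypothetical names, and the choice of SHA-256 with 100 buckets is illustrative rather than required.

```python
import hashlib


def bucket_for(user_id: str, buckets: int = 100) -> int:
    """Map a user ID to a stable bucket in the range [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def route(user_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of users to the canary version."""
    return "canary" if bucket_for(user_id) < canary_percent else "stable"


# The same user always gets the same answer for a given percentage.
print(route("user-42", canary_percent=10))
```

Hashing keeps each user on the same version across requests, and raising `canary_percent` only adds users to the canary group without moving existing canary users back to the stable version.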

You can quickly spot any errors or vulnerabilities while the new service is being used. The impact is temporary and minimal because only part of the traffic has been directed to this application version. Little time and money is lost, as the experiment runs on a subset of the audience, and most users remain unaffected. Developers can experiment and fix bugs before publicly releasing the service or feature. Quicker feedback and recovery times also reduce downtime. Once the changes are working and have passed verification, you can scale them up to more users. Incremental releases can, however, slow down application deployment overall.

A canary deployment allows your application to experience real requests. That being said, the number of users for whom changes are made available must be large enough to allow easy identification of issues. Robust observability and monitoring mechanisms also contribute to spotting and quickly reporting errors for faster rollbacks and recovery.
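As a rough sketch of how monitoring can feed a rollback decision, the function below compares the error rate of the canary with that of the stable version. The names `VersionStats` and `canary_decision` are hypothetical, and the thresholds (`min_requests`, `tolerance`) are placeholder values you would tune for your own traffic.

```python
from dataclasses import dataclass


@dataclass
class VersionStats:
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_decision(
    stable: VersionStats,
    canary: VersionStats,
    min_requests: int = 500,   # assumed minimum sample before judging
    tolerance: float = 0.01,   # assumed acceptable increase in error rate
) -> str:
    """Return 'wait', 'rollback', or 'promote' based on observed error rates."""
    if canary.requests < min_requests:
        return "wait"  # too few users have seen the canary to judge it
    if canary.error_rate > stable.error_rate + tolerance:
        return "rollback"
    return "promote"
```

The `min_requests` guard reflects the point above: with too small a sample, a handful of errors (or none at all) says little about how the new version will behave at full scale.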

The biggest challenge for those running canary deployments is the requirement to run multiple versions of the same microservices so that they can be deployed to a few users at a time. Implementing multiple versions also means you need to keep track of which users are on which versions so as to run accurate business metrics and analytics. Such monitoring can get increasingly complex with multiple canary deployments within the same application.

Feature flags

Feature flags are not a stand-alone deployment strategy. Instead, they build conditions into your application that let you experiment with it while it is in use. A feature flag, or feature toggle, wraps a change or feature in conditional code. Developers can turn the feature on or off, depending on the testing requirements, while the application is running and in use.

The code is already deployed, and there are two code paths: one with the code that implements the feature and one without it. The developer only needs to choose one of the two paths. When the switch is toggled on, the code chunk is executed as part of the flow. When it is turned off, that code is skipped, and the feature does not run. The rest of the source code continues to run as usual, independent of the feature flag’s condition. There are no disturbances in the end-user’s experience of making requests to the microservice.
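The two code paths can be illustrated with a small example. This is a sketch under simple assumptions: `FlagStore` is a hypothetical in-memory store (a real service would typically read flags from a remote configuration service), and `search` with its `sorted_results` flag is an invented feature used only for illustration.

```python
from typing import Dict, List, Optional


class FlagStore:
    """Hypothetical in-memory flag store that can be changed at runtime."""

    def __init__(self, flags: Optional[Dict[str, bool]] = None) -> None:
        self._flags: Dict[str, bool] = dict(flags or {})

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to off, so the existing path runs.
        return self._flags.get(name, False)


def search(query: str, catalog: List[str], flags: FlagStore) -> List[str]:
    """Return matching items; sort them only when the flag is on."""
    results = [item for item in catalog if query.lower() in item.lower()]
    if flags.is_enabled("sorted_results"):
        # New code path: executed only while the flag is toggled on.
        results = sorted(results)
    return results
```

Calling `flags.set("sorted_results", True)` switches the behavior of the next request without a redeploy, and setting it back to `False` restores the original path.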

Once integrated, a feature flag allows you to turn on a feature for a select group of users. Unlike a canary deployment, where the subset of users is typically chosen at random for the sake of testing, feature toggles are usually targeted at specific cases. For example, developers do not create two separate applications to implement free and paid subscription tiers. Instead, operating a feature flag allows you to make a certain feature accessible only to users who pay a monthly fee.
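The subscription-tier case might look like the following sketch. `User`, `TargetedFlag`, and the `export_reports` flag are hypothetical names, and the plan labels `"free"` and `"paid"` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class User:
    user_id: str
    plan: str  # assumed values: "free" or "paid"


@dataclass
class TargetedFlag:
    name: str
    enabled: bool = True  # global switch for the feature
    allowed_plans: Set[str] = field(default_factory=set)

    def is_on_for(self, user: User) -> bool:
        """The feature is on only if the flag is enabled and the user's plan qualifies."""
        return self.enabled and user.plan in self.allowed_plans


export_reports = TargetedFlag(name="export_reports", allowed_plans={"paid"})
print(export_reports.is_on_for(User("u1", "paid")))  # True
print(export_reports.is_on_for(User("u2", "free")))  # False
```

A single build serves both tiers, and setting `enabled` to `False` turns the feature off for everyone regardless of plan.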

Feature flags are especially attractive during testing as developers can use them to run small experiments throughout the application without relinquishing control. Being able to flip the switch remotely also makes rollbacks easier. The risk is low because the application is self-sufficient: it can continue to run without the feature. Services like LaunchDarkly and Optimizely facilitate the process of incorporating feature flags in your code. The flags can be as simple or as complex as you need.

Although having multiple code paths has its advantages, it can also make your code heavy. Once you have decided on a feature’s fate, it is good practice to delete the code path you will not be using. Otherwise, your application will be replete with code chunks that will never be utilized.

With a canary or blue-green deployment, you know which version of the code the users are interacting with. Feature flags, in contrast, multiply the possible user-to-service paths: with n independent on/off flags, a request can take up to 2^n different paths. Because each user can see a different combination of flags, it is difficult to determine the exact path a specific user took, which makes the application hard to debug. These toggles are easy to get started with and do not have high risks associated with them, but their maintenance can quickly become complex. It is best to use them only when needed and in conjunction with other deployment strategies.

Traffic shadowing

In traffic shadowing, or mirroring, the router duplicates the traffic coming in to an already-released service and sends the copy to a second service. The request and response mechanism between the user and the existing service remains intact. The second service contains the new features that require testing. It uses the copied traffic to exercise that functionality, and its responses are never returned to users, so it does not interfere with the existing process.

Its most significant benefit is that it allows the new version to receive the same traffic as is currently being received by the service it seeks to replace. There is no need to create test data or worry about replicating scale. This accuracy comes with little risk, as there is no tangible impact on the existing services. Developers can run all relevant tests on this production environment, such as testing for errors and performance metrics. The new version’s responses, which are not sent to users, can also be compared to those of the production service. Both versions operate independently of each other with separate end goals. This can happen in real-time, or a copy can be saved and replayed for testing in the future.
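The mirroring and comparison steps can be sketched as follows. This is a simplified, synchronous illustration: `ShadowingRouter` is a hypothetical name, services are modeled as plain functions from a request string to a response string, and a production setup would normally mirror traffic asynchronously at the proxy or service-mesh layer so the shadow cannot slow down real responses.

```python
from typing import Callable, List, Tuple

Handler = Callable[[str], str]  # assumed shape: request -> response


class ShadowingRouter:
    """Hypothetical router that mirrors each request to a shadow service."""

    def __init__(self, primary: Handler, shadow: Handler) -> None:
        self.primary = primary
        self.shadow = shadow
        # Each entry: (request, primary response, shadow response or error).
        self.mismatches: List[Tuple[str, str, str]] = []

    def handle(self, request: str) -> str:
        response = self.primary(request)
        try:
            shadow_response = self.shadow(request)
        except Exception as exc:
            # A failing shadow must never affect the user-facing response.
            shadow_response = f"shadow error: {exc!r}"
        if shadow_response != response:
            self.mismatches.append((request, response, shadow_response))
        return response  # users only ever see the primary's answer
```

After a run, `mismatches` holds the requests for which the new version disagreed with the production service, which is a starting point for investigating regressions.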

You can use traffic shadowing in conjunction with other deployment techniques like blue-green or canary deployments. After shadowing a production environment is successful, the changes can be rolled out gradually using a canary deployment to gain maximum confidence before a full release.

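Shadowing can have unintended consequences, too. Because every request is processed twice, a mirrored request that reaches a third-party dependency, such as a payment or email provider, can trigger the same side effect a second time. Exercise caution when using this strategy with services that have such dependencies. It is also costly because, as with blue-green deployments, you must run two production environments simultaneously.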

A/B testing

Unlike blue-green and canary deployments, A/B testing is focused on user perception and experience of new features. It measures if and how end-users are interacting with these features, whether they are easy to notice and use, and the overall functionality of the application. It gives development teams business-level insight into how the features they implement in code are received, which they can use to improve the application.

A/B testing divides the users into groups that access different features. Group A, for example, sees a different user interface than Group B, although members in both groups are making the same requests to access the application. Traffic is routed to separate builds or different configurations on a common build. This is done by taking into account aspects like the users’ operating systems and user agents. The test must be run on a sample that is representative of your end-users, and its results must be statistically significant before you act on them.
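One standard way to check significance when the metric is a conversion rate is a two-proportion z-test. The sketch below is one possible approach, not a prescribed method: the function name and the example numbers are illustrative, and the 0.05 threshold mentioned afterward is a common convention rather than a rule.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    conversions_a: int, users_a: int, conversions_b: int, users_b: int
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conversions_a / users_a
    rate_b = conversions_b / users_b
    pooled = (conversions_a + conversions_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    if std_err == 0:
        return 0.0, 1.0  # no variation at all, so no detectable difference
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

If the p-value falls below the threshold you chose in advance (0.05 is typical), the difference between the groups is unlikely to be due to chance alone. The test assumes the groups are independent and reasonably large.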

This test can be combined with blue-green or canary deployments as they handle the actual feature deployments that this strategy tests. After comparing the versions shown to the groups, the one that has performed better can be pushed to release for all users. Here too, it is important to monitor user behavior.

Building your deployment strategy

Each strategy outlined here approaches testing differently. They can be used in isolation or together in a combination that best suits your goals, workflow, and the requirements of the microservices. They allow you to identify and reduce the impact of any vulnerabilities that did not surface until the final stage of your software’s release. Implementing them can be a fairly complex process, especially with larger architectures and dependencies. Use Cortex’s Scorecards to visualize and keep track of various deployments and production environments. Doing so will help your team push the best possible version of your application.