Verify BizTalk Scalability with Monitoring Platforms

Integrating distributed systems is an extremely challenging task. As system integrators, we aspire to build robust integration solutions that can gracefully serve realistic business workloads with a suitable level of performance.

At the start of any integration project, we gather requirements and learn the lay of the land, that is, the organization’s IT landscape. Armed with integration principles and patterns, we then come up with a fit-for-purpose architecture.

Integration Solution Sizing Challenges

With the architecture in place, we then meet with the business stakeholders to conduct sizing activities: establishing today’s workloads and forecasting the expected workloads of the future.

However, sizing activities come with several challenges. For one, real-world workloads may end up surpassing what was initially planned for, since the plan was based merely on estimates and assumptions. Another challenge is that the initial architecture might evolve over time, rendering the initial sizing obsolete. Further, some organizations have no estimates of future business workloads, and some do not even properly track their current workloads.

The outcome of the sizing activity is used to plan the on-premises or IaaS infrastructure topology and the hardware specifications of the different components. It is also used to select the right tiers for any complementary subscription-based PaaS services.

BizTalk Scalability and Solution Performance Tools

Generally, the integration solution should be able to scale up or scale out gracefully and in a timely fashion to keep up with increased workloads; its scalability should not become a hindrance to new business opportunities. This is partly why many organizations nowadays are looking for cloud-based services that promise rapid elasticity.

BizTalk solutions are no different. Each of the BizTalk Server nodes and the SQL Server nodes can be scaled either by increasing the server’s resource specifications (scaling up) or by adding more servers horizontally in the form of a cluster (scaling out). The latter approach also provides a greater level of availability, in addition to the increased scalability.

Typically, after scaling your system, you will need to inspect the performance levels of these different components and conduct stress and performance testing.

For BizTalk monitoring, Windows performance counters and the Performance Analysis of Logs (PAL) tool can be leveraged to check the different BizTalk agent metrics, and for SQL Server performance monitoring, SQL Server Profiler can be leveraged. See Microsoft’s technical guide on BizTalk performance tools for a comprehensive set of tools for evaluating the performance of a BizTalk Server solution.
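As a rough illustration of collecting such counters, the sketch below only builds a command line for the Windows typeperf utility. The counter paths and the output file name are assumptions for illustration and are not taken from the test described in this article; check the exact counter names in Performance Monitor on your own BizTalk server before relying on them.

```python
from typing import List, Sequence

# Assumed counter paths (not from the article); confirm them in Performance
# Monitor, since available objects and instances vary per installation.
COUNTERS = [
    r"\BizTalk:Message Agent(*)\Message delivery throttling state",
    r"\BizTalk:Message Agent(*)\Message publishing throttling state",
    r"\BizTalk:Message Agent(*)\Message delivery delay (ms)",
    r"\Processor(_Total)\% Processor Time",
    r"\Memory\Available MBytes",
]


def build_typeperf_command(
    counters: Sequence[str],
    interval_s: int = 15,
    samples: int = 20,
    output_csv: str = "biztalk_counters.csv",  # hypothetical file name
) -> List[str]:
    """Build (but do not run) a typeperf command that samples the counters."""
    if interval_s < 1 or samples < 1:
        raise ValueError("interval_s and samples must be positive")
    return [
        "typeperf", *counters,
        "-si", str(interval_s),   # seconds between samples
        "-sc", str(samples),      # number of samples to collect
        "-f", "CSV",              # output format
        "-o", output_csv,         # output file
    ]


command = build_typeperf_command(COUNTERS)
print(command)
```

On the BizTalk server itself, the resulting list could be passed to subprocess.run to produce a CSV file that can then be compared before and after a scaling change.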

The Role of Monitoring Platforms in Integration Solutions

When our solution is subjected to increased loads, it starts to experience performance issues, and the business starts to suffer. Without a monitoring platform in place, this is usually when we get notified of the performance issues, directly by our customer!

Sadly, the proud statement we initially made to the customer, that the integration solution would be a business enabler, slowly fades away, along with the team’s reputation.

Integration monitoring platforms are indispensable for maintaining a healthy integration solution and troubleshooting with ease. Renowned third-party commercial monitoring platforms such as AIMS were specifically designed to fill this gap.

These platforms not only provide a bird’s-eye view of the health metrics of the integration system components, they also provide fine-grained metrics and notifications for the operations of the different system components.

AIMS Proactive Monitoring

Proactive monitoring is a powerful way of discovering such performance issues early, before the problem compounds. This minimizes the negative impact of incidents on the business and provides more time to find, plan for, and fix the root cause of these types of issues.

The AIMS monitoring platform really shines when it comes to proactive monitoring and intelligent anomaly detection, in addition to the traditional monitoring features for integration solutions.

For an on-premises solution, AIMS continuously ingests system metrics across the different tiers: Windows Server, SQL Server, and BizTalk Server. Its back-end engine leverages proprietary machine learning algorithms that can detect deviations from normal metric patterns at different granularity levels.
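The AIMS algorithms are proprietary and are not described here, so the following is only a generic sketch of the idea of "deviation from a normal pattern": it learns a mean and standard deviation per hour-of-week slot and flags values that fall too far from that baseline. The function names, the slot scheme, and the threshold k are all assumptions for illustration.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple

Sample = Tuple[int, float]            # (hour of week 0..167, metric value)
Baseline = Dict[int, Tuple[float, float]]  # slot -> (mean, standard deviation)


def build_baseline(history: List[Sample]) -> Baseline:
    """Learn a per-slot mean and standard deviation from historical samples."""
    by_slot: Dict[int, List[float]] = {}
    for slot, value in history:
        by_slot.setdefault(slot, []).append(value)
    # stdev needs at least two samples, so sparser slots are skipped.
    return {s: (mean(v), stdev(v)) for s, v in by_slot.items() if len(v) >= 2}


def is_anomaly(baseline: Baseline, slot: int, value: float, k: float = 3.0) -> bool:
    """Flag a value that lies more than k standard deviations from its slot mean."""
    if slot not in baseline:
        return False  # nothing learned yet for this slot
    mu, sigma = baseline[slot]
    if sigma == 0:
        return value != mu
    return abs(value - mu) > k * sigma


# Hypothetical history: messages per minute seen at hour-of-week slot 9.
history = [(9, 48.0), (9, 52.0), (9, 50.0)]
baseline = build_baseline(history)
print(is_anomaly(baseline, 9, 51.0))  # within the learned pattern
print(is_anomaly(baseline, 9, 90.0))  # far outside the learned pattern
```

A simple statistical baseline like this also illustrates why a monitoring platform needs some weeks of data before its weekly patterns become meaningful.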

AIMS monitoring granularity starts from coarse-grained overall system health, for example, the BizTalk overall health metrics dashboard and the daily overall application performance index (Apdex).

The monitoring granularity goes all the way down to fine-grained component-level metrics, such as the message volume of a BizTalk SQL receive location or the CPU time of a SQL stored procedure.
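For readers unfamiliar with Apdex, the sketch below shows the standard Apdex calculation: samples at or below a threshold T count as satisfied, samples between T and 4T count as tolerating at half weight, and the rest count as frustrated. How AIMS derives its own Apdex value for a BizTalk Group is not covered in this article, so the threshold and the latencies here are hypothetical.

```python
from typing import Iterable


def apdex(response_times_s: Iterable[float], threshold_s: float) -> float:
    """Standard Apdex: (satisfied + tolerating / 2) / total samples."""
    samples = list(response_times_s)
    if not samples:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for t in samples if t <= threshold_s)
    tolerating = sum(1 for t in samples if threshold_s < t <= 4 * threshold_s)
    return (satisfied + tolerating / 2) / len(samples)


# Hypothetical latencies in seconds, with an assumed threshold T of 1 second:
# two satisfied, two tolerating, one frustrated.
latencies = [0.4, 0.8, 1.5, 3.0, 6.0]
score = apdex(latencies, threshold_s=1.0)
print(score)
```

The score always falls between 0 (every sample frustrated) and 1 (every sample satisfied), which is what makes it convenient as a single overall health indicator.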

Can Monitoring Platforms Verify BizTalk Scalability?

Monitoring platforms already capture an extensive set of metrics across different components, so I wondered whether they could also be used to verify that a newly scaled-up environment has indeed handled the increased workloads appropriately. With continuous monitoring, they could also keep an eye out for bottlenecks introduced in other components, for instance SQL Server or the BizTalk MessageBox database.

Testing BizTalk Scalability Metrics with AIMS

I was eager to test this using AIMS, to see how it would show the new metrics when I scaled up the virtual machine (VM) CPU and memory resources without changing anything in the solution.

I started with a VM that hosted a simple SQL polling BizTalk solution simulating incoming business transactions, with message throughput as the workload variable and the VM’s CPU and memory as the scalability variables.

Before VM Scale-up

For the first 2 weeks, I ran an average message throughput of 50 messages per minute on a VM with 2 GB of memory and 1 CPU core.

In the BizTalk solution, I randomized the message throughput across different days and different times of the day, within the above-stated average, to make the load a bit more realistic.
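The randomization itself was done inside the BizTalk solution and its details are not shown here. As a minimal sketch of the idea, the following generates per-minute message counts that vary randomly while staying centered on the 50 messages per minute average; the spread parameter and the uniform distribution are assumptions chosen for simplicity.

```python
import random
from statistics import mean
from typing import List, Optional


def per_minute_counts(
    minutes: int,
    average: int = 50,
    spread: float = 0.5,          # assumed: counts vary within +/- 50% of the average
    seed: Optional[int] = None,
) -> List[int]:
    """Return one random message count per minute, centered on `average`."""
    rng = random.Random(seed)
    low = round(average * (1 - spread))
    high = round(average * (1 + spread))
    # A symmetric uniform range keeps the expected value at `average`.
    return [rng.randint(low, high) for _ in range(minutes)]


# One simulated day of load (1440 minutes).
counts = per_minute_counts(24 * 60, seed=42)
print(min(counts), max(counts), round(mean(counts), 1))
```

A more realistic generator could weight the counts by day of week and hour of day, as long as the weights average out to 1 so that the overall mean stays at the target throughput.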

AIMS started ingesting the metrics from the solution under the initial loads, and after the first 2 weeks it started analyzing the solution’s weekly patterns based on the metrics ingested up to that point.

The system was apparently under a great deal of stress: the BizTalk solution started to encounter performance issues, and BizTalk throttling kicked in.

These issues surfaced in the AIMS dashboards as throttling and in the overall health indicator of the BizTalk Group, represented as an Apdex of 0.44, as shown below.

After VM Scale-up

For the following few weeks, I kept the same average message throughput of 50 messages per minute, but scaled up the VM to 8 GB of memory and 4 CPU cores.

Taking a quick look at the newly captured metrics in the AIMS dashboards, I noticed a major improvement in the current week’s metrics (shown as the Line) over the previous week’s metrics (shown as the Solid Blue Area).

Also, Delivery Throttling and Message Delay went down, while Message Count remained in the same range, as it is driven by the incoming rate of my sample test solution.

The overall BizTalk Group health index improved dramatically, reaching 0.73, as shown below. It seemed that the system stress had subsided and the environment health was indeed recovering.

Additionally, you can check the before and after metrics at a more granular level and for different processes across the Windows Server, SQL Server, and BizTalk Server components.
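To put the headline numbers from this test in perspective, the short sketch below computes the absolute and relative change in the BizTalk Group Apdex, using only the two values reported in this article (0.44 before and 0.73 after). The helper function is hypothetical and could be reused for any other before and after metric pair.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a fraction of `before`."""
    if before == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (after - before) / before


# Apdex values reported in this article.
apdex_before = 0.44  # 2 GB of memory, 1 CPU core
apdex_after = 0.73   # 8 GB of memory, 4 CPU cores

delta = apdex_after - apdex_before
percent = relative_change(apdex_before, apdex_after) * 100
print(f"Apdex {apdex_before:.2f} -> {apdex_after:.2f}: "
      f"+{delta:.2f} absolute, {percent:.1f}% relative")
```

For metrics where lower is better, such as message delay, a negative relative change would indicate the improvement.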

Importantly, the AIMS platform will start to re-learn the metrics of the newly scaled-up environment, and further anomalies will be reported for the team’s assessment before the appropriate action is taken.

A Step Closer to a Reliable Solution

In this article, I conducted a test that simply scaled up a BizTalk Server VM in terms of memory and CPU, and witnessed how the AIMS monitoring platform captured and verified the system’s performance levels accordingly.

The traditional BizTalk Server performance toolset and performance testing are still crucial. However, leveraging a monitoring platform such as AIMS can bring further confidence that scaling was indeed the right technique, that the right component was scaled, and, importantly, that no new bottlenecks were introduced in any other system component.

Moreover, AIMS provides detailed metrics before and after scaling up the system at different levels of granularity, which can give you deeper insights into the components’ behavior, a step closer to achieving a reliable integration solution.