BizTalk Cloud Migration – Part 2

In the first part, I set the stage for a migration project that was requested by one of our customers to modernize an on-premises BizTalk solution using Azure Integration Services.

I mentioned the main drivers behind the customer’s decision to move to Azure, and some of the non-functional requirements for the integration solution.

I also described the target architecture and the proposed intermediate stages that would allow the integration solution to evolve gradually and reliably towards the target cloud-native architecture.

In this second and final part, we will explore this Azure Integration Services black box, or should I say, that “blue box”!

I will focus on the key design decisions for this integration solution: Azure resources selection, Security, and High availability.

Hybrid Architecture

The architecture serves a real-time, hybrid integration solution which will initially span the on-premises BizTalk Server and Azure Integration Services. The integration solution aims to decouple the integrated systems and promote the reuse of business services.

This hybrid model paves the way for the organization’s progressive adoption of cloud solutions. Eventually, the organization aims to use cloud-native solutions, at which point the on-premises integration platform and the other LOB systems will be decommissioned.

In the Azure landscape, the solution components are grouped in a Resource Group, which acts as a logical container for the solution resources. The Resource Group also represents a given environment, such as DEV, UAT, or PROD, and it lends itself well to providing an aggregated view of Azure billing information for the solution resources.

Logic Apps will achieve hybrid integrations with the organization’s on-premises systems mainly through the on-premises data gateway, and through secured internet-facing services for some systems.

The on-premises data gateway will streamline integrations between Logic Apps and the supported data sources, such as on-premises BizTalk and selected LOB systems. It acts as a secure bridge between Azure and the on-premises data sources without requiring additional public service proxies, mandating a specific service framework, or requiring intrusive changes to the on-premises firewall.

In addition, the organization will provide secured internet-facing services for some systems that are not available through the data gateway data sources or through BizTalk Server.

The monitoring of the solution will leverage Azure Application Insights. Azure Log Analytics will be used for operational monitoring and diagnostics of the natively integrated Azure resources, so that their logs can be viewed and analyzed through the Azure portal.

Azure Integration Resources

Azure Integration Services is a set of PaaS/serverless offerings that make up the cloud-hosted integration layer for the hybrid model. The diagram below shows the main role of each resource.

1. API Management

Azure API Management (APIM) is a scalable and highly available PaaS. It will provide an internet-facing API gateway at the solution edge. APIM will act as the service controller by matching incoming operation requests to the underlying implementation, in this case the designated Logic App serving the target integrations and business process automation.

APIM will wrap and protect the underlying workloads. Further, it enforces the organization’s compliance and governance policies for the published APIs, promotes service discoverability, and provides a platform for API lifecycle management across the organization.

Moreover, APIM will secure the published APIs and transform the clients’ request/response formats to and from the internal Logic Apps canonical message formats, using APIM Liquid Templates in the inbound and outbound APIM policy execution stages.

Importantly, APIM will provide an API facade that hides the underlying implementation of the integration layer, thus allowing these workloads to change behind the scenes as required.

If the organization already has a registered domain and needs to keep it, you could map that custom domain to the published APIM instance endpoints instead of the default domain name assigned by Azure. It is also a nice way to abstract the internal implementation details from clients.

2. Logic Apps

Azure Logic Apps is a serverless workflow and integration offering which will provide the framework for designing and implementing stateless business process automation (BPA), and integration scenarios.

Here, the Logic Apps layer acts as the integration decoupling layer that isolates the internal solution from the external systems. It will define canonical message structures and handle backend message routing, message transformations, and protocol mediation through its wide range of built-in communication protocol and LOB connectors.

Moreover, the Integration Account will be used to complement Logic Apps with advanced schema transformation capabilities for complex JSON and XML messages used by the backend systems.

The transformation artifacts, Liquid Templates and XSLT, provide transformation capabilities that surpass the basic message transformation features provided by the native data operation actions in Logic Apps.

The Integration Account is basically used as a container for these transformation artifacts in order to be able to use them in Logic Apps.

3. Azure Functions

Azure Functions is a serverless compute offering which will be used to externalize cross-cutting logic across multiple Logic Apps instances, such as technical helper methods, extensible logic, or common business rules governing a set of business processes.

4. To Queue or not to Queue?

When you are designing an integration solution, you don’t need to use each and every Azure Integration resource. You should select the resources that fit your solution context.

To answer this question, we need to ask a few questions:

  • Is there a need for ordered message delivery?
  • Is there a need for load-leveling between service providers and service consumers?
  • Is there a need for transactional read/write of messages – guaranteed message delivery?
  • Will the performance overhead from introducing a queue/topic be acceptable?
  • Is it acceptable to have Asynchronous integration style for the target business scenarios?

At that point in time and in that particular context, the answer was “No” for those questions. The solution scope at that stage was merely stateless, real-time, synchronous integrations that served web-based client applications.
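To make the checklist above a little more concrete, here is a minimal Python sketch of how those answers could be combined. The class, field, and function names are my own illustrative assumptions, and the rule it encodes (a queue is worth considering only when a messaging need exists and both the overhead and the asynchronous style are acceptable) is one possible reading of the questions, not a formal method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IntegrationRequirements:
    """Answers to the checklist questions (hypothetical names for illustration)."""
    needs_ordered_delivery: bool
    needs_load_leveling: bool
    needs_guaranteed_delivery: bool
    queue_overhead_acceptable: bool
    async_style_acceptable: bool


def reasons_to_add_queue(req: IntegrationRequirements) -> List[str]:
    """Return the messaging needs that argue for a queue/topic.

    An empty list means a queue is not justified, either because no
    messaging need exists or because the added latency or asynchronous
    style is not acceptable for the scenario.
    """
    needs = []
    if req.needs_ordered_delivery:
        needs.append("ordered delivery")
    if req.needs_load_leveling:
        needs.append("load leveling")
    if req.needs_guaranteed_delivery:
        needs.append("guaranteed delivery")

    if not (req.queue_overhead_acceptable and req.async_style_acceptable):
        return []
    return needs


# The situation described in this article: every answer was "No".
realtime_web_clients = IntegrationRequirements(False, False, False, False, False)
print(reasons_to_add_queue(realtime_web_clients))
```

For the synchronous web scenario the function returns an empty list, which matches the decision to leave Service Bus out at this stage.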

Consequently, these web applications used short-lived operations that expected an immediate response. Moreover, one of the key non-functional requirements was that the integration solution latency should be kept to a minimum.

It is important to highlight that by abstracting the integration layer implementation details behind the APIM, we allow for the underlying integration layer to be changed at any time with minimal or no impact on the client applications. In this case, we could add Service Bus queues/topics as the solution requirements evolve in the future.

This is one of the key advantages of cloud platforms; the on-demand, rapid provisioning of resources. This allows for the solution to evolve gradually by adding or removing resources as needed.

Security in depth

Security controls should be considered on the transport level, Azure workload level, and API level. This layered security approach may provide security control redundancy at the different system tiers; nevertheless, this hardens the security perimeter of the whole solution.

Azure Network Resources Security

1. Azure Firewall

Azure Firewall, a highly available and scalable firewall service, will be deployed at the VNET level at the entry point of the integration solution. It will provide centralized application and network connectivity rules for basic firewall protection against public internet traffic before it passes through to the APIM instance.

It is important to note that Azure Firewall blocks traffic by default, and it provides rules to selectively allow web traffic.

2. On-premises data gateway

The on-premises data gateway will provide a secured communication bridge between the Logic Apps and the on-premises BizTalk Server and supported LOB systems. Check Microsoft Documentation to learn more about how this works under the hood and the supported data sources.

3. Azure Virtual Network

The Azure VNET DDoS protection feature hardens the network-level protection even further against distributed denial of service (DDoS) attacks.

The standard service tier provides additional capabilities over the basic protection: it includes machine learning algorithms that adapt protection policies, and it provides detailed metrics and alerts for the monitored traffic.

The VNET is also segmented into a public-facing subnet for the APIM instance. A Network Security Group (NSG) will also be used to control network traffic rules and achieve the appropriate level of isolation for the APIM subnet.

4. On-premises Firewall

The organization’s on-premises Firewall will provide the traditional, local network firewall protection for the organization’s infrastructure network and the backend systems.

It is important to note that the organization’s firewall should white-list the Logic Apps outgoing IP addresses over the secured HTTP communication port (443).

Azure Integration Resources Security

1. Azure APIM

APIM will protect the published APIs over SSL using OAuth 2.0 with OpenID Connect (OIDC) authentication. I highly recommend that you read the walk-through for protecting APIM APIs using OAuth 2.0 with AAD in Microsoft Documentation.

An API subscription key could also be used for each client channel. This way, client-driven APIM policies could be applied, such as protection against brute force and denial of service (DoS) attacks. Subscription keys could also provide granular access control by selectively revoking compromised channel-related API keys.

Additionally, APIM incoming traffic could be filtered based on selected client IP ranges, or specific IPs using APIM access restriction policies. APIM could also be used to remove response header fields that can reveal internal implementation information from the underlying services.
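In APIM this kind of header clean-up would be expressed as a policy rather than as code. The Python sketch below only illustrates the intended effect on a response. The header list is an assumption on my part (a few headers that commonly reveal the backend technology), not the list used in this solution.

```python
from typing import Dict, Iterable

# Illustrative headers that tend to reveal the backend technology.
# This list is an assumption for the sketch, not taken from the solution.
REVEALING_HEADERS = ("Server", "X-Powered-By", "X-AspNet-Version")


def strip_revealing_headers(
    headers: Dict[str, str],
    blocked: Iterable[str] = REVEALING_HEADERS,
) -> Dict[str, str]:
    """Return a copy of headers without the blocked names (case-insensitive)."""
    blocked_lower = {name.lower() for name in blocked}
    return {k: v for k, v in headers.items() if k.lower() not in blocked_lower}


# Hypothetical response headers coming back from an underlying service.
backend_response_headers = {
    "Content-Type": "application/json",
    "x-powered-by": "ASP.NET",
    "Server": "Microsoft-IIS/10.0",
}
print(strip_revealing_headers(backend_response_headers))
```

The comparison is case-insensitive because HTTP header names are case-insensitive, so "x-powered-by" is removed just like "X-Powered-By".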

2. Logic Apps

Azure Logic Apps has an access control feature in the Workflow Settings to restrict incoming Logic Apps calls to specific IP ranges.

In my solution, I only allowed traffic coming from the APIM instance IP to trigger the Logic Apps; any direct invocation of the Logic Apps will be denied.

As shown below, the trigger IP range is set with the APIM IP: [13.x.x.4 – 13.x.x.4]. Moreover, for nested Logic Apps, the nested logic app should restrict the Trigger Access Option to: [Only other Logic Apps].
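To illustrate what a range whose start and end are the same address means in practice, here is a small Python sketch of an inclusive range check. This is not how Logic Apps implements the restriction internally; it only mirrors the semantics. The address used is a hypothetical placeholder from a documentation-reserved range, not the real APIM IP.

```python
import ipaddress


def ip_in_range(caller_ip: str, range_start: str, range_end: str) -> bool:
    """Return True if caller_ip lies within [range_start, range_end], inclusive."""
    caller = ipaddress.ip_address(caller_ip)
    start = ipaddress.ip_address(range_start)
    end = ipaddress.ip_address(range_end)
    return start <= caller <= end


# Hypothetical APIM public IP (documentation range, not a real Azure address).
APIM_IP = "203.0.113.4"

# A call arriving through APIM versus a direct call from somewhere else.
print(ip_in_range("203.0.113.4", APIM_IP, APIM_IP))
print(ip_in_range("198.51.100.7", APIM_IP, APIM_IP))
```

With identical start and end values, only that single address passes the check, which is the effect the single-IP trigger range is meant to achieve.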

Azure Key Vault will be used to securely store and access secrets, such as backend API Keys, and passwords. A Logic App instance will need to use Managed Service Identity to provide internal authentication using AAD to Azure Key Vault and be able to retrieve these secrets using Key Vault actions in Logic App, without maintaining any secrets in the Logic App code.

There is a nice technique to provide Just-in-Time (JIT) access to operational logs for the entire Logic App scope. By setting the IP Ranges For Contents to [0.0.0.0] in the Logic Apps Workflow Settings, access to operational logs becomes restricted. The logs then become accessible only temporarily, when authorized administrators intentionally remove this setting for debugging purposes.

Alternatively, there is a new feature (currently in preview), to selectively hide sensitive data from the operational logs for specific Logic Apps actions and triggers scopes.

3. Backend Services

For backend systems that publish services, these services should be secured using a suitable industry-standard authentication scheme in addition to an encrypted communication channel using Transport Layer Security (SSL/TLS).

Additional Availability

The solution leverages Azure serverless and PaaS workloads that provide built-in high availability features within the scope of a single Azure region. However, minimal service disruption should be expected in the case of planned or unplanned service unavailability, according to Microsoft’s uptime Service-level agreement (SLA) for the selected Azure resources.

If the requested RTO for the solution is lower than what the single-region SLA provides, then a disaster recovery strategy could be leveraged by introducing redundancy across multiple geographical regions.
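When weighing a single region against a multi-region setup, it can help to translate uptime percentages into allowed downtime. The Python sketch below shows that arithmetic. The percentages are placeholders, not Microsoft’s published SLA figures, and the multi-region formula assumes independent regional failures and an instant, flawless failover, which real deployments only approximate.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted by an uptime SLA over the given period."""
    return (1 - sla_percent / 100) * days * 24 * 60


def composite_sla(*sla_percents: float) -> float:
    """Combined SLA of components that must ALL be up (a serial chain)."""
    availability = 1.0
    for percent in sla_percents:
        availability *= percent / 100
    return availability * 100


def multi_region_sla(region_sla_percent: float, regions: int = 2) -> float:
    """SLA when ANY one of several identical regions is enough.

    Assumes independent failures and perfect failover (an idealization).
    """
    unavailability = (1 - region_sla_percent / 100) ** regions
    return (1 - unavailability) * 100


# Placeholder figures: two chained services, each with a 99.9% uptime SLA.
single_region = composite_sla(99.9, 99.9)
two_regions = multi_region_sla(single_region, regions=2)

print(f"single region: {single_region:.4f}%")
print(f"downtime per 30 days: {allowed_downtime_minutes(single_region):.1f} min")
print(f"two regions: {two_regions:.4f}%")
```

A 99.9% SLA on its own allows roughly 43 minutes of downtime in a 30-day month, and chaining services behind each other lowers the combined figure, which is why the end-to-end number is the one to compare against the requirement.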

This strategy minimizes the service disruption even further, and increases the solution’s resiliency in case of a region-wide outage in the solution’s primary region.

Azure Integration Services serverless workloads will be deployed to a secondary region in standby mode. Consumption-based resources will cost nothing when sitting idle.

For APIM, you will need two APIM instances. However, if you use the premium tier in the subscription model, it supports a multi-region deployment feature. Alternatively, you could use the newly introduced APIM consumption-based model, if its feature set is acceptable.

The cross-region failover mechanism will be accomplished using Azure Traffic Manager configured with the [Priority Profile] traffic-routing setting.

In case of a region-wide failure in the primary region, Traffic Manager will automatically fail over to the secondary region by routing all incoming traffic to the secondary APIM instance endpoint and its underlying integration workloads hosted in the secondary region.

Further, Azure Traffic Manager will also handle automatic failback to the primary region: when the primary APIM instance is back online, the underlying region will be reinstated as the active one.
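The failover and failback behavior described above can be sketched in a few lines of Python. This is a simplified model of priority-based routing, not Traffic Manager’s actual implementation: the endpoint names are hypothetical, the health flag stands in for the result of endpoint monitoring, and the case where no endpoint is healthy is handled here by returning None, which may differ from what the real service does.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    priority: int   # lower number means more preferred
    healthy: bool   # stands in for the outcome of health probing


def pick_endpoint(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Return the healthy endpoint with the lowest priority number, if any."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None  # simplification; the real service may behave differently
    return min(healthy, key=lambda e: e.priority)


# Hypothetical primary and secondary APIM endpoints.
primary = Endpoint("apim-primary", priority=1, healthy=True)
secondary = Endpoint("apim-secondary", priority=2, healthy=True)
profile = [primary, secondary]

print(pick_endpoint(profile).name)   # normal operation

primary.healthy = False              # region-wide outage in the primary region
print(pick_endpoint(profile).name)   # failover

primary.healthy = True               # primary region recovers
print(pick_endpoint(profile).name)   # failback
```

Because the selection always prefers the lowest priority number among healthy endpoints, failback needs no separate logic: as soon as the primary is reported healthy again, it wins the selection.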

If you use a custom domain for your published APIs, you will need to configure it on the Traffic Manager instead of the APIM instances when using this DR setup.

Final Thoughts

It is really exciting to be working on cloud migration/modernization projects like this one. It is very important that you understand the integration toolset provided by Azure, and that you work with your customer to make sure that the move to the cloud is really justified.

Although the Azure platform takes care of the main cross-cutting concerns for the platform itself, don’t forget that you still have responsibilities regarding these concerns for your cloud-hosted solution, such as monitoring, security, and cross-region availability.

The migration approach described in this article is not one-size-fits-all; it will need to be tailored to each organization’s readiness, existing landscape, and functional and non-functional requirements.

Following a gradual migration approach and reaping the benefits of cloud technologies will allow the solution to start with the right resources and naturally evolve to adapt to the organization’s changing needs and new opportunities along the way.