Reliability guides by service

This article provides links to reliability guidance for many Azure services. Most reliability guides contain the following information:

  • Reliability architecture overview is a synopsis of how the service supports reliability, including information about which components are managed by Microsoft and which are managed by you, any built-in redundancy features, and how to provision and manage multiple resources, if applicable.
  • Transient fault handling details how the service handles the everyday transient faults that can occur in the cloud, and includes information on how to handle these faults in your application, such as retry policies, timeouts, and other best practices.
  • Availability zone support covers zonal and zone-redundant deployment options, traffic routing and data replication between zones, what happens if a zone experiences an outage, failback, and how to configure your resources for availability zone support.
  • Multi-region support covers how to configure multi-region or geo-disaster support, traffic routing and data replication between regions, the region-down experience, failover and failback support, and alternative multi-region support.
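
As a rough illustration of the retry guidance that the transient fault handling sections describe, the following Python sketch retries an operation with capped exponential backoff and jitter. It is a minimal sketch, not the approach of any particular Azure service or SDK: `TransientError`, `call_with_retry`, and the default values are hypothetical names and numbers chosen for this example. Many Azure client libraries include their own configurable retry behavior, so check the service's reliability guide before adding your own.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a fault that is worth retrying."""


def call_with_retry(operation, max_attempts=4, base_delay=0.5,
                    max_delay=8.0, sleep=time.sleep):
    """Call operation(), retrying on TransientError with backoff and jitter.

    All parameter names and defaults are assumptions for illustration.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            # Exponential backoff, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: wait a random time between 0 and the ceiling.
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter is injectable so that the wait can be replaced in tests. This sketch covers retries only; timeouts for each individual call would need to be set on the operation itself.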

Some guides also contain information on:

  • Backup support, such as who controls backups, where they're stored and replicated to, how they can be recovered, and whether they're accessible only within a region or across regions.
  • Service level agreements for availability, including how the expected uptime changes based on the configuration you use.

Reliability guides by service

This section provides links to reliability guidance for many Azure services. Each service guide contains information on how the service supports reliability features.

Note

Some services' documentation hasn't yet been consolidated into the single reliability guide format, or is in the process of being consolidated. These services might have more than one document that provides reliability guidance.

Product Reliability Guide Other Reliability Documentation
Azure AI Health Insights Reliability in Azure AI Health Insights
Azure AI Search Reliability in Azure AI Search
Azure API Center Reliability in Azure API Center
Azure API Management Ensure API Management availability and reliability

How to implement disaster recovery using service backup and restore
Azure App Configuration How does App Configuration ensure high data availability?

Resiliency and disaster recovery
Azure Application Gateway (V2) Autoscaling and High Availability
Azure Application Gateway for Containers Reliability in Azure Application Gateway for Containers
Azure API for FHIR® Disaster recovery for Azure API for FHIR
Azure App Service Reliability in Azure App Service
Azure Backup Reliability in Azure Backup
Azure Batch Reliability in Azure Batch
Azure Bastion Reliability in Azure Bastion
Azure Bot Service Reliability in Azure Bot Service
Azure Cache for Redis Enable zone redundancy for Azure Cache for Redis

Configure passive geo-replication for Premium Azure Cache for Redis instances
Azure Chaos Studio Reliability in Azure Chaos Studio
Azure Communications Gateway Reliability in Azure Communications Gateway
Azure Container Apps Reliability in Azure Container Apps
Azure Container Instances Reliability in Azure Container Instances
Azure Container Registry Enable zone redundancy in Azure Container Registry for resiliency and high availability

Geo-replication in Azure Container Registry
Azure Cosmos DB for NoSQL Reliability in Azure Cosmos DB for NoSQL
Azure Cosmos DB for MongoDB vCore Reliability in Azure Cosmos DB for MongoDB vCore
Azure Cosmos DB for PostgreSQL Availability zone outage resiliency in Azure Cosmos DB for PostgreSQL

High availability in Azure Cosmos DB for PostgreSQL
Azure Databox How can I recover my data if an entire region fails?
Azure Data Explorer Business continuity and disaster recovery overview
Azure Data Factory Reliability in Azure Data Factory
Azure Data Manager for Energy Reliability in Azure Data Manager for Energy
Azure Data Share Disaster recovery for Azure Data Share
Azure Database for MySQL Overview of business continuity with Azure Database for MySQL - Single Server
Azure Database for MySQL - Flexible Server Azure Database for MySQL Flexible Server High availability

Azure Database for MySQL Flexible Server - Restore to latest restore point
Azure Database for PostgreSQL - Flexible Server Reliability in Azure Database for PostgreSQL - Flexible Server
Azure Deployment Environments Reliability in Azure Deployment Environments
Azure Device Registry Reliability in Azure Device Registry
Azure DevOps Data availability
Azure Disk Encryption Redundancy options for managed disks
Azure Disks Best practices for achieving high availability with Azure virtual machines and managed disks
Azure DNS Reliability in Azure DNS
Azure DDoS Protection Reliability in Azure DDoS Protection
Azure Elastic SAN Reliability in Azure Elastic SAN
Azure Event Grid Reliability in Azure Event Grid
Azure Event Hubs Reliability in Azure Event Hubs
Azure ExpressRoute Designing for high availability with ExpressRoute

Designing for disaster recovery with ExpressRoute private peering
Azure Firewall Deploy an Azure Firewall with Availability Zones using Azure PowerShell
Azure Files Choose the right redundancy option

Disaster recovery and failover for Azure Files
Azure Functions Reliability in Azure Functions
Azure Guest Configuration Azure Guest Configuration Availability
Azure Health Data Services: De-identification service (preview) Reliability in Azure Health Data Services: De-Identification service
Azure Health Data Services: Workspace services (FHIR®, DICOM®, MedTech) Business continuity and disaster recovery considerations
Azure HDInsight Reliability in Azure HDInsight
Azure IoT Hub IoT Hub high availability and disaster recovery
Azure Key Vault Azure Key Vault availability and redundancy
Azure Kubernetes Service (AKS) Reliability in Azure Kubernetes Service (AKS)
Azure Load Balancer Reliability in Azure Load Balancer
Azure Logic Apps Reliability in Azure Logic Apps
Azure Machine Learning Service Failover for business continuity and disaster recovery
Azure Media Services High Availability with Media Services and Video on Demand (VOD)
Azure Migrate Does Azure Migrate offer Backup and Disaster Recovery?
Azure Monitor Logs Enhance data and service resilience in Azure Monitor Logs with availability zones

Azure Monitor Logs workspace replication
Azure Notification Hubs Reliability in Azure Notification Hubs
Azure NetApp Files Manage disaster recovery using Azure NetApp Files
Azure Network Watcher Azure Network Watcher service availability and redundancy
Azure Operator Nexus Reliability in Azure Operator Nexus
Azure Private 5G Core Reliability in Azure Private 5G Core
Azure Private Link Azure Private Link availability
Azure Public IP Azure Public IP Availability Zone
Azure Route Server Azure Route Server frequently asked questions (FAQ)
Azure Service Bus Best practices for insulating applications against Service Bus outages and disasters
Azure Service Fabric Deploy an Azure Service Fabric cluster across Availability Zones

Disaster recovery in Azure Service Fabric
Azure SignalR Service Resiliency and disaster recovery in Azure SignalR Service
Azure Site Recovery Set up disaster recovery for Azure VMs
Azure Spring Apps Reliability in Azure Spring Apps
Azure SQL Database Azure SQL Database - High availability

Disaster recovery guidance - Azure SQL Database
Azure SQL Managed Instance Failover groups overview & best practices - Azure SQL Managed Instance
Azure Storage Actions Reliability in Azure Storage Actions
Azure Storage - Blob Storage Choose the right redundancy option

Azure storage disaster recovery planning and failover
Azure Storage Mover Reliability in Azure Storage Mover
Azure Stream Analytics Achieve geo-redundancy for Azure Stream Analytics jobs
Azure Traffic Manager Reliability in Azure Traffic Manager
Azure Virtual Machines Reliability in Azure Virtual Machines
Azure Virtual Machine Image Builder Reliability in Azure Virtual Machine Image Builder
Azure Virtual Machine Scale Sets Reliability in Azure Virtual Machine Scale Sets
Azure Virtual Network Virtual networks and availability zones

Virtual Network – Business Continuity
Azure Virtual WAN How are Availability Zones and resiliency handled in Virtual WAN?

Disaster recovery design
Azure VMware Solution Deploy disaster recovery using VMware HCX
Azure VPN Gateway About zone-redundant virtual network gateway in Azure availability zones

Highly Available cross-premises and VNet-to-VNet connectivity
Azure Web Application Firewall Deploy an Azure Firewall with Availability Zones using Azure PowerShell

How do I achieve a disaster recovery scenario across datacenters by using Application Gateway?
Microsoft Community Training Reliability in Microsoft Community Training
Microsoft Fabric Reliability in Microsoft Fabric
Microsoft Purview Reliability in Microsoft Purview
Sustainability Data Solutions in Fabric Reliability in Sustainability Data Solutions in Fabric