
Troubleshoot link - Azure SQL Managed Instance

Applies to: Azure SQL Managed Instance

This article teaches you how to monitor and troubleshoot issues with a link between SQL Server and Azure SQL Managed Instance.

You can check the state of the link with Transact-SQL (T-SQL), Azure PowerShell or the Azure CLI. If you encounter issues, you can use the error codes to troubleshoot the problem.

Many issues with creating the link can be resolved by checking the network between the two instances, and validating that the environment has been properly prepared for the link.

Initial seeding

When establishing a link between SQL Server and Azure SQL Managed Instance, there's an initial seeding phase before data replication starts. The initial seeding phase is the longest and most expensive part of the operation. Once initial seeding completes, data is synchronized, and only subsequent data changes are replicated. The time it takes for the initial seeding to complete depends on the size of the data, the workload intensity on the primary databases, and the speed of the link between the networks of the primary and secondary replicas.

If the link between the two instances is slower than necessary, the time to seed is likely to be noticeably affected. You can use the stated seeding speed, the total size of data, and the link speed to estimate how long the initial seeding phase will take before data replication starts. For example, for a single 100-GB database, the initial seeding phase takes about 1.2 hours if the link can push 84 GB per hour and no other databases are being seeded at the same time. If the link can only transfer 10 GB per hour, seeding a 100-GB database takes about 10 hours. If multiple databases are replicated via multiple links, seeding runs in parallel. Combined with a slow link, the initial seeding phase might then take considerably longer, especially if the parallel seeding of data from all databases exceeds the available link bandwidth.
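The estimate described above is the data size divided by the link throughput. As a minimal sketch, the following T-SQL reproduces the first example. The variable names are illustrative only, and the calculation assumes a single database that has the full link bandwidth to itself:

```sql
-- Hypothetical inputs: replace with your own measurements
DECLARE @database_size_gb decimal(10, 2) = 100;       -- total size of data to seed
DECLARE @link_speed_gb_per_hour decimal(10, 2) = 84;  -- sustained link throughput

-- Seeding time in hours, rounded to one decimal place
SELECT CAST(@database_size_gb / @link_speed_gb_per_hour AS decimal(10, 1))
   AS [Estimated seeding hours];
```

With the values shown, the query returns 1.2. Setting @link_speed_gb_per_hour to 10 returns 10.0. When several databases seed in parallel, treat the result as a lower bound, because the databases share the available bandwidth.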

Important

The initial seeding phase can take days with extremely low-speed or busy links. In this case, creating the link can time out. Creating the link is automatically canceled after 6 days.
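To see whether a long-running seeding operation is still making progress, you can query the automatic seeding statistics on SQL Server. This is a sketch that assumes the link's initial seeding is surfaced through the sys.dm_hadr_physical_seeding_stats dynamic management view, which this article doesn't state explicitly. If the view returns no rows, rely on the link state instead:

```sql
-- Execute on SQL Server while initial seeding is in progress
SELECT
   local_database_name [Database],
   internal_state_desc [Seeding state],
   transferred_size_bytes / 1048576 [Transferred MB],
   database_size_bytes / 1048576 [Database size MB],
   transfer_rate_bytes_per_second / 1048576 [Transfer rate MB per second],
   start_time_utc [Start time UTC],
   estimate_time_complete_utc [Estimated completion UTC]
FROM
   sys.dm_hadr_physical_seeding_stats
```

Comparing the transferred size against the database size over a few runs gives a rough idea of whether seeding can finish before creating the link times out.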

If you run into issues with a link, you can use SQL Server Management Studio (SSMS), Transact-SQL (T-SQL), Azure PowerShell or the Azure CLI to get information about the current state of the link.

Use T-SQL for quick status details about the link, and then use Azure PowerShell or the Azure CLI for comprehensive information about the current state of the link.
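As a minimal sketch of a quick T-SQL check, the following query lists the synchronization state and health of the local database replicas. It's an illustration built on the standard Always On dynamic management views rather than the exact query used by the link tooling, and it assumes you have permission to view server state:

```sql
-- Execute on SQL Server
SELECT
   DB_NAME(drs.database_id) [Database],
   ag.name [Availability group],
   drs.synchronization_state_desc [Synchronization state],
   drs.synchronization_health_desc [Synchronization health],
   drs.is_suspended [Is suspended]
FROM
   sys.dm_hadr_database_replica_states drs
   JOIN sys.availability_groups ag
   ON drs.group_id = ag.group_id
WHERE
   drs.is_local = 1
```

A replica that isn't reported as healthy, or that is suspended, is a signal to collect the detailed link state with Azure PowerShell or the Azure CLI.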

Link monitoring is available starting with SQL Server Management Studio (SSMS) 21.0 (preview).

To check the link state in SSMS, follow these steps:

  1. Connect to a replica that hosts the link.

  2. In Object Explorer, expand Always On High Availability, and then expand Availability Groups.

  3. Right-click the name of the link, and then select Properties to open the Link properties window:

    Screenshot of the right-click menu on a link in SSMS, with properties highlighted.

  4. The Link properties window displays useful information about the link, such as replica information, link state, and the endpoint certificate expiration date:

    Screenshot of the link properties window in SSMS.

The replicaState value describes the current state of the link. If the state also includes Error, then an error occurred during the operation listed in the state. For example, LinkCreationError indicates that an error occurred while creating the link.

Some possible replicaState values are:

  • CreatingLink: Initial seeding
  • LinkSynchronizing: Data replication is in progress
  • LinkFailoverInProgress: Failover is in progress

For a complete list of link state properties, review the Distributed Availability Groups - GET REST API command.

There are two distinct categories of errors you can encounter when using the link - errors when you try to initialize the link, and errors when you try to create the link.

The following error can occur when initializing a link (Link state: LinkInitError):

The following errors can occur when creating a link (Link state: LinkCreationError):

  • Error 41977: The target database isn't responsive. Check link parameters and try again.

  • Premature log truncation: If the transaction log is truncated before the initial seeding finishes, you are likely to see one of the following errors:

    • Error 1408: The remote copy of database "%.*ls" is not recovered far enough to enable database mirroring or to join it to the availability group.
    • Error 1412: The remote copy of database "%.*ls" has not been rolled forward to a point in time that is encompassed in the local copy of the database log.

    To resolve this issue, you must drop and recreate the link.
    To avoid this issue, pause transaction log backups on SQL Server for the database being replicated during the initial seeding phase.
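Before you start seeding, it can help to confirm whether log backups are still being taken for the database. The following is a minimal sketch that assumes backup history is recorded in msdb on the SQL Server instance. <DatabaseName> is a placeholder for the database being replicated:

```sql
-- Execute on SQL Server
DECLARE @database_name sysname = '<DatabaseName>'
SELECT TOP (10)
   database_name [Database],
   backup_start_date [Log backup start],
   backup_finish_date [Log backup finish]
FROM
   msdb.dbo.backupset
WHERE
   database_name = @database_name AND type = 'L'  -- 'L' = transaction log backup
ORDER BY
   backup_finish_date DESC
```

If recent rows keep appearing while seeding is in progress, a scheduled job or backup tool is still taking log backups and should be paused until seeding completes.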

Inconsistent state after forced failover

Following a forced failover, you might encounter a split-brain scenario where both replicas are in the primary role, leaving the link in an inconsistent state. This can happen if you fail over to the secondary replica during a disaster, and then the primary replica comes back online.

First, confirm you're in a split-brain scenario. You can do so by using SQL Server Management Studio (SSMS) or Transact-SQL (T-SQL).

Connect to both SQL Server and SQL Managed Instance in SSMS, and then in Object Explorer, expand Availability replicas under the Availability group node in Always On High Availability. If two different replicas are listed as (Primary), you're in a split-brain scenario.

Alternatively, you can run the following T-SQL script on both SQL Server and SQL Managed Instance to check the role of the replicas:

-- Execute on SQL Server and SQL Managed Instance 
USE master
DECLARE @link_name varchar(max) = '<DAGName>'
SELECT
   ag.name [Link name], 
   rs.role_desc [Link role] 
FROM
   sys.availability_groups ag 
   JOIN sys.dm_hadr_availability_replica_states rs 
   ON ag.group_id = rs.group_id 
WHERE 
   rs.is_local = 1 AND ag.is_distributed = 1 AND ag.name = @link_name 
GO

If both instances list PRIMARY in the Link role column, you're in a split-brain scenario.

To resolve the split-brain state, first take a backup on whichever replica was the original primary. If the original primary was SQL Server, take a tail log backup. If the original primary was SQL Managed Instance, take a copy-only full backup. After the backup completes, set the distributed availability group to the secondary role on the replica that was the original primary and will now be the new secondary.

For example, suppose that during a true disaster you forced a failover of your SQL Server workload to Azure SQL Managed Instance, and you intend to continue running your workload on SQL Managed Instance. Take a tail log backup on SQL Server, and then set the distributed availability group to the secondary role on SQL Server, as in the following example:

--Execute on SQL Server 
USE master
ALTER AVAILABILITY GROUP [<DAGName>] 
SET (ROLE = SECONDARY) 
GO 

Next, execute a planned manual failover from SQL Managed Instance to SQL Server by using the link, as in the following example:

--Execute on SQL Managed Instance 
USE master
ALTER AVAILABILITY GROUP [<DAGName>] FAILOVER 
GO 

Expired certificate

It's possible for the certificate used for the link to expire. If the certificate expires, the link fails. To resolve this issue, rotate the certificate.
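To check how close a certificate is to expiring, you can query the certificate metadata on SQL Server. This sketch assumes the link uses the certificate bound to the database mirroring endpoint on SQL Server. It doesn't cover the SQL Managed Instance certificate, so also check the endpoint certificate expiration date shown in the Link properties window in SSMS:

```sql
-- Execute on SQL Server
USE master
SELECT
   c.name [Certificate name],
   c.expiry_date [Expiry date],
   DATEDIFF(DAY, SYSUTCDATETIME(), c.expiry_date) [Days until expiry]
FROM
   sys.certificates c
   JOIN sys.database_mirroring_endpoints e
   ON e.certificate_id = c.certificate_id
```

A small or negative value in the Days until expiry column indicates that the certificate needs to be rotated.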

Test network connectivity

Bidirectional network connectivity between SQL Server and SQL Managed Instance is necessary for the link to work. After you open ports on the SQL Server side and configure an NSG rule on the SQL Managed Instance side, test connectivity by using either SQL Server Management Studio (SSMS) or Transact-SQL.
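Before testing the network path, it can be useful to confirm that the endpoint on SQL Server is started and to note which port it listens on, so that you know which port the firewall must allow. This is a minimal sketch that assumes the link uses the database mirroring endpoint on SQL Server:

```sql
-- Execute on SQL Server
SELECT
   name [Endpoint name],
   port [Port],
   state_desc [State]
FROM
   sys.tcp_endpoints
WHERE
   type_desc = 'DATABASE_MIRRORING'
```

If the query returns no rows, or the state isn't STARTED, fix the endpoint before troubleshooting the network.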

Test the network by creating a temporary SQL Agent job on both SQL Server and SQL Managed Instance to check the connection between the two instances. When you use Network Checker in SSMS, the job is automatically created for you, and deleted after the test completes. You need to manually delete the SQL Agent job if you test your network by using T-SQL.

Note

Running PowerShell scripts from SQL Server Agent isn't currently supported for SQL Server on Linux, so you can't currently run Test-NetConnection from a SQL Server Agent job on SQL Server on Linux.

To use the SQL Agent to test network connectivity, the following requirements must be met:

  • The user doing the test must have permissions to create a job on both SQL Server and SQL Managed Instance, either as a member of the sysadmin role or as a member of the SQLAgentOperatorRole role in msdb.
  • The SQL Server Agent service must be running on SQL Server. Since the Agent is on by default on SQL Managed Instance, no additional action is necessary.
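The requirements above can be checked with a few built-in functions and views. The following is a sketch rather than an official prerequisite check, and it assumes the current login can view server state on SQL Server:

```sql
-- Execute on both SQL Server and SQL Managed Instance
-- A value of 1 in either column means the current login can create the job.
-- Members of sysadmin connect as dbo, so the second column can be 0 for them.
USE msdb
SELECT
   IS_SRVROLEMEMBER('sysadmin') [Is sysadmin],
   IS_MEMBER('SQLAgentOperatorRole') [Is SQLAgentOperatorRole member]
GO

-- Execute on SQL Server only: confirm the SQL Server Agent service is running
SELECT
   servicename [Service],
   status_desc [Status]
FROM
   sys.dm_server_services
WHERE
   servicename LIKE 'SQL Server Agent%'
GO
```

The Agent service is ready for the connectivity test when its status is Running.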

To test network connectivity between SQL Server and SQL Managed Instance in SSMS, follow these steps:

  1. Connect to the instance that will be the primary replica in SSMS.

  2. In Object Explorer, expand databases, and right-click the database you intend to link with the secondary. Select Tasks > Azure SQL Managed Instance link > Test Connection to open the Network Checker wizard:

    Screenshot of Object Explorer in SSMS, with Test Connection selected in the database link right-click menu.

  3. Select Next on the Introduction page of the Network Checker wizard.

  4. If all requirements are met on the Prerequisites page, select Next. Otherwise, resolve any unmet prerequisites, and then select Re-run Validation.

  5. On the Login page, select Login to connect to the other instance that will be the secondary replica. Select Next.

  6. Check details on the Specify Network Options page and provide an IP address, if necessary. Select Next.

  7. On the Summary page, review the actions the wizard takes and then select Finish to test the connection between the two replicas.

  8. Review the Results page to validate connectivity exists between the two replicas, and then select Close to finish.

Caution

Proceed with the next steps only if you've validated network connectivity between your source and target environments. Otherwise, troubleshoot network connectivity issues before proceeding.

For more information on the link feature, review the following resources: