Azure VPN Gateway Connection Traffic suddenly stops

Ming Yu 20 Reputation points
2025-04-30T15:12:24.1133333+00:00

We have an Azure VPN gateway with a connection to an on-premises device.

The configuration has been in place for years and has worked consistently and reliably for a long time.

Recently, we noticed that services deployed in our AKS cluster were failing to connect to the DB when accessing it over the VPN tunnel.

When that happened, we noticed that the traffic across the VPN tunnel literally stopped.

But if we reset the VPN gateway, DB access resumes once the VPN connection is reestablished.

According to the health resource of the VPN gateway, there was no health issue detected while the network traffic over the VPN stopped.

Can anyone help us troubleshoot to see what might have been happening when the VPN traffic suddenly stopped?

I tried to enable diagnostic logs for the VPN gateway following https://learn.microsoft.com/en-us/azure/vpn-gateway/troubleshoot-vpn-with-azure-diagnostics.

I can see that GatewayDiagnosticLog, TunnelDiagnosticLog, IKEDiagnosticLog, etc. are enabled by running the command "az monitor diagnostic-settings list --resource <vpn_gateway_resource_id>".

However, I can't find the corresponding tables in the Log Analytics workspace that the diagnostic settings send to.

Thanks for the help.

Azure VPN Gateway
An Azure service that enables the connection of on-premises networks to Azure through site-to-site virtual private networks.

Accepted answer
  1. Sindhuja Dasari 630 Reputation points Microsoft External Staff Moderator
    2025-05-05T18:02:51.42+00:00

    Hello Ming Yu

    Thanks for your response. If you’re seeing only IKEDiagnosticLog in the AzureDiagnostics table and no other categories such as TunnelDiagnosticLog or GatewayDiagnosticLog, those logs most likely haven’t been generated yet, because of either:

    • no tunnel events or errors to trigger them, or
    • insufficient VPN activity or state changes.
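
    One way to confirm which categories have written any rows at all is to count rows per category. The following is a minimal sketch that runs such a query from the Azure CLI. It assumes the gateway logs land in the AzureDiagnostics table; vpn_logs_by_category is a hypothetical helper name, the workspace GUID is a placeholder you must supply, and depending on your CLI version "az monitor log-analytics query" may require the log-analytics extension.

    ```shell
    # Hypothetical helper: count VPN gateway log rows per category over the last 7 days.
    # $1 = Log Analytics workspace GUID (placeholder you must supply).
    vpn_logs_by_category() {
      az monitor log-analytics query \
        --workspace "$1" \
        --analytics-query 'AzureDiagnostics
          | where ResourceType == "VIRTUALNETWORKGATEWAYS"
          | where TimeGenerated > ago(7d)
          | summarize Rows = count(), Latest = max(TimeGenerated) by Category
          | sort by Latest desc' \
        --output table
    }

    # Usage (placeholder GUID):
    #   vpn_logs_by_category <workspace_guid>
    ```

    A category that is missing from the output has emitted nothing in the last 7 days. The KQL inside the quotes can also be pasted directly into the Log Analytics query editor.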

    Key log categories to focus on when VPN traffic drops to zero

    If traffic drops suddenly, here’s what to look for, in order of relevance:

    TunnelDiagnosticLog

    Primary log for tunnel-level issues:

    • Tunnel up/down events
    • Disconnection or dead-peer detection
    • Negotiation failures beyond the IKE phase

    IKEDiagnosticLog

    Covers IKE control messages and events, including the negotiation handshake (authentication, encryption):

    • Useful for initial connection or rekeying issues.
    • If the tunnel fails to establish at all, this is the key log.

    GatewayDiagnosticLog

    Covers high-level operational events:

    • Gateway resets
    • Configuration changes
    • Health probe failures (can hint at misconfigurations)

    This log is less detailed but still useful for detecting infrastructure issues.

    RouteDiagnosticLog

    Only populated when:

    • BGP is enabled and dynamic routes are being exchanged
    • There are route flaps or issues with learned/published prefixes

    Use this log if your gateway is route-based with BGP (common with ExpressRoute + VPN coexistence).

    P2SDiagnosticLog

    Only relevant if you’re using Point-to-Site (P2S) VPN connections — not applicable for Site-to-Site setups.

    Refer to Troubleshoot-vpn-with-azure-diagnostics, which provides detailed information about these logs.

    What to query when traffic stops:

    1. Query for tunnel status:
    AzureDiagnostics
    | where Category == "TunnelDiagnosticLog"
    | where Message contains "Tunnel is down"
    | sort by TimeGenerated desc

    2. Query for IKE issues (e.g., rekey failures, cert/auth issues):
    AzureDiagnostics
    | where Category == "IKEDiagnosticLog"
    | where Message contains "failure" or Message contains "error"
    | sort by TimeGenerated desc

    3. Check if gateway itself had issues:
    AzureDiagnostics
    | where Category == "GatewayDiagnosticLog"
    | sort by TimeGenerated desc

    If none of these logs show recent activity around the time of the traffic drop, it likely means:

    • No events were triggered (e.g., the tunnel is still technically up, but routes are broken)
    • The issue is on the on-prem side (e.g., firewall, routing, dead peer, etc.)
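
    If the category-specific queries return nothing, it can help to list every gateway log row in a window around the outage, whatever its category. The following is a minimal sketch, assuming the same AzureDiagnostics table and the Message column used in the queries in this answer; vpn_logs_in_window is a hypothetical helper name, and the workspace GUID and timestamps are placeholders.

    ```shell
    # Hypothetical helper: list all VPN gateway log rows in a time window, oldest first.
    # $1 = Log Analytics workspace GUID, $2 = window start, $3 = window end
    # (timestamps in ISO 8601 UTC, e.g. 2025-04-30T14:00:00Z; all placeholders).
    vpn_logs_in_window() {
      vpn_query="AzureDiagnostics
        | where ResourceType == 'VIRTUALNETWORKGATEWAYS'
        | where TimeGenerated between (datetime($2) .. datetime($3))
        | project TimeGenerated, Category, OperationName, Message
        | sort by TimeGenerated asc"
      az monitor log-analytics query \
        --workspace "$1" \
        --analytics-query "$vpn_query" \
        --output table
    }

    # Usage (placeholder values):
    #   vpn_logs_in_window <workspace_guid> <start_utc> <end_utc>
    ```

    An empty result for the whole window supports the reading that the gateway logged no event during the traffic drop.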


    Please don’t forget to close the thread by clicking "Accept the answer" and "Yes" wherever the information provided helps you, as this can be beneficial to other community members.

    1 person found this answer helpful.

1 additional answer

  1. UJTyagi-MSFT 1,010 Reputation points Microsoft Employee
    2025-05-02T10:40:12.3966667+00:00

    @Ming Yu

    Welcome to the Microsoft Q&A Platform. Thank you for reaching out & I hope you are doing well.

    • The individual log categories such as TunnelDiagnosticLog or IKEDiagnosticLog are stored as entries (rows) within the AzureDiagnostics table, not as standalone tables.
    • Hence your KQL should query the AzureDiagnostics table and filter on categories such as TunnelDiagnosticLog or IKEDiagnosticLog.
    • Run the KQL below in your Log Analytics workspace to see if any VPN-related logs exist:
    AzureDiagnostics
    | where ResourceType == "VIRTUALNETWORKGATEWAYS"
    | where Category in ("GatewayDiagnosticLog", "TunnelDiagnosticLog", "RouteDiagnosticLog", "IKEDiagnosticLog")
    | sort by TimeGenerated desc
    
    • If you don't see any logs, make sure the diagnostic settings are configured correctly to send logs to the right Log Analytics workspace.
    • Run the command below in the Azure CLI and note the workspaceId in the output.
        az monitor diagnostic-settings list --resource <vpn_gateway_resource_id>
    • The workspaceId shown should match the workspace you're querying.
    • You can also validate this by going to Azure Monitor > Diagnostics settings > select the right VPN gateway as shown below.
    • Make sure your user/account has Log Analytics Reader or Contributor permission on the workspace.
    • Other factors to consider: if the VPN hasn't experienced any recent disconnections or IKE events, those categories may not have emitted logs yet. Try generating traffic or simulating a disconnection to prompt logs. Also, allow 30-60 minutes for logs to start appearing in the Log Analytics workspace.
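
    The workspace check above can be sketched as a small shell helper. It is only a sketch: vpn_diag_workspaces is a hypothetical name, the resource ID is a placeholder, and it filters the JSON output for workspaceId lines rather than assuming a particular output shape.

    ```shell
    # Hypothetical helper: print the workspaceId line of each diagnostic setting on the gateway.
    # $1 = VPN gateway resource ID (placeholder you must supply).
    vpn_diag_workspaces() {
      az monitor diagnostic-settings list --resource "$1" --output json \
        | grep '"workspaceId"'
    }

    # Usage (placeholder resource ID):
    #   vpn_diag_workspaces <vpn_gateway_resource_id>
    ```

    Each printed line should show the resource ID of the workspace you are querying.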

    If this answer addressed your query, please don’t forget to click "Accept the answer" and up-vote it, which might be beneficial to other community members reading this thread. If you have any further queries, do let us know.

    Regards

    Ujjawal Tyagi

    1 person found this answer helpful.
