Azure AI Language Retry policy: Reactive or Proactive

Umer Rashid 150 Reputation points
2025-04-17T13:16:39.2666667+00:00

Hello,

As I understand it, Azure AI Language can absorb short bursts of requests.

I am wondering whether I should enforce a retry policy proactively, whenever my own count says the API rate limit has been reached, or let Azure absorb short bursts and apply a retry policy with exponential backoff only after receiving an HTTP 429 error.

Please let me know the recommended approach for handling rate limits.

Regards,

Umer

Azure AI Language
An Azure service that provides natural language capabilities including sentiment analysis, entity extraction, and automated question answering.

Accepted answer
  1. Manas Mohanty 3,545 Reputation points Microsoft External Staff
    2025-04-17T16:46:06.7+00:00

    Hi Umer Rashid,

    The default retry policy, or a custom one, retries automatically when the client receives a 429 rate-limit response. If you instead want to wait until the start of the next minute, based on the 429 error you receive, see the code below.

    For example:

    If you send 300 requests initially, another 300 requests 5 seconds later, and then 400 requests 5 seconds after that, you will have sent a total of 1,000 requests within a 10-second window. Assuming a limit of 1,000 requests per minute, you then have to wait 50 seconds, as the error message will suggest.

    You can then call the client again, layered with a custom retry, after those 50 seconds.
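
The arithmetic in that example can be checked with a few lines of Python. This is only a sketch: it assumes a fixed 60-second window that starts with the first request, and the helper name `seconds_until_window_resets` is made up for illustration. It is not part of any Azure SDK.

```python
def seconds_until_window_resets(seconds_since_window_start, window_seconds=60.0):
    """Time left in a fixed rate-limit window (assumed model, not an Azure API)."""
    return max(0.0, window_seconds - seconds_since_window_start)


# Bursts from the example: (offset in seconds from the first request, request count)
bursts = [(0, 300), (5, 300), (10, 400)]

total_requests = sum(count for _, count in bursts)
last_offset = bursts[-1][0]
wait = seconds_until_window_resets(last_offset)

print(f"{total_requests} requests sent, wait {wait:.0f} seconds")
```

Under these assumptions the last burst lands 10 seconds into the window, which leaves 50 seconds before the window resets.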

    Assumed error response:

    {'error': {'code': '429', 'message': 'Rate limit is exceeded. Try again in 50 seconds.'}}

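Pulling the wait time out of a message like the assumed response above can be sketched with a regular expression. The helper name `parse_wait_seconds` is hypothetical, and the message wording is taken from the assumed response, so a real service message may need a different pattern.

```python
import re
from typing import Optional


def parse_wait_seconds(message: str) -> Optional[int]:
    """Return the number of seconds named in the message, or None if absent."""
    match = re.search(r"Try again in (\d+) seconds?", message)
    return int(match.group(1)) if match else None


# The assumed error response from the answer
error = {'error': {'code': '429', 'message': 'Rate limit is exceeded. Try again in 50 seconds.'}}

wait = parse_wait_seconds(error['error']['message'])
print(wait)
```

Returning `None` when the pattern does not match lets the caller fall back to another source for the delay, such as the `Retry-After` header.
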
    The code below extracts the wait time (50) from the error message above, sleeps for that many seconds, and then retries until the maximum number of attempts has been reached.

    import re
    import time
    from azure.core.pipeline.policies import RetryPolicy
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential
    from azure.core.exceptions import HttpResponseError
    
    # Placeholders: replace with your own resource endpoint and key
    endpoint = "https://<your-resource-name>.cognitiveservices.azure.com/"
    api_key = "<your-api-key>"
    
    MAX_ATTEMPTS = 5  # Attempts made by the outer loop in analyze_text
    
    # Define the retry policy
    retry_policy = RetryPolicy(
        retry_total=5,  # Total number of retries
        retry_backoff_factor=0.8,  # Factor to calculate the backoff time
        retry_backoff_max=16,  # Maximum backoff time in seconds
        retry_mode='exponential'  # Use exponential backoff
    )
    
    # Create the Text Analytics client with the retry policy
    client = TextAnalyticsClient(
        endpoint=endpoint,
        credential=AzureKeyCredential(api_key),
        retry_policy=retry_policy
    )
    
    def analyze_text(client, documents):
        for attempt in range(MAX_ATTEMPTS):
            try:
                return client.analyze_sentiment(documents=documents)
            except HttpResponseError as e:
                if e.status_code != 429:
                    raise
                # Prefer the wait time stated in the error message, e.g. "... in 50 seconds."
                match = re.search(r"in (\d+) seconds", e.message or "")
                if match:
                    retry_after = int(match.group(1))
                else:
                    # Otherwise fall back to the Retry-After header (default: 1 second)
                    headers = e.response.headers if e.response is not None else {}
                    retry_after = int(headers.get("Retry-After", 1))
                print(f"Rate limit exceeded. Retrying after {retry_after} seconds.")
                time.sleep(retry_after)
        # Every attempt was rate limited
        raise RuntimeError(f"Still rate limited after {MAX_ATTEMPTS} attempts.")
    
    # Example usage
    documents = ["Hello world!"]
    response = analyze_text(client, documents)
    print(response)

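To see why waiting for the server's stated delay can matter, here is a rough sketch of the delays that a generic exponential backoff would produce with the settings used in the code above (factor 0.8, cap 16 seconds, 5 retries). The formula `factor * 2**n` is a common textbook form and an assumption here; the exact timing inside the SDK's `RetryPolicy` may differ.

```python
def backoff_schedule(retries, factor, max_delay):
    """Generic exponential backoff delays: factor * 2**n, capped at max_delay."""
    return [min(max_delay, factor * 2 ** n) for n in range(retries)]


delays = backoff_schedule(retries=5, factor=0.8, max_delay=16)
total_wait = sum(delays)

print(delays)
print(f"Total wait across all retries: {total_wait:.1f} seconds")
```

Under this assumed formula the five delays add up to about 24.8 seconds, which is less than the 50 seconds in the example error. A backoff-only loop could therefore use up all of its retries before the rate-limit window resets, which is the case the explicit sleep is meant to cover.
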
    I hope this adds more clarity.

    Thank you.
