429 Rate Limit Errors on GPT-4.1
I am getting 429 Rate Limit errors on an Azure OpenAI gpt-4.1 resource; the details for this resource, as shown in Azure AI Foundry, are:
Rate Limit: 721,000 TPM
Requests: 721 RPM
However, the error below reports a limit of only 30,000 TPM, and a single request for 42,638 tokens is rejected:
status_code: 429, model_name: gpt-4.1, body: {'message': 'Request too large for gpt-4.1 in organization org-<snip> on tokens per min (TPM): Limit 30000, Requested 42638. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.', 'type': 'tokens', 'param': None, 'code': 'rate_limit_exceeded'}
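To make the mismatch concrete, here is a minimal sketch that pulls the two numbers out of the error message and compares them. The helper name parse_tpm_error is made up for illustration, and the regular expression assumes the "Limit N, Requested M" wording shown in the body above; it is not part of any SDK.

```python
import re

# Message text copied from the 429 body above (organization ID left snipped).
message = (
    "Request too large for gpt-4.1 in organization org-<snip> on tokens per min (TPM): "
    "Limit 30000, Requested 42638. The input or output tokens must be reduced "
    "in order to run successfully."
)

def parse_tpm_error(text):
    """Hypothetical helper: return (limit, requested) from the message, or None."""
    match = re.search(r"Limit (\d+), Requested (\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

limit, requested = parse_tpm_error(message)
# The single request exceeds the reported per-minute limit on its own.
print(f"limit={limit} requested={requested} over_by={requested - limit}")
```

So the request is about 12.6K tokens over the 30,000 TPM limit reported in the error, which is far below the 721,000 TPM shown in Azure AI Foundry.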