Hello Sunil Nagireddy,
Thank you for posting your question in the Microsoft Q&A forum.
The error indicates that your Azure OpenAI embeddings deployment is being rate limited (HTTP 429) while processing multiple PDFs, despite their small size. This occurs because the default S0 pricing tier enforces strict call-rate limits.
To resolve this, increase the tokens-per-minute (TPM) quota assigned to your embeddings model deployment from the deployment settings in the Azure portal.
Alternatively, implement client-side retry logic with exponential backoff in your indexing script to handle transient throttling gracefully.
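As a rough illustration of the retry approach, here is a minimal sketch of exponential backoff with jitter. The names `RateLimitError` and `call_with_backoff` are hypothetical and defined locally for the example; in your script you would catch whatever exception your client library raises on HTTP 429 and pass in a function that makes the actual embeddings call.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited (HTTP 429)")
        self.retry_after = retry_after  # seconds suggested by the service, if any


def call_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError as err:
            if attempt == max_retries:
                raise  # out of retries; surface the error to the caller
            # Delay doubles each attempt: base, 2*base, 4*base, ... capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Honor the service's hint when it asks for a longer wait.
            if err.retry_after is not None:
                delay = max(delay, err.retry_after)
            # Up to 10% jitter keeps parallel workers from retrying in lockstep.
            sleep(delay + random.uniform(0, 0.1 * delay))
```

You would wrap each embeddings request, for example `call_with_backoff(lambda: embed(batch))`, where `embed` is your own function. The `sleep` parameter is injectable only so the helper can be tested without real waiting.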
For batch processing, pre-chunk documents into smaller segments before sending them to the embeddings API, which helps reduce concurrent requests. Additionally, verify your text extraction method: tables in PDFs may generate an excessive number of chunks, which can trigger rate limits. If possible, preprocess tables into structured text before embedding.
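One way to keep the number of requests down is to group the pre-chunked text into batches that stay under a token budget and send each batch as a single request. The sketch below is only an illustration: `estimate_tokens` uses a rough assumption of about 4 characters per token rather than a real tokenizer, and the budget value is something you would choose based on your own deployment's quota.

```python
def estimate_tokens(text):
    """Rough heuristic (about 4 characters per token); an assumption, not a tokenizer."""
    return max(1, len(text) // 4)


def batch_chunks(chunks, max_tokens_per_batch, count_tokens=estimate_tokens):
    """Group chunks into batches whose estimated token total stays within the budget."""
    batches, current, current_tokens = [], [], 0
    for chunk in chunks:
        tokens = count_tokens(chunk)
        # Start a new batch when adding this chunk would exceed the budget.
        if current and current_tokens + tokens > max_tokens_per_batch:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(chunk)
        current_tokens += tokens
    if current:
        batches.append(current)
    return batches
```

Note that a single chunk larger than the budget still ends up in a batch of its own, so it is worth checking chunk sizes after table extraction. If you need accurate counts, you can pass your own tokenizer-based function as `count_tokens`.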
For production-scale indexing, consider asynchronous processing with queue-based workflows (e.g., Azure Functions + Storage Queues) to distribute load. Monitor usage via Azure OpenAI’s metrics blade to align capacity with demand. If these steps fail, request a quota increase via the linked form in the error message. Structured batching and tier upgrades typically resolve such throughput issues.
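Whichever queue you choose, the worker that drains it still needs to pace itself against the deployment's TPM quota. The following is a minimal, local sketch of that pacing idea only; it does not use Azure Functions or Storage Queues. The `TokenPacer` class and its methods are hypothetical names for this example, and the TPM value you pass in is whatever quota your deployment actually has.

```python
import time
from collections import deque


class TokenPacer:
    """Tracks tokens sent in the last 60 seconds and reports how long to wait."""

    def __init__(self, tokens_per_minute, clock=time.monotonic):
        self.limit = tokens_per_minute
        self.clock = clock
        self.window = deque()  # (timestamp, tokens) pairs, oldest first
        self.used = 0

    def _expire(self, now):
        # Drop entries that are at least 60 seconds old.
        while self.window and now - self.window[0][0] >= 60:
            _, tokens = self.window.popleft()
            self.used -= tokens

    def wait_time(self, tokens):
        """Seconds to wait before a request of `tokens` fits in the budget."""
        now = self.clock()
        self._expire(now)
        used = self.used
        if used + tokens <= self.limit:
            return 0.0
        # Walk the oldest entries until enough budget would be freed.
        for ts, spent in self.window:
            used -= spent
            if used + tokens <= self.limit:
                return max(0.0, ts + 60 - now)
        return 0.0  # the request alone exceeds the limit; the caller must split it

    def record(self, tokens):
        """Register tokens that were just sent."""
        self.window.append((self.clock(), tokens))
        self.used += tokens
```

A worker would call `wait_time(n)` before each request, sleep for the returned number of seconds, send the request, and then call `record(n)`. This complements server-side throttling rather than replacing it, so retry handling for HTTP 429 is still needed.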
If the above answer helped, please do not forget to "Accept Answer", as this may help other community members who face a similar issue find the information. Your contribution to the Microsoft Q&A community is highly appreciated.