Azure AI Search Indexer Timeout Issue with PDF Document

Su Myat Hlaing 160 Reputation points
2025-04-24T06:00:05.21+00:00

I'm experiencing persistent indexer timeout issues with my Azure AI Search setup. Here's my workflow:

  1. Node server uploads files to Azure Blob Storage
  2. Azure Function triggers automatically on blob upload to index the document
  3. Function executes successfully (confirmed in logs, ~1056ms duration)
  4. However, the indexer consistently times out

Despite resetting and manually running the indexer multiple times, it continues to time out. The document in question is a PDF file, and its content does not show up in Azure AI Search results at all. Interestingly, this setup was working properly on previous days, so I'm not sure whether the timeout is the actual problem or whether something else is happening.

Any troubleshooting suggestions or recommended configuration adjustments would be greatly appreciated.

Thank you!

Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.

1 answer

  1. Bhargavi Naragani 3,165 Reputation points Microsoft External Staff
    2025-04-24T08:40:42.61+00:00

    @Su Myat Hlaing

    From your error message "Skill did not execute within the time limit", the problem is almost certainly happening within one of the skills, likely:

    • OcrSkill (which can be slow on image-heavy PDFs) or MergeSkill (which inserts the OCR text back into the document content at the recorded offsets and can be slow on large inputs).

    This timeout isn't about the total blob size or indexer schedule; it's due to the skill pipeline’s execution duration per document.

    You can raise the timeout for your custom skills (the default is 30 seconds) up to a maximum of 230 seconds by setting the timeout property on the skill in the skillset definition, for example through the REST API. See: Custom skill timeout

    Example:

    "timeout": "PT180S" // ISO 8601 format: 3 minutes
    

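    Note that the timeout property is documented for custom Web API skills (Custom.WebApiSkill). Built-in skills such as OcrSkill and MergeSkill don't expose a configurable timeout, so if one of those is the slow step, the remaining suggestions are more likely to help.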
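To show where the timeout property sits, here is a minimal Python sketch that builds a custom Web API skill definition and checks that the duration stays within the documented 1 to 230 second range. The skill name, URI, inputs, and outputs are placeholders I made up for illustration, and the parser only handles simple PT#M#S durations.

```python
import re

# Documented bounds for the custom Web API skill timeout, in seconds.
MIN_TIMEOUT_S = 1
MAX_TIMEOUT_S = 230


def iso8601_seconds(duration: str) -> int:
    """Convert a simple PT#M#S duration (e.g. 'PT180S', 'PT3M') to seconds."""
    match = re.fullmatch(r"PT(?:(\d+)M)?(?:(\d+)S)?", duration)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {duration!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


def web_api_skill(uri: str, timeout: str = "PT180S") -> dict:
    """Build a custom Web API skill definition with an explicit timeout."""
    total = iso8601_seconds(timeout)
    if not MIN_TIMEOUT_S <= total <= MAX_TIMEOUT_S:
        raise ValueError(f"timeout must be 1-230 seconds, got {total}")
    return {
        "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
        "name": "my-custom-skill",  # hypothetical name
        "uri": uri,                 # hypothetical endpoint
        "timeout": timeout,
        "context": "/document",
        # Placeholder input/output mappings; adjust to your skillset.
        "inputs": [{"name": "text", "source": "/document/content"}],
        "outputs": [{"name": "result", "targetName": "result"}],
    }
```

The returned dictionary would go into the "skills" array of the skillset you send to the service. Validating the duration locally is just a convenience so a typo is caught before the update request.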

    The indexer processes documents in batches. If several large or complex documents are processed in the same batch, the cognitive skills pipeline may time out. You can reduce the batch size in the indexer definition like this:

    "parameters": {
      "batchSize": 1
    }
    

    This slows down processing somewhat but improves reliability. See: Indexer parameters
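As a rough sketch of how that setting fits into a full indexer definition, the Python below adds batchSize to an existing definition without touching the other parameters. The indexer, data source, index, and skillset names are hypothetical placeholders.

```python
import copy
import json


def with_batch_size(indexer: dict, batch_size: int = 1) -> dict:
    """Return a copy of an indexer definition with parameters.batchSize set."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    updated = copy.deepcopy(indexer)
    # Tolerate a missing or null "parameters" section.
    params = updated.get("parameters") or {}
    params["batchSize"] = batch_size
    updated["parameters"] = params
    return updated


# Hypothetical indexer definition; every name here is a placeholder.
indexer = {
    "name": "pdf-indexer",
    "dataSourceName": "pdf-blob-datasource",
    "targetIndexName": "pdf-index",
    "skillsetName": "pdf-skillset",
    "parameters": {"maxFailedItems": 0},
}

# Print the body you would send when updating the indexer.
print(json.dumps(with_batch_size(indexer), indent=2))
```

Working on a copy keeps the original definition around, which makes it easy to revert once the backlog has cleared.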

    Although your SplitSkill handles text splitting, it only runs after the text has been extracted, so it does not reduce the work done by OCR. You might consider preprocessing large PDFs (especially scanned or image-heavy files) before upload, splitting them by page or section to reduce the indexing load per document.
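One possible way to do that preprocessing is sketched below in Python. It assumes the third-party pypdf package is installed (pip install pypdf); the chunk size of 20 pages and the output file naming are arbitrary choices of mine, not anything Azure AI Search requires.

```python
from pathlib import Path


def page_ranges(total_pages: int, pages_per_chunk: int) -> list[range]:
    """Split page indexes 0..total_pages-1 into consecutive chunks."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return [
        range(start, min(start + pages_per_chunk, total_pages))
        for start in range(0, total_pages, pages_per_chunk)
    ]


def split_pdf(source: Path, out_dir: Path, pages_per_chunk: int = 20) -> list[Path]:
    """Write one smaller PDF per chunk of pages and return the new paths."""
    # Imported here so page_ranges() stays usable without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(str(source))
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    chunks = page_ranges(len(reader.pages), pages_per_chunk)
    for index, pages in enumerate(chunks, start=1):
        writer = PdfWriter()
        for page_number in pages:
            writer.add_page(reader.pages[page_number])
        # Hypothetical naming scheme: <original name>-part001.pdf, ...
        target = out_dir / f"{source.stem}-part{index:03d}.pdf"
        with target.open("wb") as handle:
            writer.write(handle)
        written.append(target)
    return written
```

Each part is then uploaded as its own blob, so every indexed document carries a smaller OCR workload. The trade-off is that search results point to parts instead of the original file, so you may want to store the source file name as metadata on each blob.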

    Uploading many files in bulk may also have created an internal processing queue. Deleted blobs are not indexed again, but their metadata might still exist depending on the change detection policy. To avoid this, make sure change tracking is configured correctly (or disabled if you don't need it), and temporarily pause uploads so the indexer can catch up. See: Change detection policies
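To tell whether the indexer has caught up, or which document is failing, you can look at the indexer status. The Python sketch below builds the REST URLs for the status, reset, and run operations and summarizes the lastResult section of a status response. The api-version value, the service name, and the sample response are assumptions for illustration; the sample is shaped like a status response but is not real output from your service.

```python
from urllib.parse import quote

API_VERSION = "2024-07-01"  # assumed; use the version your service targets


def indexer_url(service: str, indexer: str, action: str) -> str:
    """URL for an indexer operation: 'status' (GET), 'reset' or 'run' (POST)."""
    if action not in {"status", "reset", "run"}:
        raise ValueError(f"unknown action: {action!r}")
    return (
        f"https://{service}.search.windows.net/indexers/"
        f"{quote(indexer, safe='')}/{action}?api-version={API_VERSION}"
    )


def summarize_last_run(status: dict) -> dict:
    """Pull the fields that matter for troubleshooting out of a status response."""
    last = status.get("lastResult") or {}
    return {
        "status": last.get("status"),
        "processed": last.get("itemsProcessed", 0),
        "failed": last.get("itemsFailed", 0),
        "errors": [
            (err.get("key"), err.get("errorMessage"))
            for err in (last.get("errors") or [])
        ],
    }


# Hypothetical response body, trimmed to the fields used above.
sample_status = {
    "status": "running",
    "lastResult": {
        "status": "transientFailure",
        "itemsProcessed": 1,
        "itemsFailed": 1,
        "errors": [
            {
                "key": "report.pdf",
                "errorMessage": "Skill did not execute within the time limit",
            }
        ],
    },
}

print(indexer_url("my-search-service", "pdf-indexer", "status"))
print(summarize_last_run(sample_status))
```

Requests to these URLs need an api-key header (or a bearer token) for your search service. Once the last run reports no failed items, it should be safe to resume uploads.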

    Hope this helps, let me know if you have any further queries.

