From your error message "Skill did not execute within the time limit", the problem is almost certainly happening within one of the skills, likely:
-
OcrSkill
(which can be slow on image-heavy PDFs) orMergeSkill
(which depends on large inputs and potentially offset alignment).
This timeout isn't about the total blob size or indexer schedule; it's due to the skill pipeline’s execution duration per document.
For custom skills (`WebApiSkill`), you can raise the timeout to a maximum of 230 seconds by updating the skillset definition through the REST API and setting the `timeout` property on the skill (see *Custom skill timeout* in the docs).
Example:

```json
"timeout": "PT180S"
```

(`PT180S` is an ISO 8601 duration: 3 minutes.)
One caveat: built-in skills such as `OcrSkill` and `MergeSkill` don't expose a `timeout` property; it only applies to custom `WebApiSkill` entries. For the built-in skills, the batch-size change below is the more effective lever.
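If it helps, here's a minimal sketch of applying this through the REST API from Python. The service name, skillset name, and key are placeholders, and I'm assuming the `2023-11-01` API version; adjust to whatever you're on. Also note that GET doesn't return secrets, so if your skillset has a Cognitive Services key attached, re-supply it before the PUT.

```python
import requests

SERVICE = "https://<your-service>.search.windows.net"  # placeholder
SKILLSET = "<your-skillset>"                           # placeholder
HEADERS = {"api-key": "<admin-key>", "Content-Type": "application/json"}
PARAMS = {"api-version": "2023-11-01"}

# GET the current skillset, raise the timeout on custom skills, PUT it back.
url = f"{SERVICE}/skillsets/{SKILLSET}"
skillset = requests.get(url, headers=HEADERS, params=PARAMS).json()

for skill in skillset["skills"]:
    # Only custom Web API skills accept 'timeout' (max "PT230S").
    if skill["@odata.type"] == "#Microsoft.Skills.Custom.WebApiSkill":
        skill["timeout"] = "PT180S"  # ISO 8601 duration: 3 minutes

resp = requests.put(url, headers=HEADERS, params=PARAMS, json=skillset)
resp.raise_for_status()
```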
The indexer processes documents in batches, and if several large or complex documents land in the same batch, the skills pipeline can time out. You can reduce the batch size in the indexer definition:
"parameters": {
"batchSize": 1
}
This slows down processing a bit but improves reliability (see *Indexer parameters* in the docs).
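Along the same lines as the skillset update, a sketch of patching the indexer from Python (same placeholder conventions and assumed API version as above):

```python
import requests

SERVICE = "https://<your-service>.search.windows.net"  # placeholder
INDEXER = "<your-indexer>"                             # placeholder
HEADERS = {"api-key": "<admin-key>", "Content-Type": "application/json"}
PARAMS = {"api-version": "2023-11-01"}

url = f"{SERVICE}/indexers/{INDEXER}"
indexer = requests.get(url, headers=HEADERS, params=PARAMS).json()

# Process one document at a time so a single slow PDF can't stall a whole batch.
parameters = indexer.get("parameters") or {}
parameters["batchSize"] = 1
indexer["parameters"] = parameters

resp = requests.put(url, headers=HEADERS, params=PARAMS, json=indexer)
resp.raise_for_status()
```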
Although your `SplitSkill` handles text splitting, you might consider preprocessing large PDFs (especially scanned or image-heavy files) before upload, splitting them by page or section to reduce the per-document load on the pipeline; see the sketch below.
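For example, a small script along these lines (using the `pypdf` package; the 10-pages-per-chunk default and the file name are just illustrative):

```python
from pathlib import Path
from pypdf import PdfReader, PdfWriter

def split_pdf(src: str, pages_per_chunk: int = 10) -> None:
    """Split a large PDF into smaller files before uploading to blob storage."""
    reader = PdfReader(src)
    stem = Path(src).stem
    total = len(reader.pages)
    for start in range(0, total, pages_per_chunk):
        writer = PdfWriter()
        for i in range(start, min(start + pages_per_chunk, total)):
            writer.add_page(reader.pages[i])
        out_path = f"{stem}_part{start // pages_per_chunk + 1}.pdf"
        with open(out_path, "wb") as f:
            writer.write(f)

split_pdf("big_scanned_report.pdf")  # hypothetical file name
```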
Bulk-uploading many files at once can build up an internal processing queue. And even though deleted blobs aren't re-indexed, their documents can linger in the index unless a deletion detection policy is configured on the data source (for blob sources, change detection via `LastModified` is automatic). To avoid this, make sure deletion detection is set up if you need it, and temporarily pause uploads so the indexer can catch up (see *Change detection and deletion detection policies* in the docs).
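If you're on blob storage with soft delete enabled, one way to wire that up (same placeholders and assumed API version as the earlier sketches; note the `<unchanged>` sentinel, which tells the service to keep the stored connection string):

```python
import requests

SERVICE = "https://<your-service>.search.windows.net"  # placeholder
DATASOURCE = "<your-datasource>"                       # placeholder
HEADERS = {"api-key": "<admin-key>", "Content-Type": "application/json"}
PARAMS = {"api-version": "2023-11-01"}

url = f"{SERVICE}/datasources/{DATASOURCE}"
ds = requests.get(url, headers=HEADERS, params=PARAMS).json()

# Remove index documents for blobs that were soft-deleted in storage,
# instead of leaving stale entries behind.
ds["dataDeletionDetectionPolicy"] = {
    "@odata.type": "#Microsoft.Azure.Search.NativeBlobSoftDeleteDeletionDetectionPolicy"
}
# GET masks the connection string; "<unchanged>" keeps the existing credential.
ds["credentials"] = {"connectionString": "<unchanged>"}

resp = requests.put(url, headers=HEADERS, params=PARAMS, json=ds)
resp.raise_for_status()
```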
Hope this helps; let me know if you have any further queries.