Hi Tanmay Sharma,
Since the node is stuck in the "rebooting" state and cannot be reimaged or deleted due to auto-scaling, try these steps:
Disable Auto-Scaling Temporarily:
az batch pool autoscale disable --pool-id OSAthresholdS_v2
After disabling auto-scale, try deleting the node:
az batch node delete --pool-id OSAthresholdS_v2 --node-list tvmps_8f4c7b22047eb71169466eb201a4437548f517ea99832a76a7dc3a564307646c_d
Once removed, re-enable auto-scaling if needed.
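The sequence above can be sketched as a small Bash script. This is only an illustration, not an official procedure: the `run` helper, the `remove_stuck_node` function, and the `DRY_RUN` switch are hypothetical names added here, and the script assumes you are already authenticated against the Batch account (for example with `az batch account login`). By default it only prints the commands so you can review them first.

```shell
#!/usr/bin/env bash
# Sketch: remove a stuck node from a Batch pool that has auto-scaling enabled.
# POOL_ID and NODE_ID are taken from this thread; run() and DRY_RUN are
# illustrative helpers, not part of the Azure CLI.
set -euo pipefail

POOL_ID="${POOL_ID:-OSAthresholdS_v2}"
NODE_ID="${NODE_ID:-tvmps_8f4c7b22047eb71169466eb201a4437548f517ea99832a76a7dc3a564307646c_d}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

remove_stuck_node() {
  # 1. Disable auto-scaling so nodes can be removed manually.
  run az batch pool autoscale disable --pool-id "$POOL_ID"
  # 2. Ask Batch to remove the stuck node.
  run az batch node delete --pool-id "$POOL_ID" --node-list "$NODE_ID"
  # 3. Check whether the pool is still resizing or has become steady.
  run az batch pool show --pool-id "$POOL_ID" --query allocationState --output tsv
}

remove_stuck_node
```

Set `DRY_RUN=0` to run the commands for real. When the pool reports `steady`, auto-scaling can be turned back on with `az batch pool autoscale enable --pool-id OSAthresholdS_v2 --auto-scale-formula "<your formula>"`, where the formula is whatever the pool used before.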
Manually Resize the Pool (Alternative to Deletion):
az batch pool resize --pool-id OSAthresholdS_v2 --target-dedicated-nodes <new_count>
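Scaling the pool down and back up can get Batch to remove and replace nodes, but Batch decides which nodes to remove, so the stuck node is not guaranteed to be the one replaced. Manual resizing also requires auto-scaling to be disabled first.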
Use the Azure REST API to Remove the Node: If CLI deletion still fails, try calling the Batch Remove Nodes API directly: https://learn.microsoft.com/en-us/rest/api/batchservice/pool/remove-nodes?view=rest-batchservice-2024-07-01&tabs=HTTP
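As a rough sketch of what such a REST call could look like from Bash: the account URL below is a placeholder you must replace, the `api-version` value is an assumption based on the 2024-07-01 documentation linked above, and `build_body`, `remove_nodes_url`, and `SEND` are hypothetical helper names. It also assumes the Batch account accepts Microsoft Entra ID tokens. Nothing is sent unless `SEND=1` is set.

```shell
#!/usr/bin/env bash
# Sketch: call the Batch "Remove Nodes" REST operation directly.
# BATCH_URL is a placeholder; API_VERSION is an assumption (check the linked docs).
set -euo pipefail

BATCH_URL="${BATCH_URL:-https://<account>.<region>.batch.azure.com}"
POOL_ID="${POOL_ID:-OSAthresholdS_v2}"
NODE_ID="${NODE_ID:-tvmps_8f4c7b22047eb71169466eb201a4437548f517ea99832a76a7dc3a564307646c_d}"
API_VERSION="${API_VERSION:-2024-07-01.20.0}"

# JSON request body listing the node to remove.
build_body() {
  printf '{"nodeList": ["%s"]}' "$1"
}

# Full request URL for the removenodes operation on the pool.
remove_nodes_url() {
  printf '%s/pools/%s/removenodes?api-version=%s' "$BATCH_URL" "$POOL_ID" "$API_VERSION"
}

if [ "${SEND:-0}" = "1" ]; then
  # Get a bearer token for the Batch service from the logged-in Azure CLI session.
  TOKEN="$(az account get-access-token --resource https://batch.core.windows.net/ \
    --query accessToken --output tsv)"
  curl --fail --silent --show-error -X POST "$(remove_nodes_url)" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json; odata=minimalmetadata" \
    -d "$(build_body "$NODE_ID")"
else
  # Default: show what would be sent without contacting the service.
  echo "POST $(remove_nodes_url)"
  build_body "$NODE_ID"
  echo
fi
```

Review the printed URL and body first, then rerun with `SEND=1` and a real `BATCH_URL` once they look right.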
For more details, refer to the document below:
Disable Auto-Scaling: https://learn.microsoft.com/en-us/azure/batch/batch-automatic-scaling#disable-autoscale
If you have any further queries, please let us know. We are glad to help.
If it was helpful, please click "Upvote" on this post to let us know.
Thank You.