I am trying to finetune and get better results from Azure Speech to Text

Farida Bharmal 0 Reputation points
2025-05-05T15:58:50.1+00:00

I am using Azure AI Playground to train a custom speech recognition model. I have audio files that I need accurate transcriptions for. However, when I upload my data (a zip of audio files), I get the error "the dataset is not suitable for this operation".

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. JAYA SHANKAR G S 2,525 Reputation points Microsoft External Staff Moderator
    2025-05-08T12:40:44.32+00:00

    Hello @Farida Bharmal ,

    Usually, the error "the dataset is not suitable for this operation" occurs when one or more audio files exceed the maximum length allowed per file.

    So, depending on the type of dataset you are uploading:

    1. Audio data for training or testing: make sure each audio file does not exceed 60 seconds for training or 2 hours for testing.
    2. Audio + human-labeled transcript data for training or testing: make sure each audio file does not exceed 40 seconds for training or 2 hours for testing.
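
One way to check the limits above before uploading is to measure each file's duration locally. The following is a minimal sketch, not part of any Azure tooling: it assumes the zip contains PCM WAV files that Python's standard `wave` module can read, and the function name `find_overlong_wavs` and the 60-second default are illustrative choices only.

```python
import io
import wave
import zipfile


def find_overlong_wavs(zip_path, max_seconds=60.0):
    """Return (name, duration_seconds) for each WAV in the zip longer than max_seconds.

    Hypothetical helper: the limit is a parameter, so pass whichever
    per-file maximum applies to your dataset type.
    """
    too_long = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".wav"):
                continue  # skip transcripts and other non-audio entries
            # Read the entry into memory so the wave module can seek in it.
            with wave.open(io.BytesIO(archive.read(name)), "rb") as wav:
                duration = wav.getnframes() / wav.getframerate()
            if duration > max_seconds:
                too_long.append((name, duration))
    return too_long
```

For example, `find_overlong_wavs("training_audio.zip", max_seconds=40.0)` (the file name is a placeholder) would list any files that need to be split or trimmed before the dataset is uploaded again.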

    Please check this and let us know if you have any further questions.

    Thank you

