How do I configure an AudioConfig for audio without storing it locally, on the server, or in temporary storage?

Thomas Pineda Tamayo 0 Reputation points
2025-05-06T16:17:11.0533333+00:00

I am using Python to build a Django application that evaluates the pronunciation of long audio recordings (about 10 minutes). I need to evaluate the pronunciation without saving the audio on my local machine or on the server (once I deploy it), but I can't figure out how to process it in memory or as a stream, without any storage.

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. Pavankumar Purilla 7,100 Reputation points Microsoft External Staff Moderator
    2025-05-06T20:10:40.0466667+00:00

    Hi Thomas Pineda Tamayo,

    To evaluate the pronunciation of long audio files (~10 minutes) in a Django application without saving the audio to local or server storage, you can use the Speech SDK's PushAudioInputStream. Instead of writing the uploaded file to disk, read it (from request.FILES) in chunks and write each chunk into the PushAudioInputStream, which then acts as the audio source for the recognizer. In Python you create the AudioConfig with speechsdk.audio.AudioConfig(stream=push_stream), rather than a from_stream_input call (that naming belongs to the C#, Java, and JavaScript SDKs), pass it to the SpeechRecognizer, and apply a PronunciationAssessmentConfig to the recognizer. For audio of this length, use continuous recognition, because single-shot recognition returns after the first utterance.

    Two details to check. By default the push stream expects 16 kHz, 16-bit, mono PCM, so pass an AudioStreamFormat if your audio differs. Django, by default, spools uploads larger than 2.5 MB to a temporary file, so raise FILE_UPLOAD_MAX_MEMORY_SIZE if the upload must stay in memory. With those in place, the audio is processed as it is pushed, with no temporary files or external storage, as long as sufficient memory is available.
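A minimal sketch of the push-stream approach described above, not a definitive implementation. It assumes the upload is 16 kHz, 16-bit, mono PCM (a WAV header, if present, is pushed along with the samples), that the locale is en-US, and that a reference text is available. The names `pump_audio`, `assess_pronunciation`, and `CHUNK_SIZE` are hypothetical helpers introduced for illustration; `key` and `region` stand in for your own Speech resource credentials.

```python
import threading

CHUNK_SIZE = 32 * 1024  # hypothetical chunk size; tune as needed


def pump_audio(uploaded_file, push_stream, chunk_size=CHUNK_SIZE):
    """Copy a file-like object into a push stream chunk by chunk.

    Closes the push stream at the end so the recognizer sees end-of-audio.
    Returns the number of bytes written.
    """
    total = 0
    while True:
        chunk = uploaded_file.read(chunk_size)
        if not chunk:
            break
        push_stream.write(chunk)
        total += len(chunk)
    push_stream.close()
    return total


def assess_pronunciation(uploaded_file, reference_text, key, region):
    """Stream an uploaded file to Azure Speech and collect accuracy scores."""
    # Imported here so the chunking helper can be used without the SDK.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)

    # Assumption: raw 16 kHz, 16-bit, mono PCM audio.
    stream_format = speechsdk.audio.AudioStreamFormat(
        samples_per_second=16000, bits_per_sample=16, channels=1
    )
    push_stream = speechsdk.audio.PushAudioInputStream(stream_format=stream_format)
    audio_config = speechsdk.audio.AudioConfig(stream=push_stream)

    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config,
        audio_config=audio_config,
        language="en-US",  # assumption: adjust to the speaker's locale
    )
    pron_config = speechsdk.PronunciationAssessmentConfig(
        reference_text=reference_text,
        grading_system=speechsdk.PronunciationAssessmentGradingSystem.HundredMark,
        granularity=speechsdk.PronunciationAssessmentGranularity.Phoneme,
    )
    pron_config.apply_to(recognizer)

    scores = []
    done = threading.Event()

    def on_recognized(evt):
        # One event fires per recognized utterance during continuous recognition.
        if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
            result = speechsdk.PronunciationAssessmentResult(evt.result)
            scores.append(result.accuracy_score)

    recognizer.recognized.connect(on_recognized)
    recognizer.session_stopped.connect(lambda evt: done.set())
    recognizer.canceled.connect(lambda evt: done.set())

    recognizer.start_continuous_recognition()
    pump_audio(uploaded_file, push_stream)  # closing the stream ends the session
    done.wait()
    recognizer.stop_continuous_recognition()
    return scores
```

In a Django view this might be called as `assess_pronunciation(request.FILES["audio"], reference_text, key, region)`, where the `"audio"` field name is also an assumption. The call blocks until the service has processed the whole recording, so for 10-minute files it may be worth running it outside the request/response cycle.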

    I hope this information helps.

