What’s the Best Way to Handle Interruptions in Azure Communication Services?
Hi all,
We’re building a voice-based AI assistant using Azure Communication Services (ACS), and we're exploring how to enable seamless interruption handling during calls.
Use Case:
We want the assistant to speak using TTS (e.g., play_media(...)) and be interrupted mid-sentence if the user begins speaking, just like a natural phone conversation. This is essential for delivering a fluid and responsive experience in real-world scenarios like customer support or healthcare coordination.
Questions:
What are the recommended patterns or best practices in ACS for enabling interruptible speech playback?
Is there any way to implement barge-in-like behavior using the Python SDK or REST APIs?
Are there any newer or upcoming versions of azure-communication-callautomation that introduce native support for this (e.g., play_and_recognize() with bargeInAllowed)?
Are there any working examples or documented approaches for building responsive, real-time voice interactions with interruption support in ACS?
We’ve experimented with calling start_recognizing_media(...) in parallel with play_media(...) and monitoring the resulting events to simulate interruption, but that seems limited or unsupported in the current SDK version (1.4.0b1).
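To make the behavior we're after concrete, here is a minimal, SDK-independent sketch of the barge-in pattern: playback and speech detection run concurrently, and playback is cancelled as soon as speech is detected. Everything in it is a hypothetical stand-in (play_prompt, wait_for_speech, speak_with_barge_in, and the timings are assumptions for illustration); none of it is an ACS call.

```python
import asyncio


async def play_prompt(chunks, spoken):
    """Hypothetical stand-in for TTS playback: 'plays' one chunk at a time."""
    for chunk in chunks:
        spoken.append(chunk)
        await asyncio.sleep(0.05)  # simulated playback time per chunk


async def wait_for_speech(delay):
    """Hypothetical stand-in for a 'caller started speaking' event."""
    await asyncio.sleep(delay)
    return "speech_detected"


async def speak_with_barge_in(chunks, speech_delay):
    """Run playback and detection concurrently; stop playback on speech."""
    spoken = []
    playback = asyncio.create_task(play_prompt(chunks, spoken))
    listener = asyncio.create_task(wait_for_speech(speech_delay))

    # Whichever finishes first decides the outcome.
    done, pending = await asyncio.wait(
        {playback, listener}, return_when=asyncio.FIRST_COMPLETED
    )
    interrupted = listener in done and playback not in done

    # Cancel the loser: the prompt on barge-in, or the listener otherwise.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return interrupted, spoken


def run_demo(speech_delay):
    """Simulate a caller who starts speaking after `speech_delay` seconds."""
    chunks = ["Hello,", "how", "can", "I", "help", "you", "today?"]
    return asyncio.run(speak_with_barge_in(chunks, speech_delay))
```

Calling run_demo(0.12) simulates a caller who speaks partway through the prompt, so only the first few chunks are played; run_demo(5.0) lets the prompt finish uninterrupted. How play_prompt and wait_for_speech would map onto the ACS SDK is exactly the open question.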
We’re looking for Azure’s current and future guidance on handling this pattern reliably.
Thanks in advance!