VoiceRAG: An App Pattern for RAG + Voice Using Azure AI Search and the GPT-4o Realtime API for Audio


Microsoft Foundry Blog


By pablocastro

The new Azure OpenAI gpt-4o-realtime-preview model opens the door for even more natural application user interfaces with its speech-to-speech capability.

This new voice-based interface also brings an interesting new challenge with it: how do you implement retrieval-augmented generation (RAG), the prevailing pattern for combining language models with your own data, in a system that uses audio for input and output?

In this blog post we present a simple architecture for voice-based generative AI applications that enables RAG on top of the real-time audio API, with full-duplex audio streaming to and from client devices, while securely handling access to both the model and the retrieval system.

Architecting for real-time voice + RAG

Supporting RAG workflows

We use two key building blocks to make voice work with RAG:

  • Function calling: the gpt-4o-realtime-preview model supports function calling, allowing us to include “tools” for searching and grounding in the session configuration. The model listens to audio input and directly invokes these tools with parameters that describe what it’s looking to retrieve from the knowledge base.
  • Real-time middle tier: we need to separate what needs to happen in the client from what cannot be done client-side. The full-duplex, real-time audio content needs to go to/from the client device’s speakers/microphone. On the other hand, the model configuration (system message, max tokens, temperature, etc.) and access to the knowledge base for RAG needs to be handled on the server, since we don’t want the client to have credentials for these resources, and don’t want to require the client to have network line-of-sight to these components. To accomplish this, we introduce a middle tier component that proxies audio traffic, while keeping aspects such as model configuration and function calling entirely on the backend.
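
To make the configuration-override idea concrete, here is a minimal sketch of how a middle tier might force server-side settings onto anything the client sends before forwarding it to the model endpoint. The config values and the `SERVER_CONFIG` name are hypothetical; the message shape follows the realtime protocol's `session.update` message.

```python
import json

# Kept server-side; clients can never change these, because the middle
# tier overwrites whatever the client sent before forwarding upstream.
SERVER_CONFIG = {
    "instructions": "Answer only from the knowledge base.",  # system message
    "temperature": 0.6,
    "max_response_output_tokens": 1024,
}

def rewrite_client_message(raw: str) -> str:
    """Force server-side settings onto any session.update from the client."""
    msg = json.loads(raw)
    if msg.get("type") == "session.update":
        msg.setdefault("session", {}).update(SERVER_CONFIG)
    return json.dumps(msg)
```

Non-configuration traffic (audio buffers, for example) passes through unchanged, so the client keeps the full-duplex audio experience without ever holding credentials or settings.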

These two building blocks work in coordination: the real-time API knows not to move a conversation forward while there are outstanding function calls. When the model needs information from the knowledge base to respond to input, it emits a “search” function call. We turn that function call into an Azure AI Search hybrid query (vector + keyword + semantic reranking), get the content passages that best match what the model needs to know, and send them back to the model as the function’s output. Once the model sees that output, it responds over the audio channel, moving the conversation forward.

A critical element in this picture is fast and accurate retrieval. The search call happens between the user turn and the model response in the audio channel, a latency-sensitive point in time. Azure AI Search is the perfect fit for this, with its low latency for vector and hybrid queries and its support for semantic reranking to maximize relevance of responses.

Generating Grounded Responses

Using function calling addresses how to coordinate search queries against the knowledge base, but this inversion of control creates a new problem: we don’t know which of the retrieved passages were used to ground each response. In a typical RAG application that interacts with the model API in terms of text, we can instruct the model in the prompt to produce citations with a special notation that the UX renders appropriately; but when the model is generating audio, we don’t want it to say file names or URLs out loud. Since it’s critical for generative AI applications to be transparent about what grounding data was used to respond to any given input, we need a different mechanism for identifying and showing citations in the user experience.

We also use function calling to accomplish this. We introduce a second tool called “report_grounding”, and as part of the system prompt we include instructions along these lines:

Use the following step-by-step instructions to respond with short and concise answers using a knowledge base:

Step 1 - Always use the 'search' tool to check the knowledge base before answering a question.

Step 2 - Always use the 'report_grounding' tool to report the source of information from the knowledge base.

Step 3 - Produce an answer that's as short as possible. If the answer isn't in the knowledge base, say you don't know.

We experimented with different ways to formulate this prompt and found that explicitly listing this as a step-by-step process is particularly effective.
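
The two tools might be declared along these lines when configuring the session; treat the descriptions and field names below as illustrative, since the exact schemas in the sample repo may differ.

```python
# Hypothetical function-tool schemas registered in the realtime session config.
SEARCH_TOOL = {
    "type": "function",
    "name": "search",
    "description": "Search the knowledge base before answering any question.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query, stated as keywords",
            },
        },
        "required": ["query"],
    },
}

REPORT_GROUNDING_TOOL = {
    "type": "function",
    "name": "report_grounding",
    "description": "Report which knowledge-base sources were used in the answer.",
    "parameters": {
        "type": "object",
        "properties": {
            "sources": {
                "type": "array",
                "items": {"type": "string"},
                "description": "IDs of the source chunks actually used",
            },
        },
        "required": ["sources"],
    },
}
```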

With these two tools in place, we now have a system that streams audio to the model, lets the model call back into backend app logic both to search and to tell us which pieces of grounding data were used, and then streams audio back to the client along with extra messages that carry the grounding information (in the UI, citations to documents appear as the answer is spoken).
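
A sketch of how a “report_grounding” handler could both answer the model (so the conversation moves forward) and produce the extra message the middle tier forwards to the client for citation rendering. The `extension.middle_tier_tool_response` message type is an app-specific extension used here for illustration, not part of the realtime protocol.

```python
import json

def handle_report_grounding(tool_args: str):
    """Return (tool_output, client_message) for a report_grounding call."""
    sources = json.loads(tool_args)["sources"]
    deduped = list(dict.fromkeys(sources))  # keep order, drop repeats
    tool_output = json.dumps({"sources": deduped})  # sent back to the model
    client_message = {  # forwarded out-of-band to the client UI
        "type": "extension.middle_tier_tool_response",
        "tool_name": "report_grounding",
        "sources": deduped,
    }
    return tool_output, client_message
```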

Using any Real-Time API-enabled client

Note that the middle tier completely suppresses tools-related interactions and overrides system configuration options but otherwise maintains the same protocol. This means that any client that works directly against the Azure OpenAI API will “just work” against the real-time middle tier, since the RAG process is entirely encapsulated on the backend.

Creating secure generative AI apps

We’re keeping all configuration elements (system prompt, max tokens, etc.) and all credentials (for Azure OpenAI, Azure AI Search, etc.) in the backend, securely separated from clients. Furthermore, Azure OpenAI and Azure AI Search include extensive capabilities to harden the backend further, including network isolation so that the API endpoints of both models and search indexes are not reachable from the internet, Microsoft Entra ID to avoid key-based authentication across services, and options for multiple layers of encryption for indexed content.


Try it today

The code and data for everything discussed in this blog post are available in this GitHub repo: Azure-Samples/aisearch-openai-rag-audio. You can use it as-is, or you can easily swap in your own data and talk to it.


The code in the repo above and the description in this blog post are more of a pattern than a specific solution. You’ll need to experiment to get the prompts right, perhaps expand the RAG workflow, and certainly assess it for security and AI safety.


To learn more about the Azure OpenAI gpt-4o-realtime-preview model and real-time API you can go here. For Azure AI Search you’ll find plenty of resources here, and the documentation here.


Looking forward to seeing new “talk to your data” scenarios!

Updated Oct 02, 2024

Version 4.0


15 Comments

  • ppribs's avatar

    Could the Realtime Audio API eventually integrate with a scripting tool? That would create a modern version of remote-assistance tools, leveraging a call (for example via Skype) from the computer's user or an authorized technician, where facial recognition, phone-number recognition, and voice recognition would serve as the forms of authentication. The computer would be the user's assistant or a kind of secretary: not replacing jobs, but letting the human secretary/assistant follow along in meetings and participate more actively, instead of doing work that an AI could eventually handle. Examples would be voice-commanded image retouching, small edits to presentations, updates, corrections, searching files stored locally on the computer, sending emails, etc. It would also make it easier to fix system bugs and to run computer diagnostics for technicians, where the AI would be trained by Apple's own most qualified professionals, and the assisting technician would only need to run the analysis, understand it, and explain the result to the customer. It would be interesting to see this integration.

  • EricG's avatar

    Hello,

    Thank you very much for this VoiceRAG example.

    I chose the Dev Container method and tried to deploy to our Azure Container Apps. I got this error:

    "ERROR: error executing step command 'provision': deployment failed: error deploying infrastructure: deploying to subscription:

    Deployment Error Details:
    ContainerAppOperationError: Failed to provision revision for container app 'capps-backend-6qy2zs4x2hmpo'. Error details: Operation expired.

    "

    when pushing with the "azd up" command. Any suggestions? Thank you in advance. Eric.

  • ronalgonzalez's avatar

    I recently downloaded this project and installed everything just as described in your GitHub repo, but when I run it, it shows me this error:

    Starting backend
    
    INFO:voicerag:Running in development mode, loading from .env file
    ======== Running on http://localhost:8765 ========
    (Press CTRL+C to quit)
    ERROR:aiohttp.server:Error handling request
    Traceback (most recent call last):
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\.venv\Lib\site-packages\aiohttp\web_protocol.py", line 452, in _handle_request
        resp = await request_handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\.venv\Lib\site-packages\aiohttp\web_app.py", line 543, in _handle
        resp = await handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\app\backend\rtmt.py", line 208, in _websocket_handler    
        await self._forward_messages(ws)
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\app\backend\rtmt.py", line 180, in _forward_messages     
        async with session.ws_connect("/openai/realtime", headers=headers, params=params) as target_ws:
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\.venv\Lib\site-packages\aiohttp\client.py", line 1194, in __aenter__
        self._resp = await self._coro
                     ^^^^^^^^^^^^^^^^
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\.venv\Lib\site-packages\aiohttp\client.py", line 848, in _ws_connect
        raise WSServerHandshakeError(
    aiohttp.client_exceptions.WSServerHandshakeError: 401, message='Invalid response status', url=URL('wss://aihubrealtimem6781383663.openai.azure.com/openai/realtime?api-version=2024-10-01-preview&deployment=gpt-4o-realtime-preview')

    What can I do to solve it?  

  • ronalgonzalez's avatar

    Super, I love this post. I cloned the repo and followed the steps, but I've run into a problem; I don't know if you could help me, and I would be grateful.

    I have followed all the necessary steps referenced in the GitHub repo (https://github.com/Azure-Samples/aisearch-openai-rag-audio), and this is exactly what I'm getting from the Azure AI Studio deployment. The error I'm seeing is attached at the end.

    Starting backend
    
    INFO:voicerag:Running in development mode, loading from .env file
    INFO:voicerag:Using DefaultAzureCredential
    INFO:azure.identity._credentials.environment:No environment configuration found.
    INFO:azure.identity._credentials.managed_identity:ManagedIdentityCredential will use IMDS
    INFO:azure.core.pipeline.policies.http_logging_policy:Request URL: 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=REDACTED&resource=REDACTED'
    Request method: 'GET'
    Request headers:
        'User-Agent': 'azsdk-python-identity/1.18.0 Python/3.11.9 (Windows-10-10.0.22631-SP0)'
    No body was attached to the request
    INFO:azure.identity._credentials.chained:DefaultAzureCredential acquired a token from AzureCliCredential
    ======== Running on http://localhost:8765 ========
    (Press CTRL+C to quit)
    INFO:aiohttp.access:::1 [01/Nov/2024:10:09:18 -0600] "GET /audio-processor-worklet.js HTTP/1.1" 304 177 "http://localhost:8765/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36 Edg/130.0.0.0"
    INFO:aiohttp.access:::1 [01/Nov/2024:10:09:18 -0600] "GET /audio-playback-worklet.js HTTP/1.1" 304 177 "http://localhost:8765/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36 Edg/130.0.0.0"
    INFO:aiohttp.access:::1 [01/Nov/2024:10:09:31 -0600] "GET / HTTP/1.1" 200 235 "-" "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Mobile Safari/537.36 Edg/130.0.0.0"
    INFO:aiohttp.access:::1 [01/Nov/2024:10:09:31 -0600] "GET /assets/index-MbNbqoGC.css HTTP/1.1" 200 237 "http://localhost:8765/" "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Mobile Safari/537.36 Edg/130.0.0.0"
    INFO:aiohttp.access:::1 [01/Nov/2024:10:09:31 -0600] "GET /assets/index-BrCwpeIS.js HTTP/1.1" 200 253 "http://localhost:8765/" "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Mobile Safari/537.36 Edg/130.0.0.0"
    INFO:aiohttp.access:::1 [01/Nov/2024:10:09:31 -0600] "GET /assets/index-BrCwpeIS.js HTTP/1.1" 200 253 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36 Edg/130.0.0.0"
    INFO:aiohttp.access:::1 [01/Nov/2024:10:09:31 -0600] "GET /favicon.ico HTTP/1.1" 304 178 "http://localhost:8765/" "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Mobile Safari/537.36 Edg/130.0.0.0"
    INFO:aiohttp.access:::1 [01/Nov/2024:10:09:31 -0600] "GET /assets/index-BrCwpeIS.js.map HTTP/1.1" 200 243 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36 Edg/130.0.0.0"
    ERROR:aiohttp.server:Error handling request
    Traceback (most recent call last):
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\.venv\Lib\site-packages\aiohttp\web_protocol.py", line 452, in _handle_request
        resp = await request_handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\.venv\Lib\site-packages\aiohttp\web_app.py", line 543, in _handle
        resp = await handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\app\backend\rtmt.py", line 208, in _websocket_handler
        await self._forward_messages(ws)
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\app\backend\rtmt.py", line 180, in _forward_messages
        async with session.ws_connect("/openai/realtime", headers=headers, params=params) as target_ws:
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\.venv\Lib\site-packages\aiohttp\client.py", line 1194, in __aenter__
        self._resp = await self._coro
                     ^^^^^^^^^^^^^^^^
      File "C:\Users\RonalGonzalez\Documents\projects\web-development\nodejs\TrackingwithSocketsio\aisearch-openai-rag-audio\.venv\Lib\site-packages\aiohttp\client.py", line 848, in _ws_connect
        raise WSServerHandshakeError(
    aiohttp.client_exceptions.WSServerHandshakeError: 401, message='Invalid response status', url=URL('wss://aihubrealtimem6781383663.openai.azure.com/openai/realtime?api-version=2024-10-01-preview&deployment=gpt-4o-realtime-preview')
  • FatPopcorn's avatar

    I'm attempting to run on CodeSpace but running into error below, has anyone encountered this?

    InsufficientQuota: This operation require 1 new capacity in quota Requests Per Minute - gpt-4o-realtime-preview - GlobalStandard, which is bigger than the current available capacity 0. The current quota usage is 1 and the quota limit is 1 for quota Requests Per Minute - gpt-4o-realtime-preview - GlobalStandard.

    • StephaneR13090's avatar

      You need to ask for additional quota. In my case, I got an answer the following day.

  • LeonH765's avatar

    How about C# version of this project with ASP.NET Core?

  • kathyhurchla's avatar

    Great video and post, thank you

  • juanemartinez's avatar

    Has anyone been able to run it without errors in Codespaces? When I try, it gives me the following error:

    No directory exists at '/workspaces/aisearch-openai-rag-audio/app/backend/static'
    ValueError: Not a directory

    The above exception was the direct cause of the following exception:

    File "/workspaces/aisearch-openai-rag-audio/app/backend/app.py", line 37, in <module>
    app.router.add_static('/', path='./static', name='static')
    ValueError: No directory exists at '/workspaces/aisearch-openai-rag-audio/app/backend/static'

  • OlafReger's avatar

    The real question is, was it actually pablocastro presenting...

  • andreaskopp's avatar

    pablocastro - Can you please add miniconda or conda to the devcontainer for env management?
    It was included in the initial devcontainer but was dropped in the recent update. Thank you!