AI Voice Chat using Webhook, Memory Manager, OpenAI, Google Gemini & ElevenLabs

Create an AI voice chat experience with memory. OpenAI transcribes speech, Google Gemini generates responses, and ElevenLabs synthesizes audio. Ideal for interactive customer support, educational tools, or personalized virtual assistants. This workflow streamlines conversational AI development.

15 nodeswebhook trigger141 views0 copiesAI

OpenAI

Workflow JSON

{"id": "TtoDcjgthgA4NTkU", "meta": {"instanceId": "fb261afc5089eae952e09babdadd9983000b3d863639802f6ded8c5be2e40067", "templateCredsSetupCompleted": true}, "name": "AI Voice Chat using Webhook, Memory Manager, OpenAI, Google Gemini & ElevenLabs", "tags": [{"id": "mqOrNvCDgQLzPA2x", "name": "Workflows", "createdAt": "2024-08-07T14:18:53.614Z", "updatedAt": "2024-08-07T14:18:53.614Z"}], "nodes": [{"id": "86cbf150-df4f-42f7-b7b3-e03c32e6f23c", "name": "Get Chat", "type": "@n8n/n8n-nodes-langchain.memoryManager", "position": [1700, -400], "parameters": {"options": {}}, "typeVersion": 1, "alwaysOutputData": true}, {"id": "a9153a24-e902-4f29-9b83-447317ce3119", "name": "Insert Chat", "type": "@n8n/n8n-nodes-langchain.memoryManager", "position": [2540, -400], "parameters": {"mode": "insert", "messages": {"messageValues": [{"type": "user", "message": "={{ $('OpenAI - Speech to Text').item.json[\"text\"] }}"}, {"type": "ai", "message": "={{ $json.text }}"}]}}, "typeVersion": 1, "alwaysOutputData": true}, {"id": "f5c272d4-248b-45a5-87b5-eb659a865d05", "name": "Sticky Note5", "type": "n8n-nodes-base.stickyNote", "position": [1664, -491], "parameters": {"color": 6, "width": 486.4746124819703, "height": 238.4911357933579, "content": "## Get Context"}, "typeVersion": 1}, {"id": "32ad17ca-0045-487d-9387-71c2e73629d4", "name": "Sticky Note", "type": "n8n-nodes-base.stickyNote", "position": [2510, -489], "parameters": {"color": 6, "width": 321.2536584847704, "height": 231.05945912581728, "content": "## Save Context"}, "typeVersion": 1}, {"id": "17ae4f1a-6192-4c52-8157-3cb47b37e0fb", "name": "Aggregate", "type": "n8n-nodes-base.aggregate", "position": [2020, -400], "parameters": {"options": {}, "aggregate": "aggregateAllItemData", "destinationFieldName": "context"}, "typeVersion": 1, "alwaysOutputData": true}, {"id": "00b3081e-fbcd-489b-b45a-4e847c346594", "name": "Window Buffer Memory", "type": "@n8n/n8n-nodes-langchain.memoryBufferWindow", "position": [2080, -100], "parameters": {"sessionKey": "test-0dacb3b5-4bcd-47dd-8456-dcfd8c258204", "sessionIdType": "customKey"}, "typeVersion": 1.2}, {"id": "55ca2790-e905-414a-a9f6-7d88a9e5807d", "name": "Google Gemini Chat Model", "type": "@n8n/n8n-nodes-langchain.lmChatGoogleGemini", "position": [2220, -100], "parameters": {"options": {}, "modelName": "models/gemini-1.5-flash"}, "credentials": {"googlePalmApi": {"id": "", "name": "[Your googlePalmApi]"}}, "typeVersion": 1}, {"id": "e8b3433f-b205-404c-9f05-504556d6b6dd", "name": "Respond to Webhook", "type": "n8n-nodes-base.respondToWebhook", "position": [3560, -400], "parameters": {"options": {}, "respondWith": "binary"}, "typeVersion": 1.1}, {"id": "de296743-5ac7-454b-bf3a-d020cc024511", "name": "ElevenLabs - Generate Audio", "type": "n8n-nodes-base.httpRequest", "position": [3240, -400], "parameters": {"url": "=https://api.elevenlabs.io/v1/text-to-speech/{{voice id}}", "method": "POST", "options": {}, "sendBody": true, "sendHeaders": true, "authentication": "genericCredentialType", "bodyParameters": {"parameters": [{"name": "text", "value": "={{ $('Basic LLM Chain').item.json.text }}"}]}, "genericAuthType": "httpCustomAuth", "headerParameters": {"parameters": [{"name": "Content-Type", "value": "application/json"}]}}, "credentials": {"httpCustomAuth": {"id": "", "name": "[Your httpCustomAuth]"}}, "typeVersion": 4.2}, {"id": "214e15f2-8a16-4598-b4ac-9fc2ec6545e6", "name": "Sticky Note2", "type": "n8n-nodes-base.stickyNote", "position": [3040, -560], "parameters": {"width": 468.73250812192407, "height": 843.7602354099661, "content": "* ### For the Text-to-Speech part, we'll use ElevenLabs.io, which is free and offers a variety of voices to choose from. However, you can also use the OpenAI `\"Generate audio\"` node instead.\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n* ### Since there is no pre-built node for `\"ElevenLabs\"` in n8n, we'll connect to it through its API using the \"HTTP Request\" node.\n\n## Prerequisites:\n* ### `\"ElevenLabs API Key\"` (you can obtain it from their website).\n* ### `\"Voice ID\"` (you can also get it from ElevenLabs' \"Voice Library\").\n## Setup\n* ### In the URL parameter, replace \"{{voice id}}\" at the end of the URL with the Voice ID you obtained from ElevenLabs.io.\n* ### To set up your API Key, add custom authentication and include the following `JSON` with your acual ElevenLabs API Key:\n```json\n{\n \"headers\": {\n \"xi-api-key\": \"put-your-API-Key-here\"\n }\n}\n```"}, "typeVersion": 1}, {"id": "94ad934c-4a13-47b1-83a5-76fab43b3a47", "name": "Sticky Note1", "type": "n8n-nodes-base.stickyNote", "position": [1663, -598], "parameters": {"color": 6, "width": 487.4293487597613, "height": 91.01435855269375, "content": "### The \"Get Chat,\" \"Insert Chat,\" and \"Window Buffer Memory\" nodes will help the LLM model maintain context throughout the conversation."}, "typeVersion": 1}, {"id": "0a96f48d-0d8b-4240-9eab-a681bfd4c8b5", "name": "Limit", "type": "n8n-nodes-base.limit", "position": [2900, -400], "parameters": {}, "typeVersion": 1}, {"id": "9a5d4ddb-6403-4758-858e-9fbe10c421a9", "name": "Basic LLM Chain", "type": "@n8n/n8n-nodes-langchain.chainLlm", "position": [2200, -400], "parameters": {"text": "={{ $('OpenAI - Speech to Text').item.json[\"text\"] }}", "messages": {"messageValues": [{"type": "AIMessagePromptTemplate", "message": "=To maintain context and fully understand the user's question, always review the previous conversation between you and him before providing an answer.\nThis is the previous conversation:\n{{ $('Aggregate').item.json[\"context\"].map(m => `\nHuman: ${m.human || 'undefined'}\nAI Assistant: ${m.ai || 'undefined'}\n`).join('') }}"}]}, "promptType": "define"}, "typeVersion": 1.4}, {"id": "f2f99895-9678-41b8-ad28-db40e1e23dc0", "name": "Webhook", "type": "n8n-nodes-base.webhook", "position": [1320, -400], "webhookId": "e9f611eb-a8dd-4520-8d24-9f36deaca528", "parameters": {"path": "voice_message", "options": {}, "httpMethod": "POST", "responseMode": "responseNode"}, "typeVersion": 2}, {"id": "d9a5fb04-4c02-4da4-b690-7b0ecd0ae052", "name": "OpenAI - Speech to Text", "type": "@n8n/n8n-nodes-langchain.openAi", "position": [1500, -400], "parameters": {"options": {}, "resource": "audio", "operation": "transcribe", "binaryPropertyName": "voice_message"}, "credentials": {"openAiApi": {"id": "", "name": "[Your openAiApi]"}}, "typeVersion": 1.3}], "active": true, "pinData": {}, "settings": {"callerPolicy": "workflowsFromSameOwner", "executionOrder": "v1", "saveManualExecutions": true}, "versionId": "fe5792ca-03d7-4cdd-96db-20f4cd479c7e", "connections": {"Limit": {"main": [[{"node": "ElevenLabs - Generate Audio", "type": "main", "index": 0}]]}, "Webhook": {"main": [[{"node": "OpenAI - Speech to Text", "type": "main", "index": 0}]]}, "Get Chat": {"main": [[{"node": "Aggregate", "type": "main", "index": 0}]]}, "Aggregate": {"main": [[{"node": "Basic LLM Chain", "type": "main", "index": 0}]]}, "Insert Chat": {"main": [[{"node": "Limit", "type": "main", "index": 0}]]}, "Basic LLM Chain": {"main": [[{"node": "Insert Chat", "type": "main", "index": 0}]]}, "Window Buffer Memory": {"ai_memory": [[{"node": "Insert Chat", "type": "ai_memory", "index": 0}, {"node": "Get Chat", "type": "ai_memory", "index": 0}]]}, "OpenAI - Speech to Text": {"main": [[{"node": "Get Chat", "type": "main", "index": 0}]]}, "Google Gemini Chat Model": {"ai_languageModel": [[{"node": "Basic LLM Chain", "type": "ai_languageModel", "index": 0}]]}, "ElevenLabs - Generate Audio": {"main": [[{"node": "Respond to Webhook", "type": "main", "index": 0}]]}}}

How to Import This Workflow

1Copy the workflow JSON above using the Copy Workflow JSON button.
2Open your n8n instance and go to Workflows.
3Click Import from JSON and paste the copied workflow.

Don't have an n8n instance? Start your free trial at n8nautomation.cloud

Related Templates

Text to Speech (OpenAI)

Converts text into natural-sounding speech using OpenAI's Text-to-Speech API. It sends your input text to OpenAI and receives an audio file in return. This is useful for creating audio versions of articles, generating voiceovers for videos, or providing accessibility features for web content. Quickly transform written content into engaging audio.

8 nodes

LangChain - Example - Code Node Example

Explore a basic LangChain agent that answers questions using a custom tool. This workflow connects n8n's AI nodes and custom code nodes to OpenAI for language model interactions. It's useful for developers building custom AI assistants or researchers experimenting with agentic workflows. This saves development time by providing a ready-to-use example of a LangChain agent.

10 nodes

AI-Powered Candidate Shortlisting Automation for ERPNext

Automate AI-powered candidate shortlisting for ERPNext job applications. This workflow connects ERPNext, Google Gemini, WhatsApp, and Outlook to process resumes, evaluate candidates, and communicate outcomes. Recruiters and HR departments can use this to efficiently screen applicants, automatically reject unqualified candidates, and send acceptance notifications. It significantly reduces manual review time and streamlines the hiring process.

39 nodes

Ready to automate with n8n?

Get affordable managed n8n hosting with 24/7 support.