AI Voice Chat using Webhook, Memory Manager, OpenAI, Google Gemini & ElevenLabs
Workflow JSON
{"id": "TtoDcjgthgA4NTkU", "meta": {"instanceId": "fb261afc5089eae952e09babdadd9983000b3d863639802f6ded8c5be2e40067", "templateCredsSetupCompleted": true}, "name": "AI Voice Chat using Webhook, Memory Manager, OpenAI, Google Gemini & ElevenLabs", "tags": [{"id": "mqOrNvCDgQLzPA2x", "name": "Workflows", "createdAt": "2024-08-07T14:18:53.614Z", "updatedAt": "2024-08-07T14:18:53.614Z"}], "nodes": [{"id": "86cbf150-df4f-42f7-b7b3-e03c32e6f23c", "name": "Get Chat", "type": "@n8n/n8n-nodes-langchain.memoryManager", "position": [1700, -400], "parameters": {"options": {}}, "typeVersion": 1, "alwaysOutputData": true}, {"id": "a9153a24-e902-4f29-9b83-447317ce3119", "name": "Insert Chat", "type": "@n8n/n8n-nodes-langchain.memoryManager", "position": [2540, -400], "parameters": {"mode": "insert", "messages": {"messageValues": [{"type": "user", "message": "={{ $('OpenAI - Speech to Text').item.json[\"text\"] }}"}, {"type": "ai", "message": "={{ $json.text }}"}]}}, "typeVersion": 1, "alwaysOutputData": true}, {"id": "f5c272d4-248b-45a5-87b5-eb659a865d05", "name": "Sticky Note5", "type": "n8n-nodes-base.stickyNote", "position": [1664, -491], "parameters": {"color": 6, "width": 486.4746124819703, "height": 238.4911357933579, "content": "## Get Context"}, "typeVersion": 1}, {"id": "32ad17ca-0045-487d-9387-71c2e73629d4", "name": "Sticky Note", "type": "n8n-nodes-base.stickyNote", "position": [2510, -489], "parameters": {"color": 6, "width": 321.2536584847704, "height": 231.05945912581728, "content": "## Save Context"}, "typeVersion": 1}, {"id": "17ae4f1a-6192-4c52-8157-3cb47b37e0fb", "name": "Aggregate", "type": "n8n-nodes-base.aggregate", "position": [2020, -400], "parameters": {"options": {}, "aggregate": "aggregateAllItemData", "destinationFieldName": "context"}, "typeVersion": 1, "alwaysOutputData": true}, {"id": "00b3081e-fbcd-489b-b45a-4e847c346594", "name": "Window Buffer Memory", "type": "@n8n/n8n-nodes-langchain.memoryBufferWindow", "position": [2080, -100], "parameters": 
{"sessionKey": "test-0dacb3b5-4bcd-47dd-8456-dcfd8c258204", "sessionIdType": "customKey"}, "typeVersion": 1.2}, {"id": "55ca2790-e905-414a-a9f6-7d88a9e5807d", "name": "Google Gemini Chat Model", "type": "@n8n/n8n-nodes-langchain.lmChatGoogleGemini", "position": [2220, -100], "parameters": {"options": {}, "modelName": "models/gemini-1.5-flash"}, "credentials": {"googlePalmApi": {"id": "", "name": "[Your googlePalmApi]"}}, "typeVersion": 1}, {"id": "e8b3433f-b205-404c-9f05-504556d6b6dd", "name": "Respond to Webhook", "type": "n8n-nodes-base.respondToWebhook", "position": [3560, -400], "parameters": {"options": {}, "respondWith": "binary"}, "typeVersion": 1.1}, {"id": "de296743-5ac7-454b-bf3a-d020cc024511", "name": "ElevenLabs - Generate Audio", "type": "n8n-nodes-base.httpRequest", "position": [3240, -400], "parameters": {"url": "=https://api.elevenlabs.io/v1/text-to-speech/{{voice id}}", "method": "POST", "options": {}, "sendBody": true, "sendHeaders": true, "authentication": "genericCredentialType", "bodyParameters": {"parameters": [{"name": "text", "value": "={{ $('Basic LLM Chain').item.json.text }}"}]}, "genericAuthType": "httpCustomAuth", "headerParameters": {"parameters": [{"name": "Content-Type", "value": "application/json"}]}}, "credentials": {"httpCustomAuth": {"id": "", "name": "[Your httpCustomAuth]"}}, "typeVersion": 4.2}, {"id": "214e15f2-8a16-4598-b4ac-9fc2ec6545e6", "name": "Sticky Note2", "type": "n8n-nodes-base.stickyNote", "position": [3040, -560], "parameters": {"width": 468.73250812192407, "height": 843.7602354099661, "content": "* ### For the Text-to-Speech part, we'll use ElevenLabs.io, which has a free tier and offers a variety of voices to choose from. However, you can also use the OpenAI `\"Generate audio\"` node instead.\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n* ### Since there is no pre-built node for `\"ElevenLabs\"` in n8n, we'll connect to it through its API using the \"HTTP Request\" node.\n\n## Prerequisites:\n* ### `\"ElevenLabs API Key\"` (you can obtain it from their website).\n* ### `\"Voice ID\"` (you can also get it from ElevenLabs' \"Voice Library\").\n## Setup\n* ### In the URL parameter, replace \"{{voice id}}\" at the end of the URL with the Voice ID you obtained from ElevenLabs.io.\n* ### To set up your API Key, add custom authentication and include the following `JSON` with your actual ElevenLabs API Key:\n```json\n{\n \"headers\": {\n \"xi-api-key\": \"put-your-API-Key-here\"\n }\n}\n```"}, "typeVersion": 1}, {"id": "94ad934c-4a13-47b1-83a5-76fab43b3a47", "name": "Sticky Note1", "type": "n8n-nodes-base.stickyNote", "position": [1663, -598], "parameters": {"color": 6, "width": 487.4293487597613, "height": 91.01435855269375, "content": "### The \"Get Chat,\" \"Insert Chat,\" and \"Window Buffer Memory\" nodes will help the LLM maintain context throughout the conversation."}, "typeVersion": 1}, {"id": "0a96f48d-0d8b-4240-9eab-a681bfd4c8b5", "name": "Limit", "type": "n8n-nodes-base.limit", "position": [2900, -400], "parameters": {}, "typeVersion": 1}, {"id": "9a5d4ddb-6403-4758-858e-9fbe10c421a9", "name": "Basic LLM Chain", "type": "@n8n/n8n-nodes-langchain.chainLlm", "position": [2200, -400], "parameters": {"text": "={{ $('OpenAI - Speech to Text').item.json[\"text\"] }}", "messages": {"messageValues": [{"type": "AIMessagePromptTemplate", "message": "=To maintain context and fully understand the user's question, always review the previous conversation between you and him before providing an answer.\nThis is the previous conversation:\n{{ $('Aggregate').item.json[\"context\"].map(m => `\nHuman: ${m.human || 'undefined'}\nAI Assistant: ${m.ai || 'undefined'}\n`).join('') }}"}]}, "promptType": "define"}, "typeVersion": 1.4}, {"id": "f2f99895-9678-41b8-ad28-db40e1e23dc0", "name": "Webhook", "type": "n8n-nodes-base.webhook", "position": [1320, -400], "webhookId": "e9f611eb-a8dd-4520-8d24-9f36deaca528", "parameters": {"path": "voice_message", "options": {}, "httpMethod": "POST", "responseMode": "responseNode"}, "typeVersion": 2}, {"id": "d9a5fb04-4c02-4da4-b690-7b0ecd0ae052", "name": "OpenAI - Speech to Text", "type": "@n8n/n8n-nodes-langchain.openAi", "position": [1500, -400], "parameters": {"options": {}, "resource": "audio", "operation": "transcribe", "binaryPropertyName": "voice_message"}, "credentials": {"openAiApi": {"id": "", "name": "[Your openAiApi]"}}, "typeVersion": 1.3}], "active": true, "pinData": {}, "settings": {"callerPolicy": "workflowsFromSameOwner", "executionOrder": "v1", "saveManualExecutions": true}, "versionId": "fe5792ca-03d7-4cdd-96db-20f4cd479c7e", "connections": {"Limit": {"main": [[{"node": "ElevenLabs - Generate Audio", "type": "main", "index": 0}]]}, "Webhook": {"main": [[{"node": "OpenAI - Speech to Text", "type": "main", "index": 0}]]}, "Get Chat": {"main": [[{"node": "Aggregate", "type": "main", "index": 0}]]}, "Aggregate": {"main": [[{"node": "Basic LLM Chain", "type": "main", "index": 0}]]}, "Insert Chat": {"main": [[{"node": "Limit", "type": "main", "index": 0}]]}, "Basic LLM Chain": {"main": [[{"node": "Insert Chat", "type": "main", "index": 0}]]}, "Window Buffer Memory": {"ai_memory": [[{"node": "Insert Chat", "type": "ai_memory", "index": 0}, {"node": "Get Chat", "type": "ai_memory", "index": 0}]]}, "OpenAI - Speech to Text": {"main": [[{"node": "Get Chat", "type": "main", "index": 0}]]}, "Google Gemini Chat Model": {"ai_languageModel": [[{"node": "Basic LLM Chain", "type": "ai_languageModel", "index": 0}]]}, "ElevenLabs - Generate Audio": {"main": [[{"node": "Respond to Webhook", "type": "main", "index": 0}]]}}}
How to Import This Workflow
1. Copy the workflow JSON above.
2. Open your n8n instance and go to Workflows.
3. Click Import from JSON and paste the copied workflow.
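Once the workflow is imported and active, it can be exercised from any HTTP client. The Webhook node in the JSON listens for `POST` requests on the path `voice_message`, the "OpenAI - Speech to Text" node reads the binary property `voice_message`, and "Respond to Webhook" returns binary audio. The sketch below is one possible client, not part of the template. It assumes a hypothetical instance URL, the usual `/webhook/` prefix for production webhook URLs, that n8n exposes an uploaded form file under a binary property named after the form field, and that the reply is MP3 audio from ElevenLabs.

```python
import urllib.request
import uuid

# Assumption: placeholder host; replace with the address of your own n8n instance.
N8N_BASE_URL = "https://your-n8n-instance.example.com"
# Assumption: production webhooks are served under /webhook/<path>;
# "voice_message" is the path configured on the Webhook node.
WEBHOOK_PATH = "/webhook/voice_message"


def build_multipart(field_name: str, filename: str, data: bytes, mime_type: str) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def send_voice_message(audio_path: str, reply_path: str = "reply.mp3") -> str:
    """Upload a recording to the webhook and save the spoken reply to disk."""
    with open(audio_path, "rb") as f:
        audio = f.read()

    # The form field name matches the binaryPropertyName used by the
    # "OpenAI - Speech to Text" node.
    body, content_type = build_multipart("voice_message", "message.mp3", audio, "audio/mpeg")
    request = urllib.request.Request(
        N8N_BASE_URL + WEBHOOK_PATH,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Transcription, the LLM call, and speech synthesis all run before the
    # response is sent, so allow a generous timeout.
    with urllib.request.urlopen(request, timeout=120) as response:
        reply = response.read()

    with open(reply_path, "wb") as out:
        out.write(reply)
    return reply_path
```

Calling `send_voice_message("question.mp3")` would write the synthesized answer to `reply.mp3`. Because the Window Buffer Memory node uses a fixed `sessionKey`, every caller shares the same conversation history unless that key is changed.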