Make OpenAI Citation for File Retrieval RAG
Generate accurate and verifiable citations from your OpenAI Assistant's file retrieval responses with this n8n workflow. This powerful automation connects your OpenAI Assistant, specifically utilizing its Vector Store capabilities, with n8n's HTTP Request nodes to extract and process thread content. It meticulously parses individual messages, identifies citations, and then uses HTTP Requests to retrieve the corresponding file names from file IDs, ensuring proper attribution for your RAG (Retrieval Augmented Generation) outputs. This workflow is ideal for researchers, content creators, and developers who need to ensure the factual integrity and source transparency of AI-generated text, solving the common problem of unverified AI responses by providing direct links to the source documents. By automating the citation generation process, it significantly reduces manual effort in verifying information, enhances the reliability of your AI applications, and saves valuable time in content review and fact-checking.
Workflow JSON
{"id": "5NAbfX550LJsfz6f", "meta": {"instanceId": "00493e38fecfc163cb182114bc2fab90114038eb9aad665a7a752d076920d3d5", "templateCredsSetupCompleted": true}, "name": "Make OpenAI Citation for File Retrieval RAG", "tags": [{"id": "urxRtGxxLObZWPvX", "name": "sample", "createdAt": "2024-09-13T02:43:13.014Z", "updatedAt": "2024-09-13T02:43:13.014Z"}, {"id": "nMXS3c9l1WqDwWF5", "name": "assist", "createdAt": "2024-12-23T16:09:38.737Z", "updatedAt": "2024-12-23T16:09:38.737Z"}], "nodes": [{"id": "b9033511-3421-467a-9bfa-73af01b99c4f", "name": "Aggregate", "type": "n8n-nodes-base.aggregate", "position": [740, 120], "parameters": {"options": {}, "aggregate": "aggregateAllItemData"}, "typeVersion": 1, "alwaysOutputData": true}, {"id": "a61dd9d3-4faa-4878-a6f3-ba8277279002", "name": "Window Buffer Memory", "type": "@n8n/n8n-nodes-langchain.memoryBufferWindow", "position": [980, -320], "parameters": {}, "typeVersion": 1.3}, {"id": "2daabca5-37ec-4cad-9157-29926367e1a7", "name": "Sticky Note4", "type": "n8n-nodes-base.stickyNote", "position": [220, 320], "parameters": {"color": 3, "width": 840, "height": 80, "content": "## Within N8N, there will be a chat button to test"}, "typeVersion": 1}, {"id": "bf4485b1-cd94-41c8-a183-bf1b785f2761", "name": "Sticky Note", "type": "n8n-nodes-base.stickyNote", "position": [-440, -520], "parameters": {"color": 5, "width": 500, "height": 720, "content": "## Make OpenAI Citation for File Retrieval RAG\n\n## Use case\n\nIn this example, we will ensure that all texts from the OpenAI assistant search for citations and sources in the vector store files. We can also format the output for Markdown or HTML tags.\n\nThis is necessary because the assistant sometimes generates strange characters, and we can also use dynamic references such as citations 1, 2, 3, for example.\n\n## What this workflow does\n\nIn this workflow, we will use an OpenAI assistant created within their interface, equipped with a vector store containing some files for file retrieval.\n\nThe assistant will perform the file search within the OpenAI infrastructure and will return the content with citations.\n\n- We will make an HTTP request to retrieve all the details we need to format the text output.\n\n## Setup\n\nInsert an OpenAI Key\n\n## How to adjust it to your needs\n\nAt the end of the workflow, we have a block of code that will format the output, and there we can add Markdown tags to create links. Optionally, we can transform the Markdown formatting into HTML.\n\n\nby Davi Saranszky Mesquita\nhttps://www.linkedin.com/in/mesquitadavi/"}, "typeVersion": 1}, {"id": "539a4e40-9745-4a26-aba8-2cc2b0dd6364", "name": "Create a simple Trigger to have the Chat button within N8N", "type": "@n8n/n8n-nodes-langchain.chatTrigger", "notes": "https://www.npmjs.com/package/@n8n/chat", "position": [260, -520], "webhookId": "8ccaa299-6f99-427b-9356-e783893a3d0c", "parameters": {"options": {}}, "notesInFlow": true, "typeVersion": 1.1}, {"id": "aa5b2951-df32-43ac-9939-83b02d818e73", "name": "OpenAI Assistant with Vector Store", "type": "@n8n/n8n-nodes-langchain.openAi", "position": [580, -520], "parameters": {"options": {"preserveOriginalTools": false}, "resource": "assistant", "assistantId": {"__rl": true, "mode": "list", "value": "asst_QAfdobVCVCMJz8LmaEC7nlId", "cachedResultName": "Teste"}}, "credentials": {"openAiApi": {"id": "", "name": "[Your openAiApi]"}}, "typeVersion": 1.7}, {"id": "1817b673-6cb3-49aa-9f38-a5876eb0e6fa", "name": "Sticky Note1", "type": "n8n-nodes-base.stickyNote", "position": [560, -680], "parameters": {"width": 300, "content": "## Setup\n\n- Configure OpenAI Key\n\n### In this step, we will use an assistant created within the OpenAI platform that contains a vector store a.k.a file retrieval"}, "typeVersion": 1}, {"id": "16429226-e850-4698-b419-fd9805a03fb7", "name": "Get ALL Thread Content", "type": "n8n-nodes-base.httpRequest", "position": [1260, -520], "parameters": {"url": "=https://api.openai.com/v1/threads/{{ $json.threadId }}/messages", "options": {}, "sendHeaders": true, "authentication": "predefinedCredentialType", "headerParameters": {"parameters": [{"name": "OpenAI-Beta", "value": "assistants=v2"}]}, "nodeCredentialType": "openAiApi"}, "credentials": {"openAiApi": {"id": "", "name": "[Your openAiApi]"}}, "typeVersion": 4.2, "alwaysOutputData": true}, {"id": "e8c88b08-5be2-4f7e-8b17-8cf804b3fe9f", "name": "Sticky Note2", "type": "n8n-nodes-base.stickyNote", "position": [1160, -620], "parameters": {"content": "### Retrieving all thread content is necessary because the OpenAI tool does not retrieve all citations upon request."}, "typeVersion": 1}, {"id": "0f51e09f-2782-4e2d-b797-d4d58fcabdaf", "name": "Split all message iterations from a thread", "type": "n8n-nodes-base.splitOut", "position": [220, -300], "parameters": {"options": {}, "fieldToSplitOut": "data"}, "typeVersion": 1, "alwaysOutputData": true}, {"id": "4d569993-1ce3-4b32-beaf-382feac25da9", "name": "Split all content from a single message", "type": "n8n-nodes-base.splitOut", "position": [460, -300], "parameters": {"options": {}, "fieldToSplitOut": "content"}, "typeVersion": 1, "alwaysOutputData": true}, {"id": "999e1c2b-1927-4483-aac1-6e8903f7ed25", "name": "Split all citations from a single message", "type": "n8n-nodes-base.splitOut", "position": [700, -300], "parameters": {"options": {}, "fieldToSplitOut": "text.annotations"}, "typeVersion": 1, "alwaysOutputData": true}, {"id": "98af62f5-adb0-4e07-a146-fc2f13b851ce", "name": "Retrieve file name from a file ID", "type": "n8n-nodes-base.httpRequest", "onError": "continueRegularOutput", "position": [220, 120], "parameters": {"url": "=https://api.openai.com/v1/files/{{ $json.file_citation.file_id }}", "options": {}, "sendQuery": true, "authentication": "predefinedCredentialType", "queryParameters": {"parameters": [{"name": "limit", "value": "1"}]}, "nodeCredentialType": "openAiApi"}, "credentials": {"openAiApi": {"id": "", "name": "[Your openAiApi]"}}, "typeVersion": 4.2, "alwaysOutputData": true}, {"id": "b11f0d3d-bdc4-4845-b14b-d0b0de214f01", "name": "Regularize output", "type": "n8n-nodes-base.set", "position": [480, 120], "parameters": {"options": {}, "assignments": {"assignments": [{"id": "2dcaafee-5037-4a97-942a-bcdd02bc2ad9", "name": "id", "type": "string", "value": "={{ $json.id }}"}, {"id": "b63f967d-ceea-4aa8-98b9-91f5ab21bfe8", "name": "filename", "type": "string", "value": "={{ $json.filename }}"}, {"id": "f611e749-054a-441d-8610-df8ba42de2e1", "name": "text", "type": "string", "value": "={{ $('Split all citations from a single message').item.json.text }}"}]}}, "typeVersion": 3.4, "alwaysOutputData": true}, {"id": "0e999a0e-76ed-4897-989b-228f075e9bfb", "name": "Sticky Note3", "type": "n8n-nodes-base.stickyNote", "position": [440, -60], "parameters": {"width": 200, "height": 220, "content": "### A file retrieval request contains a lot of information, and we want only the text that will be substituted and the file name.\n\n- id\n- filename\n- text\n"}, "typeVersion": 1}, {"id": "53c79a6c-7543-435f-b40e-966dff0904d4", "name": "Sticky Note5", "type": "n8n-nodes-base.stickyNote", "position": [700, -60], "parameters": {"width": 200, "height": 220, "content": "### With the last three splits, we may have many citations and texts to substitute. By doing an aggregation, it will be possible to handle everything as a single request."}, "typeVersion": 1}, {"id": "381fb6d6-64fc-4668-9d3c-98aaa43a45ca", "name": "Sticky Note6", "type": "n8n-nodes-base.stickyNote", "position": [960, -60], "parameters": {"height": 220, "content": "### This simple code will take all the previous files and citations and alter the original text, formatting the output. In this way, we can use Markdown tags to create links, or if you prefer, we can add an HTML transformation node."}, "typeVersion": 1}, {"id": "d0cbb943-57ab-4850-8370-1625610a852a", "name": "Optional Markdown to HTML", "type": "n8n-nodes-base.markdown", "disabled": true, "position": [1220, 120], "parameters": {"html": "={{ $json.output }}", "options": {}, "destinationKey": "output"}, "typeVersion": 1}, {"id": "589e2418-5dec-47d0-ba08-420d84f09da7", "name": "Finnaly format the output", "type": "n8n-nodes-base.code", "position": [980, 120], "parameters": {"mode": "runOnceForEachItem", "jsCode": "let saida = $('OpenAI Assistant with Vector Store').item.json.output;\n\nfor (let i of $input.item.json.data) {\n saida = saida.replaceAll(i.text, \" _(\"+ i.filename+\")_ \");\n}\n\n$input.item.json.output = saida;\nreturn $input.item;"}, "typeVersion": 2}], "active": false, "pinData": {}, "settings": {"executionOrder": "v1"}, "versionId": "0e621a5a-d99d-4db3-9ae4-ea98c31467e9", "connections": {"Aggregate": {"main": [[{"node": "Finnaly format the output", "type": "main", "index": 0}]]}, "Regularize output": {"main": [[{"node": "Aggregate", "type": "main", "index": 0}]]}, "Window Buffer Memory": {"ai_memory": [[{"node": "OpenAI Assistant with Vector Store", "type": "ai_memory", "index": 0}]]}, "Get ALL Thread Content": {"main": [[{"node": "Split all message iterations from a thread", "type": "main", "index": 0}]]}, "Finnaly format the output": {"main": [[{"node": "Optional Markdown to HTML", "type": "main", "index": 0}]]}, "Retrieve file name from a file ID": {"main": [[{"node": "Regularize output", "type": "main", "index": 0}]]}, "OpenAI Assistant with Vector Store": {"main": [[{"node": "Get ALL Thread Content", "type": "main", "index": 0}]]}, "Split all content from a single message": {"main": [[{"node": "Split all citations from a single message", "type": "main", "index": 0}]]}, "Split all citations from a single message": {"main": [[{"node": "Retrieve file name from a file ID", "type": "main", "index": 0}]]}, "Split all message iterations from a thread": {"main": [[{"node": "Split all content from a single message", "type": "main", "index": 0}]]}, "Create a simple Trigger to have the Chat button within N8N": {"main": [[{"node": "OpenAI Assistant with Vector Store", "type": "main", "index": 0}]]}}}How to Import This Workflow
- 1Copy the workflow JSON above using the Copy Workflow JSON button.
- 2Open your n8n instance and go to Workflows.
- 3Click Import from JSON and paste the copied workflow.
Don't have an n8n instance? Start your free trial at n8nautomation.cloud
Related Templates
Text to Speech (OpenAI)
Converts text into natural-sounding speech using OpenAI's Text-to-Speech API. It sends your input text to OpenAI and receives an audio file in return. This is useful for creating audio versions of articles, generating voiceovers for videos, or providing accessibility features for web content. Quickly transform written content into engaging audio.
LangChain - Example - Code Node Example
Explore a basic LangChain agent that answers questions using a custom tool. This workflow connects n8n's AI nodes and custom code nodes to OpenAI for language model interactions. It's useful for developers building custom AI assistants or researchers experimenting with agentic workflows. This saves development time by providing a ready-to-use example of a LangChain agent.
AI-Powered Candidate Shortlisting Automation for ERPNext
Automate AI-powered candidate shortlisting for ERPNext job applications. This workflow connects ERPNext, Google Gemini, WhatsApp, and Outlook to process resumes, evaluate candidates, and communicate outcomes. Recruiters and HR departments can use this to efficiently screen applicants, automatically reject unqualified candidates, and send acceptance notifications. It significantly reduces manual review time and streamlines the hiring process.