RAG:Context-Aware Chunking | Google Drive to Pinecone via OpenRouter & Gemini

Ingest and vectorize your Google Drive documents into Pinecone with context-aware chunking, powered by OpenRouter and Gemini, to build robust RAG applications. This workflow automates the process of extracting text from Google Drive documents, intelligently splitting the content into context-rich chunks, generating embeddings with Google Gemini, and then storing these vectors in Pinecone for efficient retrieval. Triggered manually, it begins by retrieving a specified document from Google Drive using the Get Document From Google Drive node, then extracts its text content with Extract Text Data From Google Document. The Recursive Character Text Splitter node then divides the document into manageable sections, which are prepared for looping by Prepare Sections For Looping. Each section is then processed through a Loop Over Items node, where an AI Agent - Prepare Context node dynamically generates relevant context using OpenRouter Chat Model, before concatenating this context with the section text. Finally, Embeddings Google Gemini creates vector embeddings for these enriched chunks, which are then upserted into your Pinecone Vector Store. This solution is ideal for businesses and developers building advanced RAG systems, knowledge bases, or intelligent search applications that require precise contextual understanding of their Google Drive data, saving significant manual effort in data preparation and ensuring higher quality AI responses.
17 nodesmanual trigger115 views0 copiesProductivity
Google DrivePinecone

Workflow JSON

{"id": "VY4WBXuNDPxmOO5e", "meta": {"instanceId": "d16fb7d4b3eb9b9d4ad2ee6a7fbae593d73e9715e51f583c2a0e9acd1781c08e", "templateCredsSetupCompleted": true}, "name": "RAG:Context-Aware Chunking | Google Drive to Pinecone via OpenRouter & Gemini", "tags": [{"id": "XZIQK6NdzGvgbZFd", "name": "Sell", "createdAt": "2025-01-15T12:28:48.424Z", "updatedAt": "2025-01-15T12:28:48.424Z"}], "nodes": [{"id": "7abbfa6e-4b17-4656-9b82-377b1bacf539", "name": "When clicking \u2018Test workflow\u2019", "type": "n8n-nodes-base.manualTrigger", "position": [0, 0], "parameters": {}, "typeVersion": 1}, {"id": "448ec137-bf64-46b4-bf15-c7a040faa306", "name": "Loop Over Items", "type": "n8n-nodes-base.splitInBatches", "position": [1100, 0], "parameters": {"options": {}}, "typeVersion": 3}, {"id": "f22557ee-7f37-40cd-9063-a9a759274663", "name": "OpenRouter Chat Model", "type": "@n8n/n8n-nodes-langchain.lmChatOpenRouter", "position": [20, 440], "parameters": {"options": {}}, "credentials": {"openRouterApi": {"id": "", "name": "[Your openRouterApi]"}}, "typeVersion": 1}, {"id": "57e8792e-25ae-43d5-b4e9-e87642365ee9", "name": "Pinecone Vector Store", "type": "@n8n/n8n-nodes-langchain.vectorStorePinecone", "position": [780, 360], "parameters": {"mode": "insert", "options": {}, "pineconeIndex": {"__rl": true, "mode": "list", "value": "context-rag-test", "cachedResultName": "context-rag-test"}}, "credentials": {"pineconeApi": {"id": "", "name": "[Your pineconeApi]"}}, "typeVersion": 1}, {"id": "0a8c2426-0aaf-424a-b246-336a9034aba8", "name": "Embeddings Google Gemini", "type": "@n8n/n8n-nodes-langchain.embeddingsGoogleGemini", "position": [720, 540], "parameters": {"modelName": "models/text-embedding-004"}, "credentials": {"googlePalmApi": {"id": "", "name": "[Your googlePalmApi]"}}, "typeVersion": 1}, {"id": "edc587bd-494d-43e8-b6d6-26adab7af3dc", "name": "Default Data Loader", "type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader", "position": [920, 540], "parameters": {"options": {}}, "typeVersion": 1}, {"id": "a82d4e0b-248e-426d-9ef3-f25e7078ceb3", "name": "Recursive Character Text Splitter", "type": "@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter", "position": [840, 680], "parameters": {"options": {}, "chunkSize": 100000}, "typeVersion": 1}, {"id": "8571b92f-5587-454f-9700-ea04ca35311b", "name": "Get Document From Google Drive", "type": "n8n-nodes-base.googleDrive", "position": [220, 0], "parameters": {"fileId": {"__rl": true, "mode": "list", "value": "1gm0jxFTLuiWB5u4esEjzoCPImrVqu0AEMIKBIesTf9M", "cachedResultUrl": "https://docs.google.com/document/d/1gm0jxFTLuiWB5u4esEjzoCPImrVqu0AEMIKBIesTf9M/edit?usp=drivesdk", "cachedResultName": "Udit Rawat - Details"}, "options": {"googleFileConversion": {"conversion": {"docsToFormat": "text/plain"}}}, "operation": "download"}, "credentials": {"googleDriveOAuth2Api": {"id": "", "name": "[Your googleDriveOAuth2Api]"}}, "typeVersion": 3}, {"id": "2bed3d0f-3d65-4394-87f1-e73320a43a4a", "name": "Extract Text Data From Google Document", "type": "n8n-nodes-base.extractFromFile", "position": [440, 0], "parameters": {"options": {}, "operation": "text"}, "typeVersion": 1}, {"id": "837fa691-6c66-434b-ba82-d1cad9aecdf7", "name": "Split Document Text Into Sections", "type": "n8n-nodes-base.code", "position": [660, 0], "parameters": {"jsCode": "let split_text = \"\u2014---------------------------\u2014-------------[SECTIONEND]\u2014---------------------------\u2014-------------\";\nfor (const item of $input.all()) {\n item.json.section = item.json.data.split(split_text);\n item.json.document = JSON.stringify(item.json.section)\n}\nreturn $input.all();"}, "typeVersion": 2}, {"id": "cc801e7e-e01b-421a-9211-08322ef8a0b2", "name": "Prepare Sections For Looping", "type": "n8n-nodes-base.splitOut", "position": [880, 0], "parameters": {"options": {}, "fieldToSplitOut": "section"}, "typeVersion": 1}, {"id": "658cb8df-92e3-4b25-8f37-e5f959d913dc", "name": "Sticky Note", "type": "n8n-nodes-base.stickyNote", "position": [-40, -100], "parameters": {"width": 1300, "height": 280, "content": "## Prepare Document. \nThis section is responsible for downloading the file from Google Drive, splitting the text into sections by detecting separators, and preparing them for looping."}, "typeVersion": 1}, {"id": "82ee9194-484a-46db-b75c-bec34201c7e2", "name": "Sticky Note1", "type": "n8n-nodes-base.stickyNote", "position": [-220, 220], "parameters": {"width": 780, "height": 360, "content": "## Prepare context\nIn this section, the \nagent node will prepare \ncontext for a section \n(chunk of text), which \nwill then be passed for \nconversion into a vectors \nalong with the section itself."}, "typeVersion": 1}, {"id": "2f6950df-ead1-479a-aa51-7768121a4eb2", "name": "AI Agent - Prepare Context", "type": "@n8n/n8n-nodes-langchain.agent", "position": [40, 260], "parameters": {"text": "=<document> \n{{ $('Split Document Text Into Sections').item.json.document }}\n</document> \nHere is the chunk we want to situate within the whole document \n<chunk> \n{{ $json.section }}\n</chunk> \nPlease give a short succinct context to situate this chunk within the overall document for the purposes of improving search retrieval of the chunk. Answer only with the succinct context and nothing else. ", "agent": "conversationalAgent", "options": {}, "promptType": "define"}, "typeVersion": 1.7}, {"id": "34a465fc-a505-445a-9211-bcd830381354", "name": "Concatenate the context and section text", "type": "n8n-nodes-base.set", "position": [400, 260], "parameters": {"options": {}, "assignments": {"assignments": [{"id": "e5fb0381-5d23-46e2-a0d1-438240b80a3e", "name": "=section_chunk", "type": "string", "value": "={{ $json.output }}. {{ $('Loop Over Items').item.json.section }}"}]}}, "typeVersion": 3.4}, {"id": "4a7a788c-8e5b-453c-ae52-a4522048992d", "name": "Sticky Note2", "type": "n8n-nodes-base.stickyNote", "position": [640, 220], "parameters": {"width": 580, "height": 600, "content": "## Convert Text To Vectors\nIn this step, the Pinecone node converts the provided text into vectors using Google Gemini and stores them in the Pinecone vector database."}, "typeVersion": 1}, {"id": "45798b49-fc78-417c-a752-4dd1a8882cd7", "name": "Sticky Note3", "type": "n8n-nodes-base.stickyNote", "position": [-460, -120], "parameters": {"width": 400, "height": 300, "content": "## Video Demo\n[![Video Thumbnail](https://img.youtube.com/vi/qBeWP65I4hg/maxresdefault.jpg)](https://www.youtube.com/watch?v=qBeWP65I4hg)"}, "typeVersion": 1}], "active": false, "pinData": {}, "settings": {"executionOrder": "v1"}, "versionId": "4f0e2203-5850-4a32-b1dd-5adc57fa43ff", "connections": {"Loop Over Items": {"main": [[], [{"node": "AI Agent - Prepare Context", "type": "main", "index": 0}]]}, "Default Data Loader": {"ai_document": [[{"node": "Pinecone Vector Store", "type": "ai_document", "index": 0}]]}, "OpenRouter Chat Model": {"ai_languageModel": [[{"node": "AI Agent - Prepare Context", "type": "ai_languageModel", "index": 0}]]}, "Pinecone Vector Store": {"main": [[{"node": "Loop Over Items", "type": "main", "index": 0}]]}, "Embeddings Google Gemini": {"ai_embedding": [[{"node": "Pinecone Vector Store", "type": "ai_embedding", "index": 0}]]}, "AI Agent - Prepare Context": {"main": [[{"node": "Concatenate the context and section text", "type": "main", "index": 0}]]}, "Prepare Sections For Looping": {"main": [[{"node": "Loop Over Items", "type": "main", "index": 0}]]}, "Get Document From Google Drive": {"main": [[{"node": "Extract Text Data From Google Document", "type": "main", "index": 0}]]}, "Recursive Character Text Splitter": {"ai_textSplitter": [[{"node": "Default Data Loader", "type": "ai_textSplitter", "index": 0}]]}, "Split Document Text Into Sections": {"main": [[{"node": "Prepare Sections For Looping", "type": "main", "index": 0}]]}, "When clicking \u2018Test workflow\u2019": {"main": [[{"node": "Get Document From Google Drive", "type": "main", "index": 0}]]}, "Extract Text Data From Google Document": {"main": [[{"node": "Split Document Text Into Sections", "type": "main", "index": 0}]]}, "Concatenate the context and section text": {"main": [[{"node": "Pinecone Vector Store", "type": "main", "index": 0}]]}}}

How to Import This Workflow

  1. 1Copy the workflow JSON above using the Copy Workflow JSON button.
  2. 2Open your n8n instance and go to Workflows.
  3. 3Click Import from JSON and paste the copied workflow.

Don't have an n8n instance? Start your free trial at n8nautomation.cloud

Related Templates

Auto-create TikTok videos with VEED.io AI avatars, ElevenLabs & GPT-4

Automate the creation and distribution of trending TikTok videos using AI avatars. This workflow connects Telegram, Perplexity, OpenAI, ElevenLabs, VEED.io, and BLOTATO to generate scripts, synthesize voice, create video, and publish across multiple social platforms. Content creators and marketers can rapidly produce engaging short-form video content without manual editing.

35 nodes

CV Screening with OpenAI

Streamline your hiring process by automating the initial screening of CVs with this powerful workflow. It connects directly to OpenAI to analyze resumes, extracting key information and evaluating candidates based on your criteria. This workflow is ideal for recruiters, HR professionals, and hiring managers who need to quickly assess a large volume of applications, saving significant time and effort in the early stages of recruitment. By automating the parsing of PDF documents and leveraging OpenAI's analytical capabilities, you can efficiently identify top candidates, reduce manual review time, and focus on more strategic aspects of the hiring process. This solution drastically cuts down on the hours spent manually reading CVs, allowing for faster shortlisting and improving overall recruitment efficiency.

11 nodes

Create daily historical AI videos with Gemini, fal.ai, Telegram and YouTube

Automate the creation and publishing of daily historical AI videos. This workflow connects Gemini for script generation, fal.ai for video creation, Telegram for approval, and YouTube for publishing. Content creators or educators can use this to consistently deliver engaging historical content without manual video production. It significantly reduces the time and effort involved in daily video creation and distribution.

30 nodes

Ready to automate with n8n?

Get affordable managed n8n hosting with 24/7 support.