Ask questions about a PDF using AI

Turn a PDF stored in Google Drive into a knowledge base you can query in natural language. This n8n workflow downloads the file from Google Drive, embeds its content with OpenAI, and stores the vectors in a Pinecone index, so that questions are answered from the document itself.

The workflow has two parts. To load data, click 'Test Workflow': the workflow downloads the PDF at the configured Google Drive URL, reads it with the Default Data Loader, splits the text into chunks with the Recursive Character Text Splitter (chunk size 3000, overlap 200), generates embeddings via OpenAI, and inserts them into Pinecone. Ingestion starts only from this manual trigger; nothing runs automatically when a file is uploaded to Google Drive. The insert node has clearNamespace enabled, so the namespace is cleared before new data is inserted. To ask questions, click the 'Chat' button: your message is embedded, the Vector Store Retriever fetches the relevant chunks from Pinecone, and the Question and Answer Chain passes them to the OpenAI Chat Model to formulate an answer.

Before the first run, create a Pinecone index with 1536 dimensions and select it in both Pinecone nodes.

This workflow suits researchers who need to extract insights from long reports, legal professionals analyzing contracts, and businesses building searchable knowledge bases from their documentation.
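The load and chat steps described above can be sketched outside n8n as follows. This is only an illustration under stated assumptions: `split_with_overlap` is a fixed-size sliding window, a simplification of the Recursive Character Text Splitter; `toy_embed` is a hypothetical word-count stand-in for OpenAI embeddings; and an in-memory list takes the place of the Pinecone index. Only the chunk size (3000) and overlap (200) come from the workflow JSON.

```python
import math
from collections import Counter

CHUNK_SIZE = 3000    # chunkSize in the workflow JSON
CHUNK_OVERLAP = 200  # chunkOverlap in the workflow JSON


def split_with_overlap(text, chunk_size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    """Fixed-size sliding window (a simplification, not the recursive splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    query = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, toy_embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    stored = [
        "the pinecone index stores vectors",
        "google drive holds the file",
    ]
    print(retrieve("pinecone index", stored, k=1))
```

In the real workflow the retrieved chunks are handed to the chat model together with the question; the sketch stops at retrieval.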

16 nodes | manual trigger | 134 views | 0 copies | Data

Google Drive, Pinecone, OpenAI

Workflow JSON

{"meta": {"instanceId": "62b3b6db4f4d3641a1fa1da6dfb9699a19380a1f60cbc18fc75d6d145f35552b"}, "nodes": [{"id": "40bb5497-d1d2-4eb7-b683-78b88c8d9230", "name": "Google Drive", "type": "n8n-nodes-base.googleDrive", "position": [496.83478320435574, 520], "parameters": {"fileId": {"__rl": true, "mode": "url", "value": "https://drive.google.com/file/d/11Koq9q53nkk0F5Y8eZgaWJUVR03I4-MM/view"}, "options": {}, "operation": "download"}, "credentials": {"googleDriveOAuth2Api": {"id": "", "name": "[Your googleDriveOAuth2Api]"}}, "typeVersion": 3}, {"id": "1323d520-1528-4a5a-9806-8f4f45306098", "name": "Recursive Character Text Splitter", "type": "@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter", "position": [996.8347832043557, 920], "parameters": {"chunkSize": 3000, "chunkOverlap": 200}, "typeVersion": 1}, {"id": "796b155a-64e6-4a52-9168-a37c68077d99", "name": "Embeddings OpenAI", "type": "@n8n/n8n-nodes-langchain.embeddingsOpenAi", "position": [836.8347832043557, 740], "parameters": {"options": {}}, "credentials": {"openAiApi": {"id": "", "name": "[Your openAiApi]"}}, "typeVersion": 1}, {"id": "dbe42c28-6f0b-4999-8372-0b42f6fb5916", "name": "Sticky Note", "type": "n8n-nodes-base.stickyNote", "position": [260, 420], "parameters": {"color": 7, "width": 978.0454109366399, "height": 806.6556079800943, "content": "### Load data into database\nFetch file from Google Drive, split it into chunks and insert into Pinecone index"}, "typeVersion": 1}, {"id": "43dc3736-834d-4322-8fd2-7826b0208c4b", "name": "Sticky Note1", "type": "n8n-nodes-base.stickyNote", "position": [1520, 420], "parameters": {"color": 7, "width": 654.1028019808174, "height": 806.8716167324012, "content": "### Chat with database\nEmbed the incoming chat message and use it to retrieve relevant chunks from the vector store. These are passed to the model to formulate an answer "}, "typeVersion": 1}, {"id": "53b18460-8ad6-425a-a01f-c2295cfddde8", "name": "Default Data Loader", "type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader", "position": [996.8347832043557, 740], "parameters": {"options": {}, "dataType": "binary"}, "typeVersion": 1}, {"id": "e729a021-eab3-48fa-a818-457efcaeebb2", "name": "Sticky Note2", "type": "n8n-nodes-base.stickyNote", "position": [-20, 740], "parameters": {"height": 264.61498034081166, "content": "## Try me out\n1. In Pinecone, create an index with 1536 dimensions and select it in *both* Pinecone nodes\n2. Click 'test workflow' at the bottom of the canvas to load data into the vector store\n3. Click 'chat' at the bottom of the canvas to ask questions about the data"}, "typeVersion": 1}, {"id": "3e17c89c-620d-4892-b944-d792e48e3772", "name": "Question and Answer Chain", "type": "@n8n/n8n-nodes-langchain.chainRetrievalQa", "position": [1560, 521], "parameters": {}, "typeVersion": 1.2}, {"id": "516507f9-d0d9-4975-85d0-a7852ee41518", "name": "OpenAI Chat Model", "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi", "position": [1560, 741], "parameters": {"options": {}}, "credentials": {"openAiApi": {"id": "", "name": "[Your openAiApi]"}}, "typeVersion": 1}, {"id": "8b0a5d26-a60a-40ab-8200-72f542532096", "name": "Embeddings OpenAI2", "type": "@n8n/n8n-nodes-langchain.embeddingsOpenAi", "position": [1700, 1081], "parameters": {"options": {}}, "credentials": {"openAiApi": {"id": "", "name": "[Your openAiApi]"}}, "typeVersion": 1}, {"id": "07f61d20-cf50-48e8-9d34-92244af436cb", "name": "Vector Store Retriever", "type": "@n8n/n8n-nodes-langchain.retrieverVectorStore", "position": [1760, 741], "parameters": {}, "typeVersion": 1}, {"id": "0777de17-99a0-499a-b71f-245d5f76642e", "name": "Read Pinecone Vector Store", "type": "@n8n/n8n-nodes-langchain.vectorStorePinecone", "position": [1700, 921], "parameters": {"options": {}, "pineconeIndex": {"__rl": true, "mode": "list", "value": "test-index", "cachedResultName": "test-index"}}, "credentials": {"pineconeApi": {"id": "", "name": "[Your pineconeApi]"}}, "typeVersion": 1}, {"id": "cc5e6897-9d0b-4352-a882-5dc23104bf97", "name": "Insert into Pinecone vector store", "type": "@n8n/n8n-nodes-langchain.vectorStorePinecone", "position": [856.8347832043557, 520], "parameters": {"mode": "insert", "options": {"clearNamespace": true}, "pineconeIndex": {"__rl": true, "mode": "list", "value": "test-index", "cachedResultName": "test-index"}}, "credentials": {"pineconeApi": {"id": "", "name": "[Your pineconeApi]"}}, "typeVersion": 1}, {"id": "c358aa73-b60f-453f-a3ef-539faa98c9b5", "name": "When clicking 'Chat' button below", "type": "@n8n/n8n-nodes-langchain.chatTrigger", "position": [1360, 521], "webhookId": "e259b6fe-b2a9-4dbc-98a4-9a160e7ac10c", "parameters": {}, "typeVersion": 1}, {"id": "d35db9e1-4efc-4980-9814-55fbe65e08fd", "name": "When clicking 'Test Workflow' button", "type": "n8n-nodes-base.manualTrigger", "position": [76.83478320435574, 520], "parameters": {}, "typeVersion": 1}, {"id": "4c04f576-e834-467d-98b4-38a2d501d82f", "name": "Set Google Drive file URL", "type": "n8n-nodes-base.set", "position": [296, 520], "parameters": {"options": {}, "assignments": {"assignments": [{"id": "50025ff5-1b53-475f-b150-2aafef1c4c21", "name": "file_url", "type": "string", "value": "https://drive.google.com/file/d/11Koq9q53nkk0F5Y8eZgaWJUVR03I4-MM/view"}]}}, "typeVersion": 3.3}], "pinData": {}, "connections": {"Google Drive": {"main": [[{"node": "Insert into Pinecone vector store", "type": "main", "index": 0}]]}, "Embeddings OpenAI": {"ai_embedding": [[{"node": "Insert into Pinecone vector store", "type": "ai_embedding", "index": 0}]]}, "OpenAI Chat Model": {"ai_languageModel": [[{"node": "Question and Answer Chain", "type": "ai_languageModel", "index": 0}]]}, "Embeddings OpenAI2": {"ai_embedding": [[{"node": "Read Pinecone Vector Store", "type": "ai_embedding", "index": 0}]]}, "Default Data Loader": {"ai_document": [[{"node": "Insert into Pinecone vector store", "type": "ai_document", "index": 0}]]}, "Vector Store Retriever": {"ai_retriever": [[{"node": "Question and Answer Chain", "type": "ai_retriever", "index": 0}]]}, "Set Google Drive file URL": {"main": [[{"node": "Google Drive", "type": "main", "index": 0}]]}, "Read Pinecone Vector Store": {"ai_vectorStore": [[{"node": "Vector Store Retriever", "type": "ai_vectorStore", "index": 0}]]}, "Recursive Character Text Splitter": {"ai_textSplitter": [[{"node": "Default Data Loader", "type": "ai_textSplitter", "index": 0}]]}, "When clicking 'Chat' button below": {"main": [[{"node": "Question and Answer Chain", "type": "main", "index": 0}]]}, "When clicking 'Test Workflow' button": {"main": [[{"node": "Set Google Drive file URL", "type": "main", "index": 0}]]}}}
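To see how the nodes in the JSON above are wired before importing it, the "connections" object can be flattened into a list of edges. The sketch below is a minimal example, not part of the template: `summarize_connections` is a hypothetical helper, and `SAMPLE_JSON` is a trimmed excerpt of the workflow used only to keep the example self-contained. In practice you would load the full exported JSON instead.

```python
import json

# Trimmed excerpt of the workflow's "connections" object (illustrative only).
SAMPLE_JSON = """
{
  "connections": {
    "Google Drive": {
      "main": [[{"node": "Insert into Pinecone vector store", "type": "main", "index": 0}]]
    },
    "Embeddings OpenAI": {
      "ai_embedding": [[{"node": "Insert into Pinecone vector store", "type": "ai_embedding", "index": 0}]]
    }
  }
}
"""


def summarize_connections(workflow):
    """Return one 'source --type--> target' line per connection, sorted."""
    edges = []
    for source, outputs in workflow.get("connections", {}).items():
        for conn_type, branches in outputs.items():
            # Each connection type holds a list of output branches,
            # and each branch holds a list of target descriptors.
            for branch in branches:
                for target in branch:
                    edges.append(f"{source} --{conn_type}--> {target['node']}")
    return sorted(edges)


edges = summarize_connections(json.loads(SAMPLE_JSON))
for edge in edges:
    print(edge)
```

Run against the full workflow, this makes the two halves visible: the "main" edges form the load path, while the "ai_*" edges attach the embeddings, loader, splitter, retriever, and chat model to their parent nodes.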

How to Import This Workflow

1. Copy the workflow JSON above using the Copy Workflow JSON button.
2. Open your n8n instance and go to Workflows.
3. Click Import from JSON and paste the copied workflow.
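The template ships with a sample file URL and the index name "test-index", so both need changing for your own data. One option is to edit the JSON before pasting it in, as sketched below. `customize_workflow` is a hypothetical helper, and the sketch assumes that overwriting these fields is equivalent to selecting the file and index in the n8n editor, which you should verify after import. The node types and field names are taken from the workflow JSON; `SAMPLE` is a trimmed stand-in for the full workflow.

```python
import copy

PINECONE_TYPE = "@n8n/n8n-nodes-langchain.vectorStorePinecone"

# Trimmed stand-in for the full workflow (illustrative only).
SAMPLE = {
    "nodes": [
        {"name": "Set Google Drive file URL", "type": "n8n-nodes-base.set",
         "parameters": {"assignments": {"assignments": [
             {"name": "file_url", "type": "string", "value": "OLD_URL"}]}}},
        {"name": "Google Drive", "type": "n8n-nodes-base.googleDrive",
         "parameters": {"fileId": {"__rl": True, "mode": "url", "value": "OLD_URL"}}},
        {"name": "Read Pinecone Vector Store", "type": PINECONE_TYPE,
         "parameters": {"pineconeIndex": {"__rl": True, "mode": "list",
                                          "value": "test-index",
                                          "cachedResultName": "test-index"}}},
    ]
}


def customize_workflow(workflow, file_url, index_name):
    """Return a copy of the workflow with the file URL and index name replaced."""
    wf = copy.deepcopy(workflow)  # leave the caller's dict untouched
    for node in wf.get("nodes", []):
        params = node.get("parameters", {})
        if node["type"] == "n8n-nodes-base.googleDrive" and "fileId" in params:
            params["fileId"]["value"] = file_url
        elif node["type"] == "n8n-nodes-base.set":
            for item in params.get("assignments", {}).get("assignments", []):
                if item.get("name") == "file_url":
                    item["value"] = file_url
        elif node["type"] == PINECONE_TYPE and "pineconeIndex" in params:
            # Applies to every Pinecone node, i.e. both insert and read.
            params["pineconeIndex"]["value"] = index_name
            params["pineconeIndex"]["cachedResultName"] = index_name
    return wf


updated = customize_workflow(SAMPLE, "https://example.com/my.pdf", "my-index")
```

Serialize the result with `json.dumps(updated)` and paste it in step 3. Credentials are not part of the JSON and still have to be selected in each node after import.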


