
Introduction

In today's fast-paced world, chatbots have become increasingly popular for businesses and individuals alike. These automated conversational agents provide quick and efficient responses to user inquiries, saving time and resources. However, most chatbots rely on predefined responses or simple rule-based systems, limiting their ability to handle complex queries. In this blog post, we'll explore how I built a chatbot with a custom knowledge base using LangChain, an open-source framework designed to leverage the power of large language models like GPT-3.

Understanding LangChain

LangChain is a versatile framework that simplifies the creation of applications utilizing large language models. Its use cases span various domains, including document analysis, summarization, chatbots, and code analysis. By integrating LangChain into our chatbot project, we unlock the potential of language models to provide accurate and context-aware responses based on a custom knowledge base. For more details, see the LangChain documentation.

Project Overview

The chatbot I developed consists of two main stages. In the first stage, the chatbot ingests data in the form of text files, which can be tailored to suit the specific needs of the project. To stay within the model's context window (roughly 4,000 tokens for gpt-3.5-turbo), the text files are split into smaller chunks of 500 characters each, with an overlap of 100 characters maintained between adjacent chunks so that no chunk loses its surrounding context. These chunks are then transformed into vectors using an embedding model, such as OpenAI's text-embedding-ada-002 (Ada v2). The resulting vectors are stored in a vector database, such as Pinecone, which supports similarity search over embeddings.

In the second stage, the chatbot handles user inputs and generates appropriate responses. The user's input is sanitized according to OpenAI's recommendation of replacing newlines with spaces for optimal embedding results. The chat history, along with the sanitized message, is passed to the language model, which condenses them into a standalone question that captures the full context of the conversation. The standalone question is then embedded into a vector and used for a semantic search within the Pinecone database. The most relevant chunks, along with the standalone question, are retrieved from the database and provided to the GPT model. The model can now answer questions based on the given context, greatly reducing hallucinations and improving the accuracy of responses. This integration of custom data with large language models opens up a world of possibilities for leveraging AI in a more targeted and controlled manner.

Implementation

To implement the custom knowledge-based chatbot using LangChain, follow the steps outlined below:

Step 1: Setting up Pinecone Index

import { PineconeClient } from '@pinecone-database/pinecone';
import { VectorOperationsApi } from '@pinecone-database/pinecone/dist/pinecone-generated-ts-fetch';

const INDEX_NAME = 'blog-index';

export const createPineconeIndex = async (
  client: PineconeClient,
  indexName = INDEX_NAME,
  vectorDimension = 1536
) => {
  // Check if the index already exists
  const existingIndexes = await client.listIndexes();
  if (!existingIndexes.includes(indexName)) {
    console.log('🚀 ~ Creating pinecone index...');
    // Create the index with the specified name, vector dimension, and metric
    await client.createIndex({
      createRequest: {
        name: indexName,
        dimension: vectorDimension,
        metric: 'cosine',
      },
    });
    // Wait for the index to be ready (60 seconds is used as an example, adjust as needed)
    await new Promise((resolve) => setTimeout(resolve, 60000));
    return console.log('🚀 ~ Created Index:', indexName);
  }
  return console.log(`🚀 ~ Index ${indexName} already exists`);
};

This code sets up the Pinecone index for storing the vectorized chunks of text. It checks whether the index already exists and creates a new one if it doesn't, specifying the vector dimension (1536, matching the output size of OpenAI's Ada v2 embeddings) and the cosine similarity metric. We then wait for a minute until the Pinecone index is ready before using it.
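For reference, here's a minimal sketch of how this function might be wired up. The environment variable names and the main wrapper are my assumptions, not part of the original code; the v0.x Pinecone JS client is initialized with an API key and environment:

import { PineconeClient } from '@pinecone-database/pinecone';

const main = async () => {
  // Hypothetical setup — the env variable names are assumptions
  const client = new PineconeClient();
  await client.init({
    apiKey: process.env.PINECONE_API_KEY!,
    environment: process.env.PINECONE_ENVIRONMENT!,
  });
  // Creates 'blog-index' with 1536 dimensions if it doesn't exist yet
  await createPineconeIndex(client);
};

main().catch(console.error);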

Step 2: Updating Pinecone Index

import { Document } from 'langchain/document';

// Assumed shape of the per-document metadata (not shown in the original post)
type Metadata = { source: string };

export const updatePinecone = async (
  client: PineconeClient,
  docs: Document<Metadata>[][],
  indexName = INDEX_NAME
) => {
  const index = client.Index(indexName);
  if (!index) return console.log('🚀 ~ Index does not exist');
  console.log('🚀 ~ Updating pinecone index...');
  await createAndSaveEmbeddings(docs, index);
  return console.log('🚀 ~ Updated Index:', indexName);
};

The updatePinecone function updates the existing Pinecone index with new documents. It calls the createAndSaveEmbeddings function (defined in Step 4) to generate embeddings for the documents and store them in the index.

Step 3: Splitting Text Files into Chunks

import { DirectoryLoader } from 'langchain/document_loaders/fs/directory';
import { TextLoader } from 'langchain/document_loaders/fs/text';
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';

export const splitIntoChunks = async (path: string) => {
  const loader = new DirectoryLoader(path, {
    '.txt': (filePath) => new TextLoader(filePath),
  });
  const docs = await loader.load();

  const splitDoc = async (text: string, metadata: Record<'source', string>) => {
    const splitter = RecursiveCharacterTextSplitter.fromLanguage('markdown', {
      chunkSize: 500,
      chunkOverlap: 100,
    });
    const docOutput = await splitter.createDocuments([text], [metadata]);
    return docOutput as Document<Metadata>[];
  };

  const chunks = await Promise.all(docs.map((doc) => splitDoc(doc.pageContent, doc.metadata)));
  return chunks;
};

This code splits the text files in a specified directory into smaller chunks of 500 characters each, with a 100-character overlap. It uses the DirectoryLoader and TextLoader from LangChain to load the text files, and the RecursiveCharacterTextSplitter, configured for markdown so that it prefers to break on headings and paragraph boundaries, to split the content into chunks.
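To tie the first three steps together, a hypothetical ingestion driver could look like the sketch below (the ./data directory is a made-up example path, not from the original post):

// Sketch of the full ingestion flow — the directory path is an example
const ingest = async (client: PineconeClient) => {
  const chunks = await splitIntoChunks('./data'); // 1. load and split the .txt files
  await createPineconeIndex(client);              // 2. ensure the index exists
  await updatePinecone(client, chunks);           // 3. embed the chunks and upsert them
};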

Step 4: Generating Embeddings and Saving to Pinecone

import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import { PineconeStore } from 'langchain/vectorstores/pinecone';

const createAndSaveEmbeddings = async (docs: Document<Metadata>[][], index: VectorOperationsApi) => {
  const chunks = docs.flat();
  const embeddings = new OpenAIEmbeddings({
    maxConcurrency: 5,
  });
  console.log('🚀 ~ chunks size:', chunks.length);
  await PineconeStore.fromDocuments(chunks, embeddings, {
    pineconeIndex: index,
  });
  return;
};

The createAndSaveEmbeddings function generates embeddings for the document chunks using the specified embeddings model (e.g., text-embedding-ada-002). It then saves the embeddings to the Pinecone index using PineconeStore.
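As a quick sanity check that the vectors actually landed in the index, you could run a plain similarity search against the store. This is a sketch; the query string and the k value of 4 are arbitrary examples:

const testSearch = async (client: PineconeClient) => {
  const pineconeIndex = client.Index(INDEX_NAME);
  const vectorStore = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), { pineconeIndex });
  // Retrieve the 4 chunks closest to the query in embedding space
  const results = await vectorStore.similaritySearch('What is LangChain?', 4);
  results.forEach((doc) => console.log(doc.metadata.source, '→', doc.pageContent.slice(0, 80)));
};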

Step 5: Asking Questions to the Chatbot

import { ChatOpenAI } from 'langchain/chat_models/openai';
import { BufferMemory } from 'langchain/memory';
import { ConversationalRetrievalQAChain } from 'langchain/chains';
import { PromptTemplate } from 'langchain/prompts';

export const askAI = async (question: string, client: PineconeClient) => {
  const pineconeIndex = client.Index(INDEX_NAME);
  const vectorStore = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), { pineconeIndex });

  // OpenAI recommends replacing newlines with spaces for best embedding results
  const sanitizedQuestion = question.trim().replaceAll('\n', ' ');

  const llm = new ChatOpenAI({
    modelName: 'gpt-3.5-turbo',
    temperature: 0,
    cache: true,
  });

  const qa_template = `
  Use the following pieces of context to answer the question at the end.
  If you don't know the answer, just say "Sorry, I don't know" and don't try to make up an answer.

  {context}

  Question: {question}
  Helpful Answer in markdown format. If the answer consists of multiple parts, add relevant titles to each section. The titles should be h6 or h5 in size.
  `;

  const chain = ConversationalRetrievalQAChain.fromLLM(llm, vectorStore.asRetriever(), {
    returnSourceDocuments: true,
    memory: new BufferMemory({
      memoryKey: 'chat_history',
      inputKey: 'question',
      outputKey: 'text',
      returnMessages: true,
    }),
    verbose: true,
    qaChainOptions: {
      type: 'stuff',
      prompt: PromptTemplate.fromTemplate(qa_template),
    },
  });

  const answer = (await chain.call({
    question: sanitizedQuestion,
  })) as Answer;

  return answer;
};

The askAI function handles the conversational interaction with the chatbot. It takes a question as input, retrieves the corresponding Pinecone index and vector store, and uses the LangChain components (ChatOpenAI, BufferMemory, and ConversationalRetrievalQAChain) to generate a response. The response is returned as an Answer object, which includes the answer text and the source documents used for retrieval.
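The Answer type itself isn't shown in the post; based on the chain options above (the 'text' output key plus returnSourceDocuments), a plausible shape and call site look like this:

// Assumed shape — inferred from the chain configuration, not taken from the original code
type Answer = {
  text: string;                          // the model's markdown-formatted answer
  sourceDocuments: Document<Metadata>[]; // the chunks retrieved from Pinecone
};

const demo = async (client: PineconeClient) => {
  const answer = await askAI('How does the ingestion stage work?', client);
  console.log(answer.text);
  console.log('Sources:', answer.sourceDocuments.map((doc) => doc.metadata.source));
};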

How does the chatbot leverage contextual information and semantic search to generate accurate responses?

To generate a response, the function combines the user's question with the chat history, ensuring that the conversation context is taken into account. This combination forms a standalone question that captures the relevant information needed to generate a meaningful answer.
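Under the hood, ConversationalRetrievalQAChain does this rewriting with a "condense question" prompt. LangChain's default prompt looks roughly like the following (reproduced from memory, for illustration only):

// Approximation of LangChain's default question-condensing prompt
const CONDENSE_PROMPT = `Given the following conversation and a follow up question,
rephrase the follow up question to be a standalone question.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:`;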

The standalone question is then transformed into a vector using the same embedding model used at ingestion time, such as OpenAI's text-embedding-ada-002. This vector drives a semantic search within the Pinecone database: the goal is to find the document chunks whose vector representations are closest to that of the standalone question.
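LangChain's retriever handles this internally, but the retrieval is roughly equivalent to embedding the question yourself and querying Pinecone directly. A sketch, assuming the v0.x Pinecone client API used elsewhere in this post:

const manualSearch = async (client: PineconeClient, standaloneQuestion: string) => {
  // 1. Embed the standalone question with the same model used at ingestion time
  const embeddings = new OpenAIEmbeddings();
  const vector = await embeddings.embedQuery(standaloneQuestion);

  // 2. Ask Pinecone for the nearest chunks by cosine similarity
  const index = client.Index(INDEX_NAME);
  const response = await index.query({
    queryRequest: { vector, topK: 4, includeMetadata: true },
  });
  return response.matches;
};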

Once the relevant chunks are retrieved from the database, they, along with the standalone question, are provided to the GPT model. By incorporating the retrieved context and relevant information, the GPT model can generate accurate and contextually appropriate answers to the user's question.

Conclusion

The development of a custom knowledge-based chatbot using LangChain represents a significant leap forward in the accuracy and effectiveness of automated conversational agents. By integrating large language models with our own data and leveraging semantic search capabilities, we can create chatbots that provide highly relevant and context-aware responses. This project represents just the beginning of the potential applications and advancements that can be achieved by combining custom data with state-of-the-art AI models. With LangChain's powerful framework and the ever-evolving landscape of language models, the possibilities for future innovations in chatbot development are vast.

Remember, building a chatbot is a dynamic process, and continuous experimentation and refinement are key to optimizing its performance. By harnessing the capabilities of LangChain and the advancements in large language models, we can create intelligent chatbots that revolutionize the way we interact with AI.

Live demo: click here.

Happy building!