From Zero to AI Hero: Create Your Custom Chatbot with LlamaIndex
LlamaIndex is an open-source framework that lets you connect data sources to large language models (LLMs). It’s used to build applications like chatbots and knowledge agents.
LlamaIndex has the following features:
- Data integration: Integrates data from a variety of sources, including vector stores, document stores, graph stores, and SQL databases
- Querying: Orchestrates workflows for querying data, including prompt chains, advanced RAG, and agents
- Performance evaluation: Measures retrieval and LLM response quality
- Agent architecture: Breaks down complex questions, plans out tasks, and calls APIs
Now, let us try LlamaIndex on some sample data.
I have exported my LinkedIn resume as a PDF, and we will use it as the sample input data to query.
Open Google Colab and create a new notebook.
Let us write Python code to query the resume. We will use the following packages:
- llama-index to get access to its querying functions
- openai to send the queries to an LLM model
- pypdf to interact with PDF files
!pip install llama-index openai pypdf
import os
os.environ["OPENAI_API_KEY"] = "<YOUR_API_KEY>"
Next, you need to create an index on this input data. Place the resume PDF in a folder named Private-Data so the reader can load it:
from llama_index.core import TreeIndex, SimpleDirectoryReader
resume = SimpleDirectoryReader("Private-Data").load_data()
new_index = TreeIndex.from_documents(resume)
Now we are ready to query the resume using the query engine:
query_engine = new_index.as_query_engine()
response = query_engine.query("When did Ashish graduate?")
print(response)
print(query_engine.query("What certifications does Ashish have?"))
print(query_engine.query("What skills does Ashish have?"))
As you can see, the model does a reasonably good job and provides accurate answers so far.
Let us ask more questions, but this time via the chat engine:
chat_engine = new_index.as_chat_engine()
print(chat_engine.chat("Ashish was in which company in 2020?"))
print(chat_engine.chat("After Schlumberger which companies did he work for?"))
As you can see from the output, the model hallucinates and provides an inaccurate answer; it should have mentioned Acquia.
This hands-on tutorial demonstrated a simple AI chatbot over your personal data using LlamaIndex.
You can use a combination of Ollama and Mistral (instead of OpenAI) to send queries to a local LLM, with no API key and no rate limits.
Please note that index creation is a time-consuming process, and the time grows quickly as the input data gets larger. In such cases, consider creating the index once and persisting it to the file system:
new_index.storage_context.persist()
Once that is done, we can quickly load the storage context and rebuild the index whenever it is needed:
from llama_index.core import StorageContext, load_index_from_storage
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
That's it for this tutorial. If you liked my work, consider giving a few claps and follow me on LinkedIn for more such updates.