To set up our knowledge base, we begin by installing and importing the necessary Python libraries:
```python
!pip install llama-index
!pip install llama-index-embeddings-huggingface
!pip install peft
!pip install auto-gptq
!pip install optimum
!pip install bitsandbytes
# if not running on Colab, make sure transformers is installed as well
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.postprocessor import SimilarityPostprocessor
```
Next, we configure our knowledge base by defining our embedding model, chunk size, and chunk overlap. We use the ~33M parameter bge-small-en-v1.5 embedding model from BAAI, which is available on the Hugging Face Hub; other embedding models are listed on Hugging Face's MTEB text embedding leaderboard. The following snippet sets up the knowledge base:
```python
# ~33M parameter embedding model from BAAI, pulled from the Hugging Face Hub
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
# no LLM is needed here; we only use the retrieval side of the pipeline
Settings.llm = None
# chunking parameters used when splitting documents
Settings.chunk_size = 256
Settings.chunk_overlap = 25
```
Then, we load our source documents from a folder called "articles", which contains PDF versions of 3 Medium articles on fat tails. Each PDF page is loaded as a separate document; we remove any document containing irrelevant boilerplate text (for example, the "Member-only story" banner), then chunk the remaining documents and store them in a vector index. The code snippet below demonstrates this process:
```python
documents = SimpleDirectoryReader("articles").load_data()

# drop pages that contain boilerplate rather than article content
# (iterate over a copy so removing items doesn't skip any documents)
for doc in list(documents):
    if "Member-only story" in doc.text:
        documents.remove(doc)
        continue
    if "The Data Entrepreneurs" in doc.text:
        documents.remove(doc)
        continue
    if " min read" in doc.text:
        documents.remove(doc)

# chunk the remaining documents and store them in a vector index
index = VectorStoreIndex.from_documents(documents)
```
Next, we create a retriever using LlamaIndex's VectorIndexRetriever and define a query engine that uses it to return relevant chunks for a user query, discarding any chunk whose similarity score falls below 0.5. The following code snippet sets up the retriever and query engine:
```python
# return the top 3 most similar chunks for each query
top_k = 3
retriever = VectorIndexRetriever(index=index, similarity_top_k=top_k)
# keep only chunks whose similarity score is above 0.5
query_engine = RetrieverQueryEngine(retriever=retriever, node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.5)])
```
With our knowledge base and retrieval system in place, we can pass a technical question to the query engine and retrieve the relevant chunks. The code snippet below demonstrates how to use the query engine:
```python
query = "What is fat-tailedness?"
response = query_engine.query(query)
```
The above code returns a response object containing the relevant chunks of text, which can be further processed and formatted for readability.
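For example, a minimal sketch of that post-processing could concatenate the retrieved chunks into a single context string. This assumes the retrieved chunks are exposed on response.source_nodes, as in LlamaIndex's standard response object:
```python
# assumes response.source_nodes holds the retrieved chunks (NodeWithScore objects)
context = "Context:\n"
for node in response.source_nodes:
    context += node.text + "\n\n"
print(context)
```
Iterating over response.source_nodes, rather than indexing up to top_k, also handles the case where the similarity cutoff filters out some of the retrieved chunks.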