Large Language Models (LLMs) are remarkably good at compressing knowledge about the world into their billions of parameters.
However, LLMs have two major limitations: their knowledge is frozen at the time of their last training run, and they tend to make up information (hallucinate) when asked specific questions.
Using the Retrieval-Augmented Generation (RAG) approach, we can give pre-trained LLMs access to very specific information as additional context when answering our questions.
In this article, I'll walk through the theory and practice of implementing Google's LLM Gemma with additional RAG capabilities using the Hugging Face transformers library, LangChain, and the Faiss vector database.
An overview of the RAG pipeline is shown in the figure below, which we'll implement step by step.
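Before bringing in transformers, LangChain, and Faiss, the core RAG loop can be sketched in plain Python with toy components. This is only an illustration of the idea, not the implementation we build later: the bag-of-words "embedding" and the document list are stand-ins for a real embedding model and a vector database.

```python
# Minimal sketch of the RAG idea with toy components: embed documents,
# retrieve the ones closest to the query, and prepend them to the prompt.
import math
from collections import Counter

documents = [
    "Gemma is a family of open LLMs released by Google.",
    "Faiss is a library for efficient similarity search over dense vectors.",
    "LangChain helps chain LLM calls with retrieval and other tools.",
]

def embed(text: str) -> Counter:
    # Toy embedding: a bag-of-words count vector. A real pipeline
    # would use a sentence-embedding model here instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank all documents by similarity to the query; a vector
    # database like Faiss does this efficiently at scale.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    # The retrieved documents become additional context for the LLM.
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is Faiss used for?"))
```

In the sections that follow, each toy piece is replaced by its production counterpart: the embedding function by a Hugging Face model, the ranked list by a Faiss index, and the final prompt by a call to Gemma orchestrated through LangChain.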