Large language models (LLMs) such as GPT-4 and Claude are extraordinarily powerful, but they suffer from a fundamental limitation: their knowledge is frozen at training time. **Retrieval-Augmented Generation (RAG)** addresses exactly this problem by combining the generative power of LLMs with the ability to retrieve information from external sources.

## The Problem: LLM Limitations

1. **Static knowledge**: An LLM knows only what it saw during training.
2. **Hallucinations**: When an LLM does not know the answer, it tends to fabricate one.
3. **No access to private data**: A generic LLM cannot see your company's internal documentation.

## What Is RAG?

RAG is an architecture that enriches the prompt sent to an LLM with information retrieved from an external knowledge base.

```mermaid
graph LR
    User["User"] -- "Question" --> Retriever
    Retriever -- "Search relevant\ndocuments" --> VectorStore["Vector Store"]
    VectorStore -- "Relevant\ndocuments" --> Retriever
    Retriever -- "Context + Question" --> LLM
    LLM -- "Grounded\nresponse" --> User
```

## How RAG Works in Detail

### Phase 1: Indexing (Document Ingestion)

```mermaid
graph TD
    A["Documents\n(PDF, HTML, MD, DB)"] --> B["Document Loader"]
    B --> C["Text Splitter"]
    C --> D["Text Chunks"]
    D --> E["Embedding Model"]
    E --> F["Numerical Vectors"]
    F --> G["Vector Store\n(ChromaDB, Pinecone, FAISS)"]
```

### Phase 2: Retrieval + Generation

1. The question is converted into an embedding.
2. The vector store finds the most similar chunks.
3. The retrieved chunks are inserted into the prompt as context.
4. The LLM generates a response grounded in that context.

## Building a RAG Pipeline with LangChain

```bash
pip install langchain langchain-openai langchain-community chromadb
```

### Step 1: Loading Documents

```python
from langchain_community.document_loaders import (
    PyPDFLoader,
    WebBaseLoader,
    DirectoryLoader,
    TextLoader,
)

# Load a PDF manual
pdf_loader = PyPDFLoader("docs/manual.pdf")
pdf_docs = pdf_loader.load()

# Load a web page
web_loader = WebBaseLoader("https://docs.example.com/guide")
web_docs = web_loader.load()

# Load all markdown files from a directory
dir_loader = DirectoryLoader("./knowledge_base", glob="**/*.md", loader_cls=TextLoader)
md_docs = dir_loader.load()

all_docs = pdf_docs + web_docs + md_docs
```

### Step 2: Splitting Documents into Chunks

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Overlapping chunks preserve context across chunk boundaries
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    separators=["\n\n", "\n", ". ", " ", ""],
)

chunks = text_splitter.split_documents(all_docs)
```

### Step 3: Creating Embeddings and the Vector Store

```python
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

embedding_model = OpenAIEmbeddings(model="text-embedding-3-small")

# Embed every chunk and persist the index to disk
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embedding_model,
    persist_directory="./chroma_db",
)
```

### Step 4: Creating the Retriever

```python
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 4},  # return the 4 most similar chunks
)
```

### Step 5: Building the RAG Chain

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(model="gpt-4o", temperature=0)

prompt = ChatPromptTemplate.from_template("""
Answer the question based only on the
provided context.
If the context does not contain enough information, say you don't know.

Context:
{context}

Question: {question}

Answer:
""")

def format_docs(docs):
    # Join retrieved chunks into a single context string
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

response = rag_chain.invoke("How does authentication work in the system?")
print(response)
```

## Advanced RAG Techniques

### Multi-Query Retrieval

```python
from langchain.retrievers import MultiQueryRetriever

# The LLM rewrites the question into several variants and merges the results
multi_retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),
    llm=llm,
)
```

### Contextual Compression

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

# Keep only the passages of each retrieved chunk that are relevant to the query
compressor = LLMChainExtractor.from_llm(llm)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=retriever,
)
```

### Hybrid Search

```python
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever

# Keyword-based retrieval (exact term matches)
bm25_retriever = BM25Retriever.from_documents(chunks)
bm25_retriever.k = 4

# Semantic retrieval (embedding similarity)
semantic_retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Combine both signals with weights
hybrid_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, semantic_retriever],
    weights=[0.4, 0.6],
)
```

## Best Practices

1. **Choose the right chunk size**: Experiment with different sizes (500-1500 tokens).
2. **Use document metadata**: Add source, date, and category as metadata.
3. **Evaluate quality**: Use frameworks such as [RAGAS](https://docs.ragas.io/).
4. **Handle document updates**: Implement a re-ingestion pipeline.
5. **Add a re-ranker**: After the initial retrieval, use a re-ranking model to reorder the candidates.

## Conclusion

RAG has become the standard architecture for building AI applications that need access to specific, up-to-date knowledge, and LangChain significantly simplifies the implementation.

**Next steps:**
- **Experiment locally**: Start with ChromaDB and a handful of documents.
- **Explore LangSmith**: Use [LangSmith](https://smith.langchain.com/) for monitoring.
- **Try different embedding models**: Compare models such as `text-embedding-3-small` and `text-embedding-3-large`.
- **Read the documentation**: The [LangChain documentation](https://python.langchain.com/docs/) is excellent.
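To see the core retrieval idea with no API key or vector database at all, here is a toy, self-contained sketch of the similarity search a vector store performs. The bag-of-words "embedding" and all names (`embed`, `cosine`, `retrieve`) are illustrative stand-ins, not LangChain APIs — real pipelines use learned dense embeddings instead.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a sparse bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query -- the 'R' in RAG."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Authentication uses OAuth2 bearer tokens issued by the identity service.",
    "The billing module exports monthly invoices as PDF files.",
    "Tokens expire after one hour and must be refreshed via the auth endpoint.",
]

# The two token-related chunks outrank the billing chunk
for chunk in retrieve("How does authentication with tokens work?", chunks):
    print(chunk)
```

In a real pipeline the generation step would then paste these top-k chunks into the prompt as context, exactly as the chain in Step 5 does.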