Large language models (LLMs) such as GPT-4 and Claude are remarkably powerful, but they suffer from a fundamental limitation: their knowledge is frozen at training time. **Retrieval-Augmented Generation (RAG)** addresses exactly this problem by combining the generative power of an LLM with the ability to retrieve information from external sources.

## The Problem: LLM Limitations

1. **Static knowledge**: An LLM only knows what it saw during training.
2. **Hallucinations**: When an LLM does not know the answer, it tends to make one up.
3. **No access to private data**: A generic LLM has no access to your company's internal documentation.

## What Is RAG?

```mermaid
graph LR
    User["User"] -- "Question" --> Retriever
    Retriever -- "Search relevant\ndocuments" --> VectorStore["Vector Store"]
    VectorStore -- "Relevant\ndocuments" --> Retriever
    Retriever -- "Context + Question" --> LLM
    LLM -- "Grounded\nresponse" --> User
```

## How RAG Works in Detail

### Phase 1: Indexing

```mermaid
graph TD
    A["Documents\n(PDF, HTML, MD, DB)"] --> B["Document Loader"]
    B --> C["Text Splitter"]
    C --> D["Text Chunks"]
    D --> E["Embedding Model"]
    E --> F["Numerical Vectors"]
    F --> G["Vector Store\n(ChromaDB, Pinecone, FAISS)"]
```

### Phase 2: Retrieval + Generation

At query time, the user's question is embedded with the same embedding model, the most similar chunks are fetched from the vector store, and those chunks are passed to the LLM as context alongside the question so that it can generate a grounded answer.

## Building a RAG Pipeline with LangChain

```bash
pip install langchain langchain-openai langchain-community chromadb pypdf
```

```python
from langchain_community.document_loaders import PyPDFLoader, WebBaseLoader, DirectoryLoader, TextLoader

# Load a PDF manual; WebBaseLoader, DirectoryLoader and TextLoader cover other source types.
pdf_loader = PyPDFLoader("docs/manual.pdf")
pdf_docs = pdf_loader.load()
all_docs = pdf_docs
```

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split documents into overlapping chunks (sizes are measured in characters by default).
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(all_docs)
```

```python
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# Embed each chunk and persist the vectors in a local ChromaDB collection.
embedding_model = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embedding_model,
    persist_directory="./chroma_db",
)
```

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(model="gpt-4o", temperature=0)
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 4})

prompt = ChatPromptTemplate.from_template("""
Answer the question based only on the provided context.
If the context does not contain enough information, say you don't know.

Context:
{context}

Question: {question}

Answer:
""")

def format_docs(docs):
    # Join the retrieved chunks into a single context string.
    return "\n\n".join(doc.page_content for doc in docs)

# LCEL chain: retrieve context, fill in the prompt, call the LLM, parse the text output.
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Example usage:
# answer = rag_chain.invoke("How do I update the firmware?")
```

## Advanced RAG Techniques

These include Multi-Query Retrieval, contextual compression, hybrid search, and conversational RAG with memory; minimal sketches of the first two follow below.
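As a concrete illustration of the first technique, the sketch below wraps the retriever from the pipeline above in LangChain's `MultiQueryRetriever`. It reuses the `vectorstore`, `llm`, `prompt`, and `format_docs` objects defined earlier; variable names such as `multi_query_rag_chain` are illustrative, and this is a minimal sketch rather than a production setup.

```python
from langchain.retrievers.multi_query import MultiQueryRetriever

# Multi-query retrieval: the LLM generates several rephrasings of the user's
# question, each rephrasing is searched separately, and the unique documents
# from all searches are merged before generation.
multi_query_retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(search_kwargs={"k": 4}),
    llm=llm,
)

# Drop-in replacement for the plain retriever in the chain above.
multi_query_rag_chain = (
    {"context": multi_query_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
```

Contextual compression can be layered on top of any retriever in a similar way. The sketch below, again assuming the `retriever` and `llm` from the pipeline above, uses `LLMChainExtractor` to keep only the passages of each retrieved chunk that are relevant to the question; the example query is illustrative, and the extra LLM call per retrieved document adds latency and cost.

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

# Contextual compression: an LLM-based extractor trims each retrieved chunk
# down to the passages that actually relate to the question.
compressor = LLMChainExtractor.from_llm(llm)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=retriever,
)

# Example query (illustrative): returns trimmed documents instead of full chunks.
compressed_docs = compression_retriever.invoke("What does the manual say about resetting the device?")
```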
## Best Practices

1. Experiment with chunk size (roughly 500-1500 tokens).
2. Use document metadata (e.g. source and page) for filtering and citations.
3. Evaluate quality with frameworks such as [RAGAS](https://docs.ragas.io/).
4. Build a pipeline for keeping the indexed documents up to date.
5. Add a re-ranker after the initial retrieval.

## Conclusion

RAG has become the standard architecture for building AI applications that need access to specific knowledge. LangChain significantly simplifies the implementation.