NVIDIA AI Endpoints with LangChain

A shared folder with AI prompts and code snippets

From workspace: Nvidia

Team: Main

Total snippets: 10


Step 7 - Chain with Retriever, Prompt, and Model

Wraps a restored FAISS vectorstore into a retriever, then builds a LangChain-style pipeline using a system prompt, question input, and model invocation.

retriever = store.as_retriever()

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer solely based on the following context:\n<Documents>\n{context}\n</Documents>",
        ),
        ("user", ...
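The preview above is cut off, so here is a minimal sketch of what the complete chain could look like, reusing the llm from Step 2 and the store from Step 6c; the "{question}" input key, the RunnablePassthrough wiring, and the example query are assumptions rather than the original code.

# Sketch only: input key, chain wiring, and query text are assumptions
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

retriever = store.as_retriever()

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer solely based on the following context:\n<Documents>\n{context}\n</Documents>",
        ),
        ("user", "{question}"),
    ]
)

# Retrieved documents fill {context}, the raw question passes through to {question},
# then the prompt is sent to the model and the reply is parsed to a plain string.
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(chain.invoke("Tell me about the toy dataset."))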

Step 6c - Load FAISS Index and Store

Loads previously saved FAISS index and pickle file.

import faiss
import pickle

index = faiss.read_index("./toy_data/nv_embedding.index")
with open("./toy_data/nv_embedding.pkl", "rb") as f:
    store = pickle.load(f)
store.index = index

Step 6b - Process into FAISS Vectorstore and Save

Splits text into chunks, embeds them, and stores them in a FAISS index with a .pkl backup.

import faiss
from operator import itemgetter
from langchain.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import ...
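The preview stops after the imports, so the following is a hedged sketch of the rest of the step as described: split the documents into chunks, embed them with the Step 3 embedder, build the FAISS store, and save both the raw index and a .pkl backup that Step 6c reloads. The splitter choice, chunk size, and metadata layout are assumptions.

# Sketch only: splitter settings, chunk size, and metadata fields are assumptions
import faiss
import pickle
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS

text_splitter = CharacterTextSplitter(chunk_size=400, separator=" ")
docs = []
metadatas = []
for i, d in enumerate(documents):
    splits = text_splitter.split_text(d)
    docs.extend(splits)
    metadatas.extend([{"source": sources[i]}] * len(splits))

# Embed the chunks with the NVIDIA embedder from Step 3 and build the vectorstore
store = FAISS.from_texts(docs, embedder, metadatas=metadatas)

# Persist the raw FAISS index and a pickled copy of the store (index detached)
faiss.write_index(store.index, "./toy_data/nv_embedding.index")
store.index = None
with open("./toy_data/nv_embedding.pkl", "wb") as f:
    pickle.dump(store, f)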

Step 6a - Speed Test for Embedding (1 vs. 10 Docs)

Measures embedding time for one document vs. a batch of 10 to check speed difference.

import time

print("Single Document Embedding: ")
s = time.perf_counter()
e_embeddings = embedder.embed_documents([documents[0]])
elapsed = time.perf_counter() - s
print('\033[1m' + f"Executed in {elapsed:0.2f} seconds." + ...
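Since the preview is truncated, here is a hedged reconstruction of the full timing cell covering both the single-document and 10-document cases; the exact print formatting and variable names for the batch case are assumptions.

# Sketch of the single-vs-batch embedding speed test (formatting details are assumptions)
import time

print("Single Document Embedding: ")
s = time.perf_counter()
single_embedding = embedder.embed_documents([documents[0]])
elapsed = time.perf_counter() - s
print('\033[1m' + f"Executed in {elapsed:0.2f} seconds." + '\033[0m')

print("Batch of 10 Documents Embedding: ")
s = time.perf_counter()
batch_embeddings = embedder.embed_documents(documents[:10])
elapsed = time.perf_counter() - s
print('\033[1m' + f"Executed in {elapsed:0.2f} seconds." + '\033[0m')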

Step 5 - Remove Empty Lines from Documents

Cleans the list of documents by removing entries that are just a newline character (\n).

documents = [d for d in data if d != '\n']
len(data), len(documents), data[0]

Step 4 - Load Text Files for Toy Dataset

Reads .txt files from the ./toy_data folder and prepares them for ingestion into a vectorstore.

import os
from tqdm import tqdm
from pathlib import Path

# Here we read in the text data and prepare them into vectorstore
ps = os.listdir("./toy_data/")
data = []
sources = []
for p in ps:
    if p.endswith('.txt'):
        path2file = ...
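The loop body is cut off in the preview. A hedged sketch of the rest, reading each .txt file line by line so that Step 5 can later drop bare newline entries, might look like this; the exact read logic and encoding are assumptions.

# Sketch only: line-by-line reading and encoding are assumptions
import os

ps = os.listdir("./toy_data/")
data = []
sources = []
for p in ps:
    if p.endswith('.txt'):
        path2file = "./toy_data/" + p
        with open(path2file, encoding="utf-8") as f:
            # Keep one entry per line, remembering which file it came from
            for line in f.readlines():
                data.append(line)
                sources.append(path2file)

len(data), data[0]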

Step 3 - Initialize Embedding with nvolveqa_40k

Initializes the embedding model nvolveqa_40k from NVIDIA for vectorstore use.

from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings

embedder = NVIDIAEmbeddings(model="nvolveqa_40k")
# Alternatively, you can specify whether it will use the query or passage type
# embedder = NVIDIAEmbeddings(model="nvolveqa_40k", ...
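The commented alternative is truncated; based on the query/passage distinction the comment mentions, a sketch of the explicit form could look like the following, where the model_type parameter name and its value are assumptions.

# Sketch: explicit embedding mode (parameter name and value are assumptions)
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings

# "passage" for documents being indexed, "query" for incoming questions
embedder = NVIDIAEmbeddings(model="nvolveqa_40k", model_type="passage")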

Step 2 - Initialize LLM with mixtral_8x7b

Sets up and invokes the mixtral_8x7b model via the LangChain NVIDIA AI Endpoints integration using an API key.

# test run and see that you can generate a response successfully
from langchain_nvidia_ai_endpoints import ChatNVIDIA

llm = ChatNVIDIA(model="mixtral_8x7b", nvidia_api_key=nvapi_key)
result = llm.invoke("Write a ballad about...
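The invoke prompt is truncated in the source, so the topic below is only a placeholder; otherwise this sketch shows the complete test call and how to read the reply.

# Sketch of the full test call; the ballad topic is a placeholder, not the original prompt
from langchain_nvidia_ai_endpoints import ChatNVIDIA

llm = ChatNVIDIA(model="mixtral_8x7b", nvidia_api_key=nvapi_key)
result = llm.invoke("Write a ballad about LangChain.")
print(result.content)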

Step 1 - Install LangChain and Faiss

Installs langchain-core and faiss for running a local vectorstore RAG pipeline.

!pip install langchain-core==0.1.15
!pip install faiss-cpu  # replace with faiss-gpu if you are using GPU

Step 1 - Set NVIDIA API KEY for LLM Access

Sets the NVIDIA_API_KEY environment variable for querying models like mixtral_8x7b from NVIDIA AI Endpoints. Validates that the key starts with nvapi-.

import getpass
import os

## API Key can be found by going to NVIDIA NGC -> AI Foundation Models -> (some model) -> Get API Code or similar.
## 10K free queries to any endpoint (which is a lot actually).
# del os.environ['NVIDIA_API_KEY']  ## ...
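The rest of the cell is truncated; a hedged sketch of the key setup and nvapi- validation described above could look like this. The prompt wording and the reuse check are assumptions, though nvapi_key is the name the Step 2 snippet expects.

# Sketch only: prompt wording and reuse check are assumptions
import getpass
import os

if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
    print("Valid NVIDIA_API_KEY already set in the environment. Delete it to reset.")
else:
    nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
    assert nvapi_key.startswith("nvapi-"), f"{nvapi_key[:5]}... is not a valid key"
    os.environ["NVIDIA_API_KEY"] = nvapi_key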