Multimodal Models from NVIDIA AI Endpoints with LangChain Agent

A shared folder with AI prompts and code snippets

From workspace: Nvidia

Team: Main

Total snippets: 10


Step 7 - Launch Gradio Interface for LangChain Agent

Wrap the LangChain agent inside a Gradio app so users can upload an image and get a caption response.

import gradio as gr

ImageCaptionApp = gr.Interface(
    fn=agent,
    inputs=[gr.Image(label="Upload image", type="filepath")],
    outputs=[gr.Textbox(label="Caption")],
    title="Image Captioning with langchain agent",
    description="combine...
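The snippet above is cut off mid-call. A complete version might look like the sketch below; the description text and the launch arguments are assumptions, not the original snippet's values.

import gradio as gr

# Minimal sketch: wrap the LangChain agent in a Gradio Interface and launch it.
# The description string and share=True are assumed values.
ImageCaptionApp = gr.Interface(
    fn=agent,
    inputs=[gr.Image(label="Upload image", type="filepath")],
    outputs=[gr.Textbox(label="Caption")],
    title="Image Captioning with langchain agent",
    description="Upload an image and the LangChain agent will return a caption for it.",
)

ImageCaptionApp.launch(share=True)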

Step 6 - Test LangChain Agent with Image Input

Use the initialized agent to respond to a user query about the contents of a specific image.

user_question = "What is in this image?" img_path="./toy_data/jordan.png" response = agent.run(f'{user_question}, this is the image path: {img_path}') print(response)

Step 5 - Initialize LangChain Agent with Tools

Set up a LangChain agent with image captioning and chart parsing tools using memory and parsing configuration.

# initialize the agent
tools = [ImageCaptionTool(), TabularPlotTool()]

conversational_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=5,
    return_messages=True
)

agent = initialize_agent(
    ...
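The initialize_agent call is truncated above. A complete setup might look like the following sketch; the agent type and keyword arguments are assumptions, and it relies on the llm from the Mixtral setup snippet and the tool classes from Step 4.

from langchain.agents import initialize_agent
from langchain.memory import ConversationBufferWindowMemory

# Tools defined in Step 4 (assumed available in this session).
tools = [ImageCaptionTool(), TabularPlotTool()]

# Short-term memory so the agent remembers the last few exchanges.
conversational_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=5,
    return_messages=True
)

# Assumed agent type and settings; adjust to match the original notebook.
agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,
    llm=llm,
    max_iterations=5,
    verbose=True,
    memory=conversational_memory,
    handle_parsing_errors=True,
    early_stopping_method='generate'
)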

Step 4 - Deploy Image Caption Tool (Neva)

Wrap the NVIDIA NeVA API in a BaseTool class that returns an image caption.

# Set up Prerequisites for Image Captioning App User Interface
import os
import io
import IPython.display
from PIL import Image
import base64
import requests
import gradio as gr
from langchain.tools import BaseTool
from transformers import...
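The snippet above only shows the imports before being cut off. A tool that wraps the NeVA call might look like this sketch; the class name matches Step 5, but the name, description, and method bodies are assumptions, and it reuses the nv_api_response helper from the Step 2 snippet.

from langchain.tools import BaseTool

class ImageCaptionTool(BaseTool):
    # The name and description guide the agent's tool selection.
    name: str = "Image captioner"
    description: str = (
        "Use this tool when given the path to an image that you want to describe. "
        "It will return a caption describing the image."
    )

    def _run(self, img_path: str) -> str:
        # Delegate to the NeVA helper defined in Step 2 (assumed available here).
        return nv_api_response("describe the image", img_path)

    def _arun(self, img_path: str):
        raise NotImplementedError("This tool does not support async execution.")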

Step 3 - Import Libraries for Image Captioning UI

Import all required Python packages for building the image captioning app with Gradio.

# Set up Prerequisites for Image Captioning App User Interface
import os
import io
import IPython.display
from PIL import Image
import base64
import requests
import gradio as gr

Step 3 - Mixtral 8x7b LLM Setup

Initialize ChatNVIDIA with the mixtral_8x7b model and a valid NVIDIA API key.

# test run and see that you can generate a response successfully
from langchain_nvidia_ai_endpoints import ChatNVIDIA

llm = ChatNVIDIA(model="mixtral_8x7b", nvidia_api_key=nvapi_key)
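To confirm the endpoint is reachable, a quick test call could look like the sketch below; the prompt is an assumption.

# Quick smoke test (assumed prompt); invoke returns a chat message whose
# .content attribute holds the generated text.
result = llm.invoke("Write a one-sentence greeting.")
print(result.content)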

Step 2 - Describe Image Using NeVA

Define prompt and path to use the downloaded image with the NeVA API response function.

img_path="./toy_data/jordan.png" prompt="describe the image" out=nv_api_response(prompt, img_path)

Step 2 - Download Test Image

Download a test image of sneakers to be used with the NeVA image response function.

!wget "https://docs.google.com/uc?export=download&id=1ZzPBBFkYu-jzz1iz3S6USkMk4nUN9vwv" -O ./toy_data/jordan.png

Step 2 - NeVA API Image Response

Wrap the NeVA API call into a function and generate a response by sending an image.

import openai, httpx, sys
import base64, io
from PIL import Image

def img2base64_string(img_path):
    image = Image.open(img_path)
    if image.width > 800 or image.height > 800:
        image.thumbnail((800, 800))
    buffered = io.BytesIO()
    ...
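The helper is cut off above. A complete version might look like the following sketch; the endpoint URL, payload shape, and generation parameters follow NVIDIA's published NeVA-22B REST examples and are assumptions here, not the original snippet's exact values.

import base64, io, os
import requests
from PIL import Image

def img2base64_string(img_path):
    # Downscale large images before encoding to keep the request payload small.
    image = Image.open(img_path)
    if image.width > 800 or image.height > 800:
        image.thumbnail((800, 800))
    buffered = io.BytesIO()
    image.save(buffered, format="PNG")
    return base64.b64encode(buffered.getvalue()).decode()

def nv_api_response(prompt, img_path):
    # Assumed endpoint and payload format based on NVIDIA's NeVA-22B examples.
    invoke_url = "https://ai.api.nvidia.com/v1/vlm/nvidia/neva-22b"
    headers = {
        "Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}",
        "Accept": "application/json",
    }
    image_b64 = img2base64_string(img_path)
    payload = {
        "messages": [
            {
                "role": "user",
                "content": f'{prompt} <img src="data:image/png;base64,{image_b64}" />',
            }
        ],
        "max_tokens": 512,
        "temperature": 0.2,
    }
    response = requests.post(invoke_url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]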

Step 1 - Export NVIDIA API Key

Check whether a valid NVIDIA API key is set in the environment, and prompt the user to enter one securely if not.

import getpass
import os

## API Key can be found by going to NVIDIA NGC -> AI Foundation Models -> (some model) -> Get API Code or similar.
## 10K free queries to any endpoint (which is a lot actually).
# del os.environ['NVIDIA_API_KEY']  ##...
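The check itself is truncated above. A minimal sketch of the key handling is shown below; the assumption that valid keys start with "nvapi-" is based on NVIDIA API key formatting, not on the original snippet.

import getpass
import os

# If a plausible key is already set, keep it; otherwise prompt for one securely.
if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
    print("Valid NVIDIA_API_KEY already in environment. Delete it to reset.")
else:
    nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
    assert nvapi_key.startswith("nvapi-"), f"{nvapi_key[:5]}... is not a valid key"
    os.environ["NVIDIA_API_KEY"] = nvapi_key

# Later snippets refer to the key as nvapi_key.
nvapi_key = os.environ["NVIDIA_API_KEY"]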