This tech demo allows anyone with an RTX GPU to enjoy a powerful, customizable GPT chatbot.
February 13, 2024 by Jesse Clayton
Powered by NVIDIA GPU-based servers in the cloud, chatbots are used by millions of people around the world every day. Now these groundbreaking tools are coming to Windows PCs powered by NVIDIA RTX, enabling local, fast, custom generative AI.
A demo of Chat with RTX is now available as a free download and lets users build their own chatbot that runs locally on an NVIDIA GeForce RTX 30 Series or higher GPU with at least 8GB of VRAM.
Ask me anything. Chat with RTX brings generative AI capabilities to Windows PCs powered by GeForce, using retrieval-augmented generation (RAG), NVIDIA TensorRT-LLM software, and NVIDIA RTX acceleration. Users can quickly and easily connect local files on their PCs as a dataset to an open-source large language model such as Mistral or Llama 2, enabling fast queries for contextually relevant answers.
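The RAG flow described above boils down to three steps: index local documents, retrieve the passages most relevant to a query, and prepend them as context to the prompt sent to the LLM. A minimal sketch of the idea follows; it is purely illustrative (Chat with RTX's actual pipeline uses TensorRT-LLM and real embedding models, not this toy word-overlap scoring, and all function names here are hypothetical):

```python
# Toy retrieval-augmented generation (RAG) sketch.
# Illustration only -- not Chat with RTX's actual implementation.

def score(query: str, doc: str) -> int:
    """Toy relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents that best match the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Prepend the retrieved context to the user's question."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Example: notes on disk become the chatbot's knowledge base.
notes = [
    "Dinner in Las Vegas with Alex: the steakhouse on the Strip was great.",
    "Meeting notes: Q3 budget review scheduled for Friday.",
]
prompt = build_prompt("Which Las Vegas restaurant does my partner recommend?", notes)
```

A real pipeline replaces word overlap with vector embeddings and feeds the assembled prompt to a local GPU-accelerated model; the structure, however, is the same.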
Instead of searching through notes or saved content, users can simply type a query. For example, a user can ask: "Which Las Vegas restaurant does my partner recommend?" Chat with RTX scans the local files the user points it to and provides an answer with context.
Chat with RTX supports a variety of file formats, including .txt, .pdf, .doc/.docx, and .xml. Point the application at the folder containing these files, and Chat with RTX will load them into its library in seconds.
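"Pointing the application at a folder" amounts to enumerating the files with supported extensions before ingesting them. A hedged sketch of that enumeration step (the actual ingestion logic in Chat with RTX is not public; `find_documents` is a hypothetical name):

```python
from pathlib import Path

# File types Chat with RTX can ingest, per the article above.
SUPPORTED = {".txt", ".pdf", ".doc", ".docx", ".xml"}

def find_documents(folder: str) -> list[Path]:
    """Recursively collect files whose extension is supported."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )
```

Each discovered file would then be parsed and indexed so its contents can be retrieved at query time.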
Users can also add information from YouTube videos and playlists. By adding this content to Chat with RTX, users can integrate that knowledge into the chatbot for contextual queries. For example, they can ask for travel recommendations based on content posted by a favorite travel influencer, or get quick tutorials and how-to tips from top educational resources.
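Indexing a video typically means fetching its transcript and splitting it into overlapping chunks before embedding, so that retrieval can return a self-contained passage. A simplified sketch of the chunking step (the chunk size and overlap here are illustrative assumptions, not Chat with RTX's actual parameters):

```python
def chunk_transcript(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split a transcript into chunks of `size` words, where each
    chunk shares `overlap` words with the previous one."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunk = words[start:start + size]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + size >= len(words):
            break
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary recoverable from at least one chunk, which improves retrieval quality.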
Chat with RTX adds knowledge from YouTube video content to its query results. Because Chat with RTX runs locally on Windows RTX PCs and workstations, and the user's data stays on the device, results are delivered quickly. Unlike LLM services that rely on the cloud, Chat with RTX lets users work with sensitive data on their own PCs without sharing it with a third party or even connecting to the Internet.
In addition to a GeForce RTX 30 Series GPU or higher (with at least 8GB of VRAM), Chat with RTX requires Windows 10 or 11 and the latest NVIDIA GPU drivers.
Editor's note: Chat with RTX currently has an issue where installation fails if the user selects a different installation directory. This will be fixed in a future release. For now, users should use the default installation directory (C:\Users\<username>\AppData\Local\NVIDIA\ChatWithRTX).
Developing LLM-based applications with RTX
Chat with RTX shows the potential of accelerating LLMs with RTX GPUs. The app was built using the TensorRT-LLM RAG developer reference project on GitHub. Developers can use this reference project to develop and deploy their own RAG-based applications for RTX, accelerated by TensorRT-LLM. Learn more about how to build LLM-based applications.