Overview

VMSC ships with a fully local AI assistant powered by small language models (Gemma 3 1B or Qwen 2.5 1.5B). Unlike cloud-based AI services, the assistant runs entirely on your machine — no data leaves your computer, and it works offline once the model is downloaded.

The AI assistant has access to VMSC's internal tools, allowing it to do more than just answer questions. It can create and modify rules, fire test events, navigate between pages, and inspect your current configuration — all through natural language conversation.

Premium Feature

The AI Assistant is exclusively available in the Premium edition. Free users can view the AI page but cannot download models or use the chat interface.

System Requirements

The AI assistant uses GPU acceleration to run inference at interactive speeds. Here are the requirements:

Component         | Minimum                               | Recommended
RAM               | 8 GB                                  | 16 GB or more
VRAM (GPU Memory) | 2 GB                                  | 4 GB or more
GPU               | Vulkan-capable (AMD/NVIDIA/Intel)     | NVIDIA GPU with CUDA support
Disk Space        | 800 MB (Gemma 3 1B)                   | 1.5 GB (Qwen 2.5 1.5B)

Integrated Graphics

Some integrated GPUs (Intel UHD, AMD Radeon Vega) support Vulkan but have limited VRAM. The AI will run but may be noticeably slower. For best results, use a dedicated GPU.
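
The requirements table can be read as a simple three-way check. The sketch below is illustrative only: the `Machine` class and `classify` function are hypothetical names, not part of VMSC, and it assumes that either a Vulkan driver or CUDA support satisfies the GPU requirement.

```python
from dataclasses import dataclass

# Thresholds copied from the requirements table (values in GB).
MIN_RAM_GB, REC_RAM_GB = 8, 16
MIN_VRAM_GB, REC_VRAM_GB = 2, 4


@dataclass
class Machine:
    """Hypothetical description of a PC; not a VMSC data structure."""
    ram_gb: float
    vram_gb: float
    vulkan: bool        # GPU exposes a Vulkan driver
    nvidia_cuda: bool   # NVIDIA GPU with CUDA support


def classify(machine: Machine) -> str:
    """Return 'unsupported', 'minimum', or 'recommended'."""
    # Assumption: any usable GPU backend (Vulkan or CUDA) is enough to run.
    if not (machine.vulkan or machine.nvidia_cuda):
        return "unsupported"
    if machine.ram_gb < MIN_RAM_GB or machine.vram_gb < MIN_VRAM_GB:
        return "unsupported"
    meets_recommended = (
        machine.ram_gb >= REC_RAM_GB
        and machine.vram_gb >= REC_VRAM_GB
        and machine.nvidia_cuda
    )
    return "recommended" if meets_recommended else "minimum"
```

For example, a laptop with 8 GB of RAM and an integrated GPU reporting 2 GB of VRAM would land in the "minimum" tier, which matches the note above that such systems run the assistant but more slowly.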

Model Management

VMSC supports multiple AI models. You can download, switch, and manage them from the AI page.

Built-in Models

Model         | Size    | Strengths
Gemma 3 1B    | ~800 MB | Default model. Fast inference, low VRAM usage, good general capabilities.
Qwen 2.5 1.5B | ~1.2 GB | Slightly larger. Stronger reasoning and tool-use accuracy at the cost of more VRAM.

Downloading a Model

  1. Navigate to the AI page in VMSC.
  2. Select a model from the dropdown. If it has not been downloaded yet, a Download button appears.
  3. Click Download. The model file is fetched from the Vryionic CDN and saved to your local app data directory. A progress bar shows the download status.
  4. Once complete, the model is loaded automatically and the chat interface becomes available.
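
Conceptually, step 3 is a chunked copy that reports progress as it goes. The following is a minimal sketch of that idea, not VMSC's downloader: `copy_with_progress` is a hypothetical name, and the source and destination are plain file-like objects standing in for the network response and the model file.

```python
from typing import BinaryIO, Callable


def copy_with_progress(
    src: BinaryIO,
    dst: BinaryIO,
    total_bytes: int,
    on_progress: Callable[[float], None],
    chunk_size: int = 64 * 1024,
) -> int:
    """Copy src to dst in chunks, reporting the completed fraction (0.0 to 1.0)."""
    done = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:          # empty read means end of stream
            break
        dst.write(chunk)
        done += len(chunk)
        # Guard against a missing or zero content length.
        on_progress(done / total_bytes if total_bytes else 1.0)
    return done
```

A progress bar only needs the fraction passed to `on_progress`, so the same loop works whether the bytes come from a CDN or a local file.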

Custom GGUF Models

Advanced users can load any GGUF-format model file. To use a custom model:

  1. Place your .gguf file anywhere on your machine.
  2. In the AI page, click Load Custom Model and select the file.
  3. VMSC loads the model with the same inference engine it uses for the built-in models. System prompts and tool definitions are applied automatically.

Custom Model Compatibility

Custom models must be in GGUF format and small enough to fit in your available VRAM. Very large models (7B+) may cause out-of-memory errors or extremely slow inference. Stick to models under 4B parameters for the best experience.
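
Before loading a custom file, you can sanity-check it yourself. GGUF files begin with the four ASCII bytes `GGUF` followed by a little-endian 32-bit format version, so a short script can confirm the format. The size check is only a rough heuristic: the 1.2 headroom factor is an assumption to leave room for working memory, not a figure from VMSC, and both function names are hypothetical.

```python
import struct
from pathlib import Path
from typing import Optional

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def read_gguf_version(path: Path) -> Optional[int]:
    """Return the GGUF format version, or None if the file is not GGUF."""
    with path.open("rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack("<I", header[4:8])  # little-endian uint32
    return version


def likely_fits(path: Path, free_vram_bytes: int, headroom: float = 1.2) -> bool:
    """Rough check: file size times an assumed headroom factor fits in free VRAM."""
    return path.stat().st_size * headroom <= free_vram_bytes
```

A file that fails the first check is not GGUF at all. A file that fails the second may still load, but is a likely candidate for the out-of-memory errors or slow inference described above.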

Chat Interface

The AI page features a conversational chat interface. You can ask the assistant questions in plain English and it will respond with helpful information about VMSC, your configuration, and streaming in general.

Example prompts you can try:

  • "How do I set up PiShock with VMSC?"
  • "Create a rule that sends a VRChat chatbox message when someone gifts a Rose"
  • "What events does VMSC support?"
  • "Fire a test gift event from user TestViewer"
  • "Take me to the overlays page"

The chat history persists for the duration of your session. Each conversation thread maintains context so the assistant remembers earlier messages and can refine its responses based on follow-up questions.

AI Tools

The AI assistant has access to a set of internal tools that allow it to take actions inside VMSC on your behalf. When the assistant decides to use a tool, it will describe what it is doing and show the result.

Tool        | Description
Create Rule | Generates a complete rule with triggers, conditions, and actions based on your natural-language description.
Fire Event  | Simulates a stream event (gift, follow, like, chat message) to test your rules without a live connection.
Navigate    | Switches the VMSC interface to a specific page (Rules, Overlays, Settings, etc.).
Get Config  | Reads your current VMSC configuration, rules, or output settings to provide context-aware help.
List Events | Shows all available event types that can be used in rule triggers.

Transparent Tool Use

When the AI uses a tool, you will see a tool-call indicator in the chat showing which tool was invoked and what parameters were used. The assistant never takes destructive actions silently.
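
To make the idea of a visible tool call concrete, here is a minimal sketch of a dispatcher that records an indicator line before running a tool. Everything in it is hypothetical: the tool names, the argument shapes, and the transcript format are illustrative assumptions and do not describe VMSC's internal API.

```python
from typing import Any, Callable, Dict, List


# Hypothetical stand-ins for two of the tools listed above.
def fire_event(event_type: str, user: str) -> Dict[str, Any]:
    return {"fired": event_type, "user": user}


def navigate(page: str) -> Dict[str, Any]:
    return {"page": page}


TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "fire_event": fire_event,
    "navigate": navigate,
}


def run_tool_call(call: Dict[str, Any], transcript: List[str]) -> Dict[str, Any]:
    """Run one tool call, logging the tool name and parameters before the result."""
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        # Unknown tools are reported, never executed.
        transcript.append(f"[tool] {name}: unknown tool, nothing executed")
        return {"error": f"unknown tool {name!r}"}
    transcript.append(f"[tool] {name}({args})")  # indicator shown first
    result = TOOLS[name](**args)
    transcript.append(f"[result] {result}")
    return result
```

The prompt "Fire a test gift event from user TestViewer" would correspond to a call such as `{"name": "fire_event", "arguments": {"event_type": "gift", "user": "TestViewer"}}`, with both the call and its result appended to the transcript.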

CUDA Acceleration

By default, VMSC uses the Vulkan backend for GPU inference, which works across NVIDIA, AMD, and Intel GPUs. If you have an NVIDIA GPU, you can enable CUDA acceleration, which runs inference roughly 2–3x faster than Vulkan.

Installing CUDA Support

  1. Check Your GPU

    CUDA requires an NVIDIA GPU with compute capability 5.0 or higher. Most NVIDIA GPUs from the GTX 900 series onward are supported.

  2. Install CUDA Toolkit

    Download and install the NVIDIA CUDA Toolkit (version 12.x recommended). You only need the runtime libraries; the full development toolkit is not required.

  3. Enable CUDA in VMSC

    In the AI page settings, toggle Use CUDA to on. VMSC will restart the inference engine with the CUDA backend. If CUDA is not detected, VMSC will display an error and fall back to Vulkan.

Vulkan Fallback

If CUDA is not available or fails to initialize, VMSC automatically falls back to the Vulkan backend. This happens transparently — you will see a notification in the AI page indicating which backend is active. Vulkan is slower than CUDA but works on a much wider range of hardware.

Backend | GPU Support        | Relative Speed
CUDA    | NVIDIA only        | Fastest (2–3x faster than Vulkan)
Vulkan  | NVIDIA, AMD, Intel | Baseline
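
The selection and fallback behavior described in this section can be summarized in a few lines. This is a sketch under stated assumptions rather than VMSC's code: `select_backend` and the `init_cuda` probe are hypothetical, and the notice strings are placeholders.

```python
from typing import Callable, Tuple


def select_backend(use_cuda: bool, init_cuda: Callable[[], bool]) -> Tuple[str, str]:
    """Return (backend, notice).

    init_cuda is a hypothetical probe that returns True when the CUDA
    backend initializes successfully and may raise if it does not.
    """
    if not use_cuda:
        return "vulkan", "Vulkan backend active (default)."
    try:
        ok = init_cuda()
    except Exception:
        ok = False  # treat any initialization failure as "CUDA unavailable"
    if ok:
        return "cuda", "CUDA backend active."
    return "vulkan", "CUDA unavailable; fell back to Vulkan."
```

The key design point is that a failed CUDA start never leaves the assistant without a backend: every path returns a usable backend plus a message saying which one is active.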

Tips and Limitations

The AI assistant is a small local model, not a large cloud-hosted LLM. Keep these practical tips in mind:

  • Be specific — Clear, direct prompts produce better results. Instead of "make something cool," try "create a rule that triggers a VRChat chatbox message saying 'Thank you!' when someone follows."
  • One task at a time — The assistant handles single tasks best. Break complex requests into smaller steps.
  • Check the output — Always review rules and configurations created by the AI before using them in a live stream. The assistant can make mistakes, especially with complex condition logic.
  • Context window — The model has a limited context window. Very long conversations may lose earlier context. Start a new conversation if the assistant seems to forget earlier instructions.
  • No internet access — The AI runs entirely locally. It cannot fetch live data, check APIs, or access external documentation.
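
The context-window tip is easier to picture with a small example. One common way to keep a conversation within a fixed budget is to keep only the most recent messages that fit. The sketch below illustrates that general technique; it is not a description of how VMSC manages history, and it approximates message size by word count rather than real tokenization.

```python
from typing import List


def trim_history(messages: List[str], budget: int) -> List[str]:
    """Keep the newest messages whose combined word count fits the budget."""
    kept: List[str] = []
    used = 0
    for message in reversed(messages):   # walk from newest to oldest
        cost = len(message.split())      # crude stand-in for a token count
        if used + cost > budget:
            break                        # everything older is dropped
        kept.append(message)
        used += cost
    kept.reverse()                       # restore chronological order
    return kept
```

Under a scheme like this, an instruction given at the start of a long session is the first thing to fall out of the window, which is why starting a new conversation and restating the instruction helps.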

Model Improvements

VMSC updates may include improved or additional models. When a better model becomes available, you can download it from the AI page without reinstalling the application.