Overview
VMSC ships with a fully local AI assistant powered by small language models (Gemma 3 1B or Qwen 2.5 1.5B). Unlike cloud-based AI services, the assistant runs entirely on your machine — no data leaves your computer, and it works offline once the model is downloaded.
The AI assistant has access to VMSC's internal tools, allowing it to do more than just answer questions. It can create and modify rules, fire test events, navigate between pages, and inspect your current configuration — all through natural language conversation.
The AI Assistant is exclusively available in the Premium edition. Free users can view the AI page but cannot download models or use the chat interface.
System Requirements
The AI assistant uses GPU acceleration to run inference at interactive speeds. Here are the requirements:
| Component | Minimum | Recommended |
|---|---|---|
| RAM | 8 GB | 16 GB or more |
| VRAM (GPU Memory) | 2 GB | 4 GB or more |
| GPU | Vulkan-capable (AMD/NVIDIA/Intel) | NVIDIA GPU with CUDA support |
| Disk Space | 800 MB (Gemma 3 1B) | 1.5 GB (Qwen 2.5 1.5B) |
Some integrated GPUs (Intel UHD, AMD Radeon Vega) support Vulkan but have limited VRAM. The AI will run but may be noticeably slower. For best results, use a dedicated GPU.
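As a rough illustration of how the table above can be read, the following sketch sorts a machine into one of three tiers by RAM and VRAM. The thresholds come from the table; the function and dictionary names are hypothetical and are not part of VMSC.

```python
# Thresholds copied from the requirements table; names are illustrative only.
MINIMUM = {"ram_gb": 8, "vram_gb": 2}
RECOMMENDED = {"ram_gb": 16, "vram_gb": 4}


def meets(spec, ram_gb, vram_gb):
    """True when both RAM and VRAM reach the given tier."""
    return ram_gb >= spec["ram_gb"] and vram_gb >= spec["vram_gb"]


def classify_hardware(ram_gb, vram_gb):
    """Return 'recommended', 'minimum', or 'below minimum'."""
    if meets(RECOMMENDED, ram_gb, vram_gb):
        return "recommended"
    if meets(MINIMUM, ram_gb, vram_gb):
        return "minimum"
    return "below minimum"
```

Note that both values must reach a tier: a machine with plenty of RAM but only 2 GB of VRAM still counts as "minimum".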
Model Management
VMSC supports multiple AI models. You can download, switch, and manage them from the AI page.
Built-in Models
| Model | Size | Strengths |
|---|---|---|
| Gemma 3 1B | ~800 MB | Default model. Fast inference, low VRAM usage, good general capabilities. |
| Qwen 2.5 1.5B | ~1.2 GB | Slightly larger. Stronger reasoning and tool-use accuracy at the cost of more VRAM. |
Downloading a Model
- Navigate to the AI page in VMSC.
- Select a model from the dropdown. If it has not been downloaded yet, a Download button appears.
- Click Download. The model file is fetched from the Vryionic CDN and saved to your local app data directory. A progress bar shows the download status.
- Once complete, the model is loaded automatically and the chat interface becomes available.
Custom GGUF Models
Advanced users can load any GGUF-format model file. To use a custom model:
- Place your .gguf file anywhere on your machine.
- In the AI page, click Load Custom Model and select the file.
- VMSC will load the model using the same inference engine. System prompts and tool definitions are applied automatically.
Custom models must be in GGUF format and small enough to fit in your available VRAM. Very large models (7B+) may cause out-of-memory errors or extremely slow inference. Stick to models under 4B parameters for the best experience.
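Before loading a custom file, it can help to confirm that it really is a GGUF file and to see how large it is. The sketch below relies only on the GGUF header layout (the four-byte magic `GGUF` followed by a little-endian 32-bit version number). It is an independent check, not how VMSC itself validates models, and the function names are assumptions made for this example.

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def read_gguf_version(path):
    """Return the GGUF format version, or None if the file is not GGUF."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # The version is a little-endian unsigned 32-bit integer after the magic.
    (version,) = struct.unpack("<I", header[4:8])
    return version


def file_size_gb(path):
    """File size in gigabytes (10^9 bytes); a rough floor on memory needed."""
    return Path(path).stat().st_size / 1e9
```

A file whose size is close to or above your available VRAM is a likely candidate for the out-of-memory errors described above.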
Chat Interface
The AI page features a conversational chat interface. You can ask the assistant questions in plain English and it will respond with helpful information about VMSC, your configuration, and streaming in general.
Example prompts you can try:
- "How do I set up PiShock with VMSC?"
- "Create a rule that sends a VRChat chatbox message when someone gifts a Rose"
- "What events does VMSC support?"
- "Fire a test gift event from user TestViewer"
- "Take me to the overlays page"
The chat history persists for the duration of your session. Each conversation thread maintains context so the assistant remembers earlier messages and can refine its responses based on follow-up questions.
AI Tools
The AI assistant has access to a set of internal tools that allow it to take actions inside VMSC on your behalf. When the assistant decides to use a tool, it will describe what it is doing and show the result.
| Tool | Description |
|---|---|
| Create Rule | Generates a complete rule with triggers, conditions, and actions based on your natural-language description. |
| Fire Event | Simulates a stream event (gift, follow, like, chat message) to test your rules without a live connection. |
| Navigate | Switches the VMSC interface to a specific page (Rules, Overlays, Settings, etc.). |
| Get Config | Reads your current VMSC configuration, rules, or output settings to provide context-aware help. |
| List Events | Shows all available event types that can be used in rule triggers. |
When the AI uses a tool, you will see a tool-call indicator in the chat showing which tool was invoked and what parameters were used. The assistant never takes destructive actions silently.
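To make the tool-call flow concrete, here is a minimal sketch of a dispatcher that maps a tool name to a handler and returns a record containing the tool name, parameters, and result, similar in spirit to the tool-call indicator. Everything in it (`ToolCall`, `TOOLS`, `run_tool`, and the handler signatures) is hypothetical. VMSC's internal tool API is not documented here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ToolCall:
    """A tool invocation requested by the model (hypothetical shape)."""
    name: str
    params: Dict[str, Any] = field(default_factory=dict)


# Stand-in handlers; the real tools act on the application itself.
def fire_event(event_type: str, user: str) -> str:
    return f"Simulated {event_type} event from {user}"


def navigate(page: str) -> str:
    return f"Switched to {page} page"


TOOLS: Dict[str, Callable[..., str]] = {
    "fire_event": fire_event,
    "navigate": navigate,
}


def run_tool(call: ToolCall) -> dict:
    """Dispatch a tool call and return a record a chat UI could display."""
    handler = TOOLS.get(call.name)
    if handler is None:
        return {"tool": call.name, "params": call.params, "ok": False,
                "result": f"Unknown tool: {call.name}"}
    return {"tool": call.name, "params": call.params, "ok": True,
            "result": handler(**call.params)}
```

For example, `run_tool(ToolCall("navigate", {"page": "Overlays"}))` yields a record whose `result` field reads "Switched to Overlays page", while an unregistered tool name produces a record with `ok` set to `False` instead of raising.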
CUDA Acceleration
By default, VMSC uses the Vulkan backend for GPU inference, which works across NVIDIA, AMD, and Intel GPUs. However, if you have an NVIDIA GPU, you can enable CUDA acceleration for significantly faster inference speeds.
Installing CUDA Support
- Check Your GPU: CUDA requires an NVIDIA GPU with compute capability 5.0 or higher. Most NVIDIA GPUs from the GTX 900 series onward are supported.
- Install CUDA Toolkit: Download and install the NVIDIA CUDA Toolkit (version 12.x recommended). You only need the runtime libraries; the full development toolkit is not required.
- Enable CUDA in VMSC: In the AI page settings, toggle Use CUDA to on. VMSC will restart the inference engine with the CUDA backend. If CUDA is not detected, VMSC will display an error and fall back to Vulkan.
Vulkan Fallback
If CUDA is not available or fails to initialize, VMSC automatically falls back to the Vulkan backend. No action is needed on your part; a notification in the AI page indicates which backend is active. Vulkan is slower than CUDA but works on a much wider range of hardware.
| Backend | GPU Support | Relative Speed |
|---|---|---|
| CUDA | NVIDIA only | Fastest (2–3x faster than Vulkan) |
| Vulkan | NVIDIA, AMD, Intel | Baseline |
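The fallback behavior can be pictured as a simple try-then-fall-back pattern. The sketch below is an illustration under assumptions, not VMSC's actual code: `init_cuda` and `init_vulkan` are hypothetical callables standing in for whatever initializes each backend, and a `RuntimeError` is assumed to signal a failed CUDA start.

```python
def select_backend(use_cuda, init_cuda, init_vulkan):
    """Try CUDA when requested; fall back to Vulkan if it fails to start.

    Returns a (backend_name, engine) tuple so the caller can report
    which backend is active.
    """
    if use_cuda:
        try:
            return "cuda", init_cuda()
        except RuntimeError as err:
            # Assumed failure signal; report it and continue with Vulkan.
            print(f"CUDA unavailable ({err}); falling back to Vulkan")
    return "vulkan", init_vulkan()
```

Returning the backend name alongside the engine keeps the choice visible to the user interface, which matches the notification described above.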
Tips and Limitations
The AI assistant is a small local model, not a large cloud-hosted LLM. Keep these practical tips in mind:
- Be specific — Clear, direct prompts produce better results. Instead of "make something cool," try "create a rule that triggers a VRChat chatbox message saying 'Thank you!' when someone follows."
- One task at a time — The assistant handles single tasks best. Break complex requests into smaller steps.
- Check the output — Always review rules and configurations created by the AI before using them in a live stream. The assistant can make mistakes, especially with complex condition logic.
- Context window — The model has a limited context window. Very long conversations may lose earlier context. Start a new conversation if the assistant seems to forget earlier instructions.
- No internet access — The AI runs entirely locally. It cannot fetch live data, check APIs, or access external documentation.
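The context-window tip can be illustrated with a sketch of how a chat history might be trimmed to fit a fixed token budget, which is why the oldest messages are the first to be lost. This is a generic technique shown under assumptions: the `trim_history` function is hypothetical, and counting whitespace-separated words is only a crude stand-in for a real tokenizer.

```python
def trim_history(messages, max_tokens,
                 count_tokens=lambda text: len(text.split())):
    """Keep the most recent messages whose combined size fits the budget.

    `messages` is a list of dicts with a "content" key, oldest first.
    `count_tokens` defaults to a word count, a rough approximation.
    """
    kept = []
    used = 0
    # Walk from newest to oldest, stopping at the first message that
    # would exceed the budget.
    for message in reversed(messages):
        cost = count_tokens(message["content"])
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore oldest-first order
    return kept
```

With a budget of three words, a history of "a b c", "d e", "f" keeps only the last two messages, so an instruction given in the first message would no longer be visible to the model.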
VMSC updates may include improved or additional models. When a better model becomes available, you can download it from the AI page without reinstalling the application.