Wondermotor AI

Open tools for running local AI

tq
Available

Manager for TurboQuant+ by TheTom — run local LLMs with auto-configured KV cache compression

Install — macOS (Apple Silicon) & Windows (CUDA)
$ curl -fsSL https://wondermotor.com/install.sh | bash

3.8x KV Compression

turbo4 compresses the KV cache to 4.25 bits, fitting 8x more context in the same RAM.
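As a rough sanity check on those numbers, here is a back-of-the-envelope KV-cache sizing calculation. The model dimensions below are hypothetical, chosen only to illustrate the f16-vs-4.25-bit ratio:

```python
# Illustrative KV-cache sizing. Layer/head counts are invented for a
# "3B-class" model and are NOT read from any real model file.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bits_per_elem):
    """Bytes needed for the K and V caches across all layers."""
    elems = 2 * n_layers * n_kv_heads * head_dim * n_tokens  # 2 = K + V
    return elems * bits_per_elem / 8

ctx = 4096
f16 = kv_cache_bytes(28, 8, 128, ctx, 16)    # full-precision cache
tq = kv_cache_bytes(28, 8, 128, ctx, 4.25)   # 4.25-bit compressed cache
print(f"f16: {f16/2**20:.0f} MiB, compressed: {tq/2**20:.0f} MiB, "
      f"ratio: {f16/tq:.2f}x")
# → f16: 448 MiB, compressed: 119 MiB, ratio: 3.76x
```

The 16-bit-to-4.25-bit ratio is where the roughly 3.8× compression figure comes from; the freed memory is what lets the same machine hold a much longer context.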

Auto-Config

Detects GPU, RAM, quant type. Calculates optimal cache settings automatically.
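tq's actual detection heuristics aren't published here, but the core calculation has a simple shape: given free memory, the weight size, and the per-token KV-cache cost, pick the largest context that fits. A minimal sketch, with all numbers invented for illustration:

```python
# Sketch of the auto-config idea, not tq's real algorithm. All byte
# figures below are hypothetical examples.
def max_context(free_bytes, model_bytes, per_token_kv_bytes):
    """Largest context length whose KV cache fits alongside the weights."""
    if per_token_kv_bytes <= 0:
        raise ValueError("per-token cost must be positive")
    return max(0, int((free_bytes - model_bytes) // per_token_kv_bytes))

free = 8 * 2**30          # e.g. 8 GiB available
weights = 2 * 2**30       # e.g. a quantized 3B model
per_tok_f16 = 114_688     # hypothetical f16 KV bytes per token
per_tok_tq = 30_464       # same cache at 4.25 bits

print(max_context(free, weights, per_tok_f16))  # f16 ceiling
print(max_context(free, weights, per_tok_tq))   # compressed ceiling
```

A compressed cache raises the context ceiling by the same factor it shrinks the per-token cost, which is the effect the feature grid below summarizes.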

OpenAI-Compatible

/v1/chat/completions. Works with any client — curl, Pi, LM Studio.
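Any OpenAI-style client can talk to the server. In the sketch below, the host and port are assumptions (check tq status for the real address); only the /v1/chat/completions path comes from the text above:

```python
import json
import urllib.request

# Hypothetical address — tq's actual default host/port is not stated here.
BASE_URL = "http://localhost:8080"

payload = {
    "model": "local",  # served model name; some local servers ignore this
    "messages": [
        {"role": "user", "content": "Say hello in one sentence."}
    ],
    "max_tokens": 64,
}

req = urllib.request.Request(
    f"{BASE_URL}/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment once a server is running (e.g. after `tq serve 1`):
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```

Because the endpoint follows the OpenAI schema, swapping a cloud backend for the local server is usually just a base-URL change in the client.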

Idle Auto-Stop

Stops after 5 min idle. Like Ollama — no wasted resources.
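One common way to implement this pattern (not necessarily tq's actual mechanism) is a monotonic idle timer that every request resets:

```python
import threading
import time

IDLE_LIMIT_S = 5 * 60  # matches the documented 5-minute idle window

class IdleStopper:
    """Fire a stop callback once no activity is seen for `limit` seconds.
    A sketch of the idle-auto-stop pattern, not tq's implementation."""

    def __init__(self, stop_fn, limit=IDLE_LIMIT_S):
        self._stop_fn = stop_fn
        self._limit = limit
        self._last = time.monotonic()
        self._lock = threading.Lock()

    def touch(self):
        """Call on every incoming request to reset the idle clock."""
        with self._lock:
            self._last = time.monotonic()

    def idle_for(self):
        with self._lock:
            return time.monotonic() - self._last

    def poll(self):
        """Call periodically; stops the server once the limit is exceeded."""
        if self.idle_for() >= self._limit:
            self._stop_fn()
```

Using a monotonic clock keeps the timer correct even if the system wall clock is adjusted while the server is running.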

Without vs With tq
|                    | Without         | With tq         |
|--------------------|-----------------|-----------------|
| 8GB RAM, 3B model  | 4,096 context   | 32,768 context  |
| KV cache memory    | f16 (full size) | 3.8× compressed |
| Config             | Manual flags    | Auto-detected   |
| Idle behavior      | Always running  | Auto-stops      |
Quick Start
  1. Install: curl -fsSL https://wondermotor.com/install.sh | bash
  2. List models: tq list
  3. Search & download: tq search "qwen2.5 coder 7b" then tq download <model>
  4. Serve: tq serve 1 — TurboQuant auto-configured.
Commands
| Command              | Description                         |
|----------------------|-------------------------------------|
| tq list              | List local GGUF models              |
| tq search <query>    | Search Hugging Face                 |
| tq download <model>  | Download with SHA256 verification   |
| tq serve 1           | Launch with auto TurboQuant config  |
| tq serve 1 --dry-run | Preview the launch command          |
| tq status            | Check the running server            |
| tq stop              | Stop the server                     |
| tq logs              | View server logs                    |
| tq install           | Install the TurboQuant+ binary      |
| tq doctor            | Verify the setup                    |
| tq config show       | Show or edit the config             |

More tools coming soon

Wondermotor AI is building more open tools for local AI. Stay tuned.