## Unified (ask-llm-mcp)
All providers in one MCP server. Auto-detects which CLIs are installed and registers only the available tools. One install, all providers.
### Installation
Run in your terminal:
```bash
# Project scope (current project only)
claude mcp add ask-llm -- npx -y ask-llm-mcp

# User scope (all projects)
claude mcp add --scope user ask-llm -- npx -y ask-llm-mcp
```

Or install as a plugin (adds slash commands like `/multi-review`, `/brainstorm`, and `/compare`, plus reviewer subagents and a pre-commit hook):
```
/plugin marketplace add Lykhoyda/ask-llm
/plugin install ask-llm@ask-llm-plugins
```

Or install globally:

```bash
npm install -g ask-llm-mcp
```

### Prerequisites
- Node.js v20.0.0 or higher
- At least one provider installed and authenticated:
  - Gemini CLI for `ask-gemini` tools
  - Codex CLI for `ask-codex` tools
  - Ollama running locally for `ask-ollama` tools
### How It Works
On startup, the unified server:
- Checks for CLI availability via `which` (Gemini, Codex)
- Checks for HTTP availability via health endpoints (Ollama)
- Dynamically imports and registers tools from available providers
- Exposes only the tools for providers that are actually installed
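A minimal sketch of that detection step, assuming Node's built-in `child_process` and `fetch` plus Ollama's default local port; the function and binary names here are illustrative, not the package's actual internals:

```typescript
// Illustrative startup detection flow (names are assumptions, not real internals).
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// CLI providers are probed with `which`; a non-zero exit means "not installed".
async function hasCli(binary: string): Promise<boolean> {
  try {
    await run("which", [binary]);
    return true;
  } catch {
    return false;
  }
}

// Ollama is probed over HTTP instead, since it runs as a local server.
async function hasOllama(baseUrl = "http://localhost:11434"): Promise<boolean> {
  try {
    const res = await fetch(baseUrl, { signal: AbortSignal.timeout(1000) });
    return res.ok;
  } catch {
    return false;
  }
}

// Only providers that pass their check get their tools registered.
async function detectProviders() {
  const [gemini, codex, ollama] = await Promise.all([
    hasCli("gemini"),
    hasCli("codex"),
    hasOllama(),
  ]);
  return { gemini, codex, ollama };
}
```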
### Tools
All tools from installed providers are registered. If you have all three:
| Tool | Purpose |
|---|---|
| `ask-llm` | Single unified tool; picks the provider via the `provider` parameter (`gemini`, `codex`, `ollama`). Optional `sessionId` for multi-turn continuation |
| `multi-llm` | Dispatches the same prompt to multiple providers in parallel; returns per-provider responses plus usage in one call |
| `get-usage-stats` | Per-session token totals and breakdowns by provider/model (in-memory, no persistence) |
| `diagnose` | Self-diagnosis: Node version, PATH, provider CLI presence and versions. Read-only |
| `ping` | Connection test |
The orchestrator uses a single `ask-llm` tool (not one per provider) for token efficiency; see ADR-029. All `ask-*` tools return both human-readable text and a structured `AskResponse` (provider, response, model, sessionId, usage) via MCP `outputSchema`.
It also exposes `usage://current-session` as an MCP Resource for live JSON snapshots of token spend.
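For example, a client built on the MCP TypeScript SDK could call the unified tool roughly like this (a sketch only; argument names such as `prompt` are assumptions inferred from the table above, not the published schema):

```typescript
// Hypothetical MCP client calling the unified ask-llm tool over stdio.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const client = new Client(
  { name: "example-client", version: "1.0.0" },
  { capabilities: {} }
);
await client.connect(
  new StdioClientTransport({ command: "npx", args: ["-y", "ask-llm-mcp"] })
);

// Route a prompt to a specific provider; pass sessionId to continue a conversation.
const result = await client.callTool({
  name: "ask-llm",
  arguments: {
    provider: "gemini",
    prompt: "Summarize this repository's architecture.", // "prompt" is assumed
  },
});
console.log(result);

// Read the live usage snapshot exposed as an MCP Resource.
const usage = await client.readResource({ uri: "usage://current-session" });
console.log(usage);
```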
### CLI Subcommands
The `ask-llm-mcp` binary supports two CLI modes alongside the default MCP server:

```bash
npx ask-llm-mcp repl    # interactive multi-provider REPL with sessions, usage tracking, slash commands
npx ask-llm-mcp doctor  # diagnose Node version, PATH, provider CLIs, env vars (--json for machine output)
```

### Key Features
- Single server for all providers
- Auto-detection of installed CLIs
- Single unified `ask-llm` tool for token efficiency
- Multi-provider parallel dispatch via `multi-llm` (`Promise.all` internally; per-provider failure isolation; sketched after this list)
- Session continuity across all 3 providers: Gemini (`--resume`), Codex (`exec resume`), Ollama (server-side replay)
- Graceful degradation if a provider is unavailable
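A rough sketch of that parallel dispatch with failure isolation, based only on the description above; `ask` is a placeholder for whatever per-provider call the server actually makes:

```typescript
// Each provider call catches its own errors, so one failing provider
// does not reject the whole Promise.all batch.
type ProviderResult =
  | { provider: string; ok: true; response: string }
  | { provider: string; ok: false; error: string };

async function multiLlm(
  prompt: string,
  providers: string[],
  ask: (provider: string, prompt: string) => Promise<string>
): Promise<ProviderResult[]> {
  return Promise.all(
    providers.map(async (provider): Promise<ProviderResult> => {
      try {
        return { provider, ok: true, response: await ask(provider, prompt) };
      } catch (err) {
        // Failure isolation: report the error for this provider only.
        return { provider, ok: false, error: String(err) };
      }
    })
  );
}
```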
### npm

- Package: `ask-llm-mcp`
- Binary: `ask-llm-mcp`