Getting Started
Three steps: install Node, install at least one provider, register the MCP server with your client. You can start with one provider (Gemini, Codex, or Ollama) and add the others anytime.
Step 1: Install Prerequisites
- Node.js v20.0.0 or higher (LTS 20 or 22).
- At least one provider — pick whichever fits your use case:
Which provider should I install first?
- Codex: strong code reasoning (GPT-5.5). The default workhorse for targeted reviews and architecture critique.
- Antigravity: subscription-backed via Google AI Pro/Ultra (agy). The Gemini CLI successor; good for a second opinion and larger-context reads.
- Ollama: local, private, zero cost. Best when data can't leave your machine.
- Gemini: huge 1M+ token context, but enterprise-gated from 2026-06-18.
# Codex (requires OpenAI account)
npm install -g @openai/codex
# follow the codex CLI's auth instructions
# Antigravity (requires Google AI Pro/Ultra)
# install agy from https://antigravity.google, then log in once
# Ollama
# install from https://ollama.com, then:
ollama pull qwen3.6:27b
# Gemini (enterprise seats only from 2026-06-18)
npm install -g @google/gemini-cli && gemini login
You can install one provider or several. The MCP server auto-detects which providers are available and only registers tools for the ones it finds.
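Before moving on to Step 2, a quick shell check can confirm the prerequisites. The sketch below is not part of ask-llm-mcp: node_version_ok is a hypothetical helper, and the binary names (codex, agy, ollama, gemini) are assumed from the install commands above. It only tests whether each CLI is on your PATH, which may differ from how the server's own auto-detection works.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a version string such as "v20.11.1"
# has a major version of 20 or higher.
node_version_ok() {
  ver="${1#v}"        # strip the leading "v"
  major="${ver%%.*}"  # keep everything before the first dot
  [ "$major" -ge 20 ] 2>/dev/null
}

# Report the Node.js status.
if command -v node >/dev/null 2>&1; then
  if node_version_ok "$(node --version)"; then
    echo "node: OK ($(node --version))"
  else
    echo "node: too old ($(node --version)), need v20.0.0 or higher"
  fi
else
  echo "node: not found on PATH"
fi

# Report which provider CLIs are on PATH (binary names are assumptions).
for cli in codex agy ollama gemini; do
  if command -v "$cli" >/dev/null 2>&1; then
    echo "$cli: found"
  else
    echo "$cli: not found"
  fi
done
```

At least one provider line should read "found" before you register the MCP server.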
Step 2: Configure Your MCP Client
The recommended package is ask-llm-mcp — the unified orchestrator that auto-detects all installed providers and exposes them through a single ask-llm MCP tool plus multi-llm, get-usage-stats, diagnose, and ping.
If you only want one provider, you can also install the per-provider packages directly: ask-gemini-mcp, ask-codex-mcp, ask-ollama-mcp. They expose provider-specific tools (ask-gemini with @ file syntax + sandbox + edit mode, ask-codex, ask-ollama).
Option A: Claude Code (Recommended)
# Unified — picks up all installed providers
claude mcp add --scope user ask-llm -- npx -y ask-llm-mcp
# Or per-provider (longer tool names, more granular control)
claude mcp add --scope user gemini -- npx -y ask-gemini-mcp
claude mcp add --scope user codex -- npx -y ask-codex-mcp
claude mcp add --scope user ollama -- npx -y ask-ollama-mcp
Option B: Claude Desktop
Add to claude_desktop_config.json:
Where is my config file located?
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/claude/claude_desktop_config.json
{
"mcpServers": {
"ask-llm": {
"command": "npx",
"args": ["-y", "ask-llm-mcp"]
}
}
}
WARNING
You must restart Claude Desktop completely for changes to take effect.
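To find the config file from a terminal, the per-OS locations listed above can be wrapped in a small lookup. This is a sketch only: claude_desktop_config_path is a hypothetical helper, and the fallback branch assumes a Unix-style shell on Windows (such as Git Bash) where APPDATA is set.

```shell
# Hypothetical helper: prints the expected Claude Desktop config path.
# $1 (optional) = OS name as reported by `uname -s`; defaults to the current OS.
claude_desktop_config_path() {
  os="${1:-$(uname -s)}"
  case "$os" in
    Darwin) echo "$HOME/Library/Application Support/Claude/claude_desktop_config.json" ;;
    Linux)  echo "$HOME/.config/claude/claude_desktop_config.json" ;;
    *)      echo "${APPDATA:-}/Claude/claude_desktop_config.json" ;;  # assumed Windows shell
  esac
}

# Show the path and whether the file exists yet.
cfg="$(claude_desktop_config_path)"
if [ -f "$cfg" ]; then
  echo "found: $cfg"
else
  echo "missing: $cfg"
fi
```

If the file is reported missing, create it with the JSON shown above and then restart Claude Desktop.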
Option C: Cursor / Warp / Copilot / generic STDIO
Ask LLM works with 40+ MCP-compatible clients. Standard STDIO config:
{
"command": "npx",
"args": ["-y", "ask-llm-mcp"]
}
For Cursor specifically, this goes in .cursor/mcp.json. For Warp/Copilot, see your client's MCP integration docs.
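In Cursor's file, the STDIO entry needs to sit under a named server key. The sketch below writes a project-level .cursor/mcp.json, assuming Cursor uses the same mcpServers wrapper as the Claude Desktop example; check Cursor's MCP docs if the server does not show up. The function name write_cursor_config is hypothetical.

```shell
# Hypothetical helper: writes .cursor/mcp.json under the given project directory.
# $1 (optional) = project directory; defaults to the current directory.
# Note: this overwrites an existing file, so merge by hand if you already have one.
write_cursor_config() {
  project_dir="${1:-.}"
  mkdir -p "$project_dir/.cursor"
  cat > "$project_dir/.cursor/mcp.json" <<'EOF'
{
  "mcpServers": {
    "ask-llm": {
      "command": "npx",
      "args": ["-y", "ask-llm-mcp"]
    }
  }
}
EOF
}
```

Call it as write_cursor_config /path/to/project, or with no argument from the project root.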
Step 3: Verify Your Setup
Two ways to verify, depending on whether the MCP server is running:
From inside any MCP client, ask the assistant to call ping:
Use ask-llm ping to test the connection
From the terminal directly, run the doctor:
npx ask-llm-mcp doctor
The doctor checks Node version, PATH resolution, every provider CLI's presence and version, and key env vars. Use it when MCP itself can't start (server not registered, broken auth, wrong Node version); it works outside the MCP transport.
If everything looks good, head to First Steps to send your first prompt, or How to Ask for usage patterns. Need every install method and per-client config? See Installation.
Optional: Interactive REPL
The orchestrator binary also exposes a multi-provider REPL — switch providers, persist sessions, see token usage live:
npx ask-llm-mcp repl
Slash commands include /provider <name>, /new (fresh session), /sessions, /usage, /help, /quit. Useful for quick sanity checks and side-by-side provider comparison without setting up an MCP client.
Advanced Configuration (Environment Variables)
You can configure the server with env vars in your MCP client's configuration block.
| Variable | Default | Description |
|---|---|---|
| GMCPT_LOG_LEVEL | warn | Minimum log level: debug, info, warn, error. Bump to debug if troubleshooting. |
| GMCPT_TIMEOUT_MS | (none) | Global wall-clock timeout override for subprocess-spawned providers. When set, lifts both per-provider defaults below. Kept for backward compatibility; prefer the per-provider knobs for finer control. |
| ASK_CODEX_TIMEOUT_MS | 800000 | Codex-specific timeout (13.3 min). Codex with reasoning models (gpt-5.5 family) runs multi-turn tool-use loops where each turn includes reasoning, so substantive prompts routinely take 5–10 min. Default raised in ADR-074 (closes #45). |
| ASK_GEMINI_TIMEOUT_MS | 210000 | Gemini-specific timeout (3.5 min). Gemini's stream-json mode emits tokens incrementally, so the existing default is usually adequate. Provided for symmetry with ASK_CODEX_TIMEOUT_MS. |
| OLLAMA_HOST | http://localhost:11434 | Ollama server URL. Override if running Ollama elsewhere. |
| ASK_LLM_PATH | (auto) | Override the resolved PATH used to find provider CLIs. Auto-resolved from your login shell on macOS GUI clients (ADR-047); only set explicitly if your shell setup is unusual. |
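To set any of these, add an env object to the server entry in your client's configuration. The sketch below prints such an entry in the Claude Desktop format from Step 2. print_config_with_env is a hypothetical helper, and the values used (debug logging, a 20-minute Codex timeout) are only examples. The timeout variables take milliseconds, so the helper converts from minutes.

```shell
# Hypothetical helper: prints a server entry with an "env" block.
# $1 = Codex timeout in minutes, $2 = log level (debug, info, warn, error)
print_config_with_env() {
  timeout_ms=$(( $1 * 60 * 1000 ))  # minutes -> milliseconds
  cat <<EOF
{
  "mcpServers": {
    "ask-llm": {
      "command": "npx",
      "args": ["-y", "ask-llm-mcp"],
      "env": {
        "GMCPT_LOG_LEVEL": "$2",
        "ASK_CODEX_TIMEOUT_MS": "$timeout_ms"
      }
    }
  }
}
EOF
}

# Example: 20-minute Codex timeout with debug logging.
print_config_with_env 20 debug
```

Paste the printed JSON into your client's config file, merging it with any servers already listed there, and restart the client so the new variables are picked up.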