Frequently Asked Questions

General

What is Ask LLM?

A set of MCP servers that bridge your AI client (Claude Code, Claude Desktop, Cursor, Warp, Copilot — any of 40+ MCP-compatible clients) with up to three LLM providers running locally on your machine: Google Gemini, OpenAI Codex, and Ollama. Plus a Claude Code plugin layer with slash commands, reviewer subagents, and a pre-commit hook.

Why use this instead of the providers directly?

One interface for all three providers — you don't switch between three CLIs
Multi-provider parallel dispatch — multi-llm and /compare send the same prompt to multiple providers in one call
Verified code review — /multi-review cross-checks each finding against source before presenting (catches false positives)
Session continuity across providers — sessionId works the same way for all three
Built-in operational hardening — quota fallback, PATH resolution, stdin handling, stream-json output for live progressive content
Diagnostic surface — npx ask-llm-mcp doctor and the diagnose MCP tool tell you what's wrong before you have to investigate

Is it free?

The MCP servers are MIT-licensed and free. Provider costs depend on which provider you use:

Gemini — free tier via OAuth (gemini login), paid via API key
Codex — per OpenAI billing
Ollama — free (runs locally on your machine)

Does it work on Windows?

Yes for the per-provider packages. The orchestrator (ask-llm-mcp) and plugin should work too but get less Windows testing. Open an issue if you hit a Windows-specific bug.

Setup

Do I need all three providers?

No. The orchestrator (ask-llm-mcp) auto-detects which CLIs are installed and only registers tools for available providers. Install one and add others anytime.

What Node.js version do I need?

Node v20.0.0 or higher (LTS 20 or 22). The doctor command (npx ask-llm-mcp doctor) will warn if you're on an older version — Gemini CLI 0.36+ uses ES2024 regex features that crash on Node 18 (ADR-046).

Should I use the orchestrator or per-provider packages?

Choose orchestrator (`ask-llm-mcp`) when	Choose per-provider when
You want one MCP server, all providers	You only use one provider
You want `multi-llm` parallel dispatch	You need provider-specific tools (`ask-gemini-edit`, `fetch-chunk`)
You want the doctor + REPL CLI subcommands	You want shorter tool names without a `provider` parameter

Both can coexist — install whatever fits your workflow.

Can I use this with Claude Code?

Yes — that's the primary client. Use claude mcp add --scope user ask-llm -- npx -y ask-llm-mcp. The Claude Code plugin (/plugin marketplace add Lykhoyda/ask-llm) adds slash commands, subagents, and hooks on top.

Can I use this with Claude Desktop, Cursor, Warp, etc.?

Yes — any STDIO MCP client works. See Installation for client-specific config.

Usage

What's the `@` syntax?

A Gemini CLI feature for including files in prompts:

@file.js — single file
@src/*.js — multiple files
@**/*.ts — all TypeScript files
@. — current directory

It works with the ask-gemini tool (and the underlying Gemini CLI). Codex and Ollama don't have direct equivalents — quote or paste relevant code into the prompt instead.

How do multi-turn sessions work?

Every ask-* tool returns a session ID. Pass it back via the sessionId parameter to continue. Gemini and Codex use native CLI resume; Ollama uses server-side message replay. See Multi-Turn Sessions for details.

How do I send a prompt to multiple providers at once?

Use the multi-llm MCP tool:

text

Use multi-llm to send "is this thread-safe?" to Gemini and Codex

Or the /compare skill (Claude Code plugin) for the same thing with a nicer interactive surface. Or /multi-review if you want verified code review findings rather than raw responses.

Which model should I use?

Defaults are tuned per provider with auto-fallback. Don't override unless you have a specific reason. See Model Selection for the full breakdown.

Can I track my token spending?

Yes — every ask-* response includes result.structuredContent.usage with token counts, duration, and fallback flag. For per-session totals, call the get-usage-stats MCP tool, read the usage://current-session MCP Resource, or type /usage in the REPL.

Can I use this in CI/CD?

The MCP server is designed for interactive development. For CI workflows, you can call the executor packages directly via Node:

import { executeGeminiCLI } from "ask-gemini-mcp/executor";
const result = await executeGeminiCLI({ prompt: "review this diff" });

A dedicated GitHub Action was built and then withdrawn (ADR-061) — focus shifted to MCP/plugin usage for CLI workflows. CI integration may return as a future feature.

Privacy & Security

Is my code sent to Google / OpenAI?

Only when you explicitly use ask-gemini (Google) or ask-codex (OpenAI). For private code, use ask-ollama — it runs entirely locally and never makes external network calls.

Where do session files live?

Gemini sessions — managed by the Gemini CLI in its own storage (~/.gemini/)
Codex sessions — managed by Codex CLI in ~/.codex/
Ollama sessions — server-side replay store at /tmp/ask-llm-sessions/<id>.json, owner-only permissions (0o600 file / 0o700 dir), 24-hour TTL, atomic temp+rename writes (ADR-063)

The Ollama session permissions specifically prevent other users on shared systems from reading your prompts.

Are my API keys exposed?

Auth is delegated entirely to the provider CLIs (gemini login, Codex auth, Ollama doesn't need keys). The MCP server never reads, stores, or transmits your credentials — it just spawns the CLIs and the CLIs handle auth themselves.

Troubleshooting

Why is it slow?

Large-context Gemini calls are inherently slow (the model is reading a lot). The Gemini executor uses --output-format stream-json so you see progressive output rather than a frozen wait — incoming text is forwarded to MCP progress notifications (ADR-057). Codex at high reasoning effort can take several minutes — this is normal for the model, not a server issue.

Where do I get logs?

Set GMCPT_LOG_LEVEL=debug in your MCP client's env config to see verbose server logs. They go to stderr — Claude Desktop captures them at ~/Library/Logs/Claude/mcp-server-*.log (macOS).

My setup isn't working — what do I check first?

Run the doctor:

bash

npx ask-llm-mcp doctor

It checks Node version, PATH, every provider CLI's presence and version, and key env vars. It works even when MCP itself can't start. 90%+ of setup issues show up here as a clear failed-check line.

If the doctor says everything's fine but tool calls still fail, check the Troubleshooting page for specific error patterns.

Frequently Asked Questions ​

General ​

What is Ask LLM? ​

Why use this instead of the providers directly? ​

Is it free? ​

Does it work on Windows? ​

Setup ​

Do I need all three providers? ​

What Node.js version do I need? ​

Should I use the orchestrator or per-provider packages? ​

Can I use this with Claude Code? ​

Can I use this with Claude Desktop, Cursor, Warp, etc.? ​

Usage ​

What's the @ syntax? ​

How do multi-turn sessions work? ​

How do I send a prompt to multiple providers at once? ​

Which model should I use? ​

Can I track my token spending? ​

Can I use this in CI/CD? ​

Privacy & Security ​

Is my code sent to Google / OpenAI? ​

Where do session files live? ​

Are my API keys exposed? ​

Troubleshooting ​

Why is it slow? ​

Where do I get logs? ​

My setup isn't working — what do I check first? ​

More Questions? ​