github.com

antonbabenko/deliberation

antonbabenko/deliberation is an MIT-licensed multi-model arbiter MCP server that RepoRadar is tracking, currently rated Silver tier with a 'try now' verdict. Its strongest signal is novelty, scored 9.0 out of 10.

Score: 7.8
Popularity: 106.0
Risk: none
Tier: Silver

Score breakdown

Usefulness: 8.0
Novelty: 9.0
Momentum: 8.0
Maturity: 7.9
Open-source/build: 8.4
Evidence: 8.0
Workflow potential: 8.9
Setup ease: 8.8

Popularity is tracked separately. Support, ads, sponsorships, and tips never affect these signals.

Why it matters

Useful for **AI coding agents whose calling model is not enough for high-stakes decisions**: Deliberation lets Claude Code delegate to GPT, Gemini, Grok, or 400+ OpenRouter models with seven domain experts that debate until they agree, so a refactor, security review, or architectural decision gets second and third opinions from different training lineages. Useful for **code-review workflows**: `/consensus` returns the top findings from N models for the same prompt, with a disagreement matrix showing where they diverge.
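
To make the "second and third opinions" idea concrete, here is a minimal sketch of the kind of disagreement matrix described for `/consensus` under "Who should use it". It is not the plugin's actual output format: the function names, the model labels, and the findings are all hypothetical, and the sketch only assumes that each model returns a set of named findings.

```python
def disagreement_matrix(findings_by_model):
    """Map each finding to {model: True/False} for whether that model raised it."""
    models = list(findings_by_model)
    all_findings = sorted({f for fs in findings_by_model.values() for f in fs})
    return {
        finding: {m: finding in findings_by_model[m] for m in models}
        for finding in all_findings
    }


def contested(matrix):
    """Return findings that some models raised and others did not."""
    return [f for f, row in matrix.items() if len(set(row.values())) > 1]


# Hypothetical findings from three placeholder models (not real tool output).
findings = {
    "model_a": {"sql-injection", "missing-auth"},
    "model_b": {"sql-injection"},
    "model_c": {"sql-injection", "race-condition"},
}

matrix = disagreement_matrix(findings)
disputed = contested(matrix)
print(disputed)
```

Findings that every model raised drop out of `contested`, which leaves only the points where a reviewer has to adjudicate.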

Who should use it

- **AI coding agents whose calling model is not enough for high-stakes decisions**: Deliberation lets Claude Code delegate to GPT, Gemini, Grok, or 400+ OpenRouter models with seven domain experts that debate until they agree, so a refactor, security review, or architectural decision gets second and third opinions from different training lineages.
- **Code-review workflows**: `/consensus` returns the top findings from N models for the same prompt, with a disagreement matrix showing where they diverge, so a reviewer can see 'GPT thinks X, Gemini thinks Y, Grok thinks Z' in one call.
- **Architect-debate workflows**: `/ask-all` runs a 2-round debate where three models return their top findings, then each model critiques the others' picks, so a real disagreement surfaces and the conclusion is grounded in the critique.
- **Security reviews**: the Security Analyst expert can be invoked alone or as part of a consensus round, and the model can implement the fix (write mode) or just analyze (read-only mode).
- **AI-tool teams that want multi-model consensus without writing glue code**: the plugin handles the wiring for each provider, the OpenRouter config hot-reloads, and the disagreement matrix is the default output format.
- **Researchers studying model disagreement**: the synthesized responses, disagreement matrices, and round-by-round transcripts are exactly the kind of artifact a model-disagreement study needs.
- **Cursor / VS Code / Kiro / OpenCode users**: one-click install buttons for each IDE, plus the standalone npm MCP server works in any MCP client.

Evaluation: `/plugin marketplace add antonbabenko/agent-plugins` (Claude Code) + `/plugin install deliberation@antonbabenko` + `/deliberation:setup` to register the MCP servers; or `npx -y @antonbabenko/deliberation-mcp` for any MCP client. The README walks through the seven experts, the three modes, the per-provider API keys, and the OpenRouter config hot-reload.
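
For clients other than Claude Code, the `npx -y @antonbabenko/deliberation-mcp` command usually has to be registered in the client's MCP configuration. The sketch below generates such an entry, assuming the client uses the common `mcpServers` layout with `command` and `args` fields. The server key `deliberation` and the helper name are illustrative choices, the config file location varies by client, and the per-provider API key variables are left out because this page does not name them (the README is the reference for those).

```python
import json


def build_mcp_config(server_name="deliberation"):
    """Build an MCP client config entry that launches the server via npx.

    Assumes the widely used `mcpServers` layout; check your client's docs.
    """
    return {
        "mcpServers": {
            server_name: {
                "command": "npx",
                # Package name as given on this page.
                "args": ["-y", "@antonbabenko/deliberation-mcp"],
            }
        }
    }


config = build_mcp_config()
print(json.dumps(config, indent=2))
```

The printed JSON can then be merged into whichever configuration file your MCP client reads.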

Who should skip it

Skip antonbabenko/deliberation if the source repository or demo is inactive, unmaintained, or no longer matches the description shown here.

About this signal

antonbabenko/deliberation is tracked by RepoRadar as an MIT-licensed multi-model arbiter MCP server. It was first seen on 2026-06-25 and last updated on 2026-06-25. The current verdict is 'try now' with a Silver tier and easy setup difficulty. Across RepoRadar's eight signals, antonbabenko/deliberation is strongest on novelty (9.0) and workflow potential (8.9) and weakest on maturity (7.9), a profile worth weighing against your own priorities. This page summarizes the evidence RepoRadar has captured from source metadata. The score, tier, risk label, and verdict on this page are never influenced by sponsorship, ads, or tips; they reflect only the signals described in the RepoRadar methodology.

How this item is evaluated

RepoRadar assigned antonbabenko/deliberation a composite score of 7.8 out of 10, placing it in the Silver tier. This score combines weighted sub-signals: usefulness (35%), novelty (18%), momentum (14%), maturity (10%), open-source/build quality (7%), evidence quality (6%), workflow potential (6%), and setup ease (4%). Popularity is tracked separately at 106.0 and never affects the composite score or tier. The risk label of 'none' reflects inherent user-impacting hazards, not generic novelty. Items with no risk flag may still require normal code review before production use.
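
As a worked check on the weighting above, the sketch below computes a plain weighted average of the eight sub-signal scores listed on this page. The function name is illustrative and the formula is an assumption: the page says the score "combines weighted sub-signals" but does not publish the exact calculation. A straight weighted average comes out at about 8.28, higher than the published 7.8, so the published composite presumably includes adjustments that are not described here.

```python
# Sub-signal scores and percentage weights as listed on this page.
SCORES = {
    "usefulness": 8.0,
    "novelty": 9.0,
    "momentum": 8.0,
    "maturity": 7.9,
    "open_source_build": 8.4,
    "evidence": 8.0,
    "workflow_potential": 8.9,
    "setup_ease": 8.8,
}
WEIGHTS = {
    "usefulness": 35,
    "novelty": 18,
    "momentum": 14,
    "maturity": 10,
    "open_source_build": 7,
    "evidence": 6,
    "workflow_potential": 6,
    "setup_ease": 4,
}


def weighted_composite(scores, weights):
    """Plain weighted average (an assumed formula, not RepoRadar's own)."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in weights) / total_weight


composite = weighted_composite(SCORES, WEIGHTS)
print(f"{composite:.2f}")
```

Popularity (106.0) is deliberately absent from both dictionaries, matching the statement that it never affects the composite score or tier.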

Putting this into practice? Read "How to vet an AI agent or MCP server before you wire it in" for the checklist behind this score.

Risk explanation

No inherent user-impacting risk is flagged from the captured evidence.

Closest alternatives / related signals

deliberation, antonbabenko, multi-model, multi-model-consensus, model-arbiter, second-opinion, third-opinion, training-lineage