Score breakdown
Popularity is tracked separately. Support, ads, sponsorships, and tips never affect these signals.
Why it matters
Useful for **AI safety + cyber-security researchers measuring AI agents' offensive capabilities** — ExploitGym is the largest public benchmark (869 instances, 3 vulnerability surfaces) for evaluating AI agents' ability to develop exploits against real-world vulnerabilities, with a verifiable arXiv paper and a who's-who author list. Useful for **defensive security teams wanting to measure their AI
Who should use it
Who should skip it
Pass on sunblaze-ucb/exploitgym if you need something non-technical and turnkey rather than a tool that requires comfort with CLI, dependencies, or system configuration.
About this signal
sunblaze-ucb/exploitgym is an Apache-2.0-licensed project tracked by RepoRadar. It was first seen on 2026-06-26 and last updated on 2026-06-26. The current verdict is 'try now' with a Gold tier and hard setup difficulty. Across RepoRadar's eight signals, sunblaze-ucb/exploitgym is strongest on workflow potential (9.3) and novelty (9.0) and weakest on setup ease (4.2), a profile worth weighing against your own priorities. This page summarizes the evidence RepoRadar has captured from source metadata. The score, tier, risk label, and verdict on this page are never influenced by sponsorship, ads, or tips; they reflect only the usefulness, popularity, novelty, momentum, maturity, and evidence signals described in the RepoRadar methodology.
How this item is evaluated
RepoRadar assigned sunblaze-ucb/exploitgym a composite score of 8.2 out of 10, placing it in the Gold tier. This score combines weighted sub-signals: usefulness (35%), novelty (18%), momentum (14%), maturity (10%), open-source/build quality (7%), evidence quality (6%), workflow potential (6%), and setup ease (4%). Popularity is tracked separately at 47.0 and never affects the composite score or tier. The risk label of 'conditional' reflects inherent user-impacting hazards, not generic novelty. Items with no risk flag may still require normal code review before production use.
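As a rough illustration of how such a weighting could work, the sketch below assumes the composite is a plain weighted average of sub-signals on a 0 to 10 scale. That formula is an assumption, not something the page states. The weights come from the paragraph above; the key names and the helper functions `weighted_contribution` and `composite` are hypothetical. Only the three sub-scores published on this page (novelty, workflow potential, setup ease) are used as inputs.

```python
# Weights as integer percentages, taken from the breakdown above.
# Key names are illustrative; RepoRadar's internal names are not published here.
WEIGHTS = {
    "usefulness": 35,
    "novelty": 18,
    "momentum": 14,
    "maturity": 10,
    "build_quality": 7,
    "evidence_quality": 6,
    "workflow_potential": 6,
    "setup_ease": 4,
}


def weighted_contribution(scores):
    """Return sum(weight * score) / 100 over the signals present in `scores`.

    ASSUMPTION: the composite is a linear weighted average of 0-10 sub-scores.
    """
    unknown = set(scores) - set(WEIGHTS)
    if unknown:
        raise KeyError(f"unknown signals: {sorted(unknown)}")
    return sum(WEIGHTS[name] * value for name, value in scores.items()) / 100


def composite(scores):
    """Weighted average over all eight signals; rejects incomplete input."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    return weighted_contribution(scores)


# The only sub-scores this page publishes for sunblaze-ucb/exploitgym.
published = {"novelty": 9.0, "workflow_potential": 9.3, "setup_ease": 4.2}
partial = weighted_contribution(published)
print(f"{partial:.3f}")  # prints 2.346
```

Under this assumed formula, the three published sub-scores account for roughly 2.35 of the 8.2 composite points. The remaining five signals carry 72% of the weight, so they would need to average about 8.1 to reach the reported score. The low setup-ease score barely moves the total because it carries only a 4% weight.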
Putting this into practice? Read "How to evaluate an AI tool before you adopt it" for the checklist behind this score.
Risk explanation
ExploitGym operates in an offensive-research context: it measures AI exploit-development capability against real-world CVE-class vulnerabilities. The harness enforces outbound network isolation (Squid firewall) and runs each task in its own Docker container, and the README's docs/defenses.md explicitly documents the system-defense-disabling steps (ASLR.
