Score breakdown
Popularity is tracked separately. Support, ads, sponsorships, and tips never affect these signals.
Why it matters
Useful for researchers who want a benchmark-backed memory architecture to compare against their own long-horizon agent retrieval and provenance pipelines.
Who should use it
Who should skip it
Consider AutoTrustAI/PaperGuru-Benchmark a lower priority if you already have a working solution in this category.
About this signal
AutoTrustAI/PaperGuru-Benchmark is tracked by RepoRadar as a memory benchmark in the Research and Evaluation section. It was first seen on 2026-06-25 and last updated on 2026-06-25. The current verdict is 'worth watch' with a Silver tier and advanced setup difficulty. The standout signals for AutoTrustAI/PaperGuru-Benchmark are open-source/build quality (8.4) and workflow potential (8.2), while setup ease (4.2) trails; that balance shapes where it fits best. This page summarizes the evidence RepoRadar has captured from source metadata. The score, tier, risk label, and verdict on this page are never influenced by sponsorship, ads, or tips; they reflect only the usefulness, popularity, novelty, momentum, maturity, and evidence signals described in the RepoRadar methodology.
How this item is evaluated
RepoRadar assigned AutoTrustAI/PaperGuru-Benchmark a composite score of 7.8 out of 10, placing it in the Silver tier. This score combines weighted sub-signals: usefulness (35%), novelty (18%), momentum (14%), maturity (10%), open-source/build quality (7%), evidence quality (6%), workflow potential (6%), and setup ease (4%). Popularity is tracked separately at 1.0 and never affects the composite score or tier. The risk label of 'conditional' reflects inherent user-impacting hazards, not generic novelty. Items with no risk flag may still require normal code review before production use.
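To make the weighting concrete, here is a minimal sketch of a weighted composite, assuming the score is a plain weighted sum of the sub-signals listed above. The function name `composite_score`, the dictionary keys, and the example values marked as placeholders are hypothetical; only the weights and the three sub-scores quoted on this page (8.4, 8.2, 4.2) come from RepoRadar, and the methodology may apply rounding or adjustments this sketch does not model.

```python
# Weights as stated on this page; they sum to 100%.
WEIGHTS = {
    "usefulness": 0.35,
    "novelty": 0.18,
    "momentum": 0.14,
    "maturity": 0.10,
    "build_quality": 0.07,       # open-source/build quality
    "evidence_quality": 0.06,
    "workflow_potential": 0.06,
    "setup_ease": 0.04,
}


def composite_score(sub_scores):
    """Return the weighted sum of sub-scores (each on a 0-10 scale).

    Keys not listed in WEIGHTS, such as popularity, are ignored,
    mirroring the statement that popularity never affects the composite.
    """
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    return sum(weight * sub_scores[name] for name, weight in WEIGHTS.items())


if __name__ == "__main__":
    example = {
        # Sub-scores quoted on this page.
        "build_quality": 8.4,
        "workflow_potential": 8.2,
        "setup_ease": 4.2,
        # PLACEHOLDERS: not published on this page, for illustration only.
        "usefulness": 8.0,
        "novelty": 8.0,
        "momentum": 8.0,
        "maturity": 8.0,
        "evidence_quality": 8.0,
        # Tracked separately; has no effect on the result.
        "popularity": 1.0,
    }
    print(round(composite_score(example), 2))
```

Because the placeholder values are invented, the printed number is not expected to match the published 7.8. The sketch mainly shows why a low setup-ease score, weighted at 4%, moves the composite far less than usefulness, weighted at 35%.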
Putting this into practice? Read "How to read AI benchmarks without getting fooled" for the checklist behind this score.
Risk explanation
The bundled reproduction submissions inherit per-paper licensing and redistribution constraints, so teams should map those artifact-level terms before republishing the package. The headline benchmark lifts are maintainer-reported, so treat the benchmark as a strong research baseline and reproduce the numbers before citing them as production evidence.
