Score breakdown
Popularity is tracked separately. Support, ads, sponsorships, and tips never affect these signals.
Why it matters
Useful for research groups and engineering teams who want to pretrain a foundation model from scratch without a frontier-lab compute budget. HRM-Text is Sapient's Apache-2.0 1B text generation model plus full pretraining framework: the L configuration (0.6B) trains on 8xH100 in ~50 hours (~$800) and the XL configuration (1B) trains on 16xH100 in ~46 hours (~$1472), both priced at $2 per H100-hour; for academic research groups who need a reproducible, peer
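The cost figures quoted here follow from GPU count times wall-clock hours times the hourly price. A minimal sketch of that arithmetic, assuming a flat $2 per H100-hour as stated; the function name `run_cost_usd` is illustrative and not part of HRM-Text:

```python
def run_cost_usd(num_gpus: int, hours: float, usd_per_gpu_hour: float = 2.0) -> float:
    """Estimated cost of a training run: GPUs x hours x price per GPU-hour."""
    return num_gpus * hours * usd_per_gpu_hour


# Reference runs quoted on this page (hours are approximate).
l_cost = run_cost_usd(num_gpus=8, hours=50)    # L, 0.6B
xl_cost = run_cost_usd(num_gpus=16, hours=46)  # XL, 1B

print(f"L:  ${l_cost:,.0f}")
print(f"XL: ${xl_cost:,.0f}")
```

Swapping in a different `usd_per_gpu_hour` gives a quick estimate for another provider's pricing, though real runs may also incur storage and restart overhead that this ignores.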
Who should use it
Who should skip it
Hold off on sapientinc/HRM-Text if the setup requirements exceed what your current workflow or team can support without dedicated engineering time.
About this signal
sapientinc/HRM-Text is tracked by RepoRadar as an open 1b hrm-architecture text pr in the Apache-2.0 1B text generation model + full pretr section. It was first seen on 2026-06-25 and last updated on 2026-06-25. The current verdict is 'try now' with a Gold tier and hard setup difficulty. sapientinc/HRM-Text leads on workflow potential (9.3) and novelty (9.0); its lowest signal is setup ease (4.2), so factor that in before investing setup time. This page summarizes the evidence RepoRadar has captured from source metadata. The score, tier, risk label, and verdict on this page are never influenced by sponsorship, ads, or tips; they reflect only the usefulness, popularity, novelty, momentum, maturity, and evidence signals described in the RepoRadar methodology.
How this item is evaluated
RepoRadar assigned sapientinc/HRM-Text a composite score of 8.2 out of 10, placing it in the Gold tier. This score combines weighted sub-signals: usefulness (35%), novelty (18%), momentum (14%), maturity (10%), open-source/build quality (7%), evidence quality (6%), workflow potential (6%), and setup ease (4%). Popularity is tracked separately at 1552.0 and never affects the composite score or tier. The risk label of 'low' reflects inherent user-impacting hazards, not generic novelty. Items with no risk flag may still require normal code review before production use.
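The weights listed above sum to 100%, so one plausible reading is a plain weighted average of the sub-signals. The sketch below assumes exactly that; RepoRadar's actual formula is not given on this page, and the names `WEIGHTS` and `composite_score` are illustrative. Only three sub-scores (workflow potential, novelty, setup ease) appear on this page; the other values in the example are placeholders, not real data.

```python
# Weights in percent, as listed on this page. Popularity is deliberately absent.
WEIGHTS = {
    "usefulness": 35,
    "novelty": 18,
    "momentum": 14,
    "maturity": 10,
    "build_quality": 7,
    "evidence_quality": 6,
    "workflow_potential": 6,
    "setup_ease": 4,
}


def composite_score(sub_scores: dict) -> float:
    """Weighted average of sub-signals on a 0-10 scale (assumed linear combination)."""
    missing = set(WEIGHTS) - set(sub_scores)
    if missing:
        raise ValueError(f"missing sub-signals: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS) / total_weight


example = {
    "workflow_potential": 9.3,  # from this page
    "novelty": 9.0,             # from this page
    "setup_ease": 4.2,          # from this page
    # Placeholder values below: NOT published on this page.
    "usefulness": 8.0,
    "momentum": 8.0,
    "maturity": 8.0,
    "build_quality": 8.0,
    "evidence_quality": 8.0,
}
print(f"composite (illustrative): {composite_score(example):.2f}")
```

Because setup ease carries only 4% of the weight under this reading, a low setup-ease score moves the composite very little, which is consistent with a Gold tier alongside a hard setup rating.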
Risk explanation
**Hopper-class GPUs are the expected training target because the attention path depends on FlashAttention 3.** The README states this explicitly. Adopters running on Ampere (A100) or older GPUs will not reproduce the published compute numbers; the L reference run on 8xH100 is the canonical configuration. Check the team's GPU class and software stack against `docker/Dockerfile` (which documents the tested versions) before committing a long run to a different cluster.
**Published reference numbers are the maintainer's own measurements on the maintainer's own benchmark suite and data pipeline.** This applies to the whole published benchmark table (GSM8k 77.6% / 84.7%, MATH 51.2% / 56.5%, DROP 78.6% / 82.3%, MMLU 56.6% / 60.7%, ARC-C 75.9% / 81.9%, HellaSwag 52.7% / 63.4%, Winogrande 67.6% / 72.4%, BoolQ 85.0% / 86.2%). Adopters who want to claim the architecture delivers the published lift on their own domain should first reproduce the L reference run on their cluster, then port the architecture or data pipeline and compare against the published numbers.
**Multi-node FSDP2 checkpointing needs shared storage; each node saves only its own shard.** The README is explicit that 'each node only saves its own shard, so we recommend mounting a shared storage.' Adopters running multi-node XL on cluster-local storage (NFS not configured, parallel filesystem not mounted) will end up with per-node partial checkpoints that cannot be reloaded as a single model. Verify that the shared-storage layout (`/shared/HRM-Text/checkpoints/`) is mounted on every pretraining node before launching a long run.
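Two of the checks above can be run as a preflight before a long job. The sketch below is an assumption-laden illustration, not something shipped with HRM-Text: the helper names are hypothetical, the GPU check relies on Hopper reporting CUDA compute capability 9.x (A100 reports 8.0), and the storage check uses a marker file that one node writes and every other node must be able to read back.

```python
import os
import tempfile
from typing import Tuple

HOPPER_MAJOR = 9  # H100 reports compute capability (9, 0); A100 reports (8, 0)


def is_hopper(capability: Tuple[int, int]) -> bool:
    """True if a (major, minor) compute capability belongs to the Hopper generation.

    On a real node the tuple could come from torch.cuda.get_device_capability().
    """
    return capability[0] == HOPPER_MAJOR


def write_marker(checkpoint_dir: str, token: str) -> str:
    """Run on ONE node: drop a marker file containing a per-run token."""
    marker = os.path.join(checkpoint_dir, ".preflight_marker")
    with open(marker, "w", encoding="utf-8") as fh:
        fh.write(token)
    return marker


def marker_visible(checkpoint_dir: str, token: str) -> bool:
    """Run on EVERY node: True only if the marker written elsewhere is readable here."""
    marker = os.path.join(checkpoint_dir, ".preflight_marker")
    try:
        with open(marker, "r", encoding="utf-8") as fh:
            return fh.read() == token
    except OSError:
        return False


if __name__ == "__main__":
    print("A100 is Hopper:", is_hopper((8, 0)))
    print("H100 is Hopper:", is_hopper((9, 0)))
    # Demo against a temporary directory; on a cluster, point this at the
    # shared checkpoint path (e.g. /shared/HRM-Text/checkpoints/) instead.
    with tempfile.TemporaryDirectory() as demo_dir:
        write_marker(demo_dir, token="run-001")
        print("marker visible:", marker_visible(demo_dir, token="run-001"))
```

If `marker_visible` returns `False` on any node, that node is writing to local storage and its shard would be stranded there. Passing the check shows the path is shared, but says nothing about whether the filesystem has the throughput or capacity for real checkpoints.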