Item detail

sapientinc/HRM-Text

sapientinc/HRM-Text is an open 1b hrm-architecture text pr in RepoRadar's Apache-2.0 1B text generation model + full pretr section, holding a Gold tier and a 'try now' verdict. Its strongest signal is workflow potential, scored 9.3 out of 10.

Score: 8.2
Popularity: 1552.0
Risk: low
Tier: Gold

Score breakdown

Usefulness: 8.0
Novelty: 9.0
Momentum: 8.0
Maturity: 8.5
Open-source/build: 8.4
Evidence: 7.2
Workflow potential: 9.3
Setup ease: 4.2

Popularity is tracked separately. Support, ads, sponsorships, and tips never affect these signals.

Why it matters

Useful for research groups and engineering teams who want to pretrain a foundation model from scratch without a frontier-lab compute budget. HRM-Text is Sapient's Apache-2.0 1B text generation model plus full pretraining framework; it pretrains the L (0.6B) configuration on 8xH100 in ~50 hours (~$800) or the XL (1B) configuration on 16xH100 in ~46 hours (~$1472), priced at $2 per H100-hour. Also useful for academic research groups who need a reproducible, peer

Who should use it

- Research groups and engineering teams who want to pretrain a foundation model from scratch without a frontier-lab compute budget: HRM-Text is the Apache-2.0 1B text generation model + full pretraining framework from Sapient that runs L=0.6B on 8xH100 in ~50 hours (~$800) or XL=1B on 16xH100 in ~46 hours (~$1472) at $2/H100 hour
- Academic research groups who need a reproducible, peer-reviewed pretraining framework (arXiv 2605.20613) that drops compute by 130-600x and data by 150-900x vs comparable dense baselines
- Engineering teams who want a hierarchical recurrent architecture (HRM) that strengthens task completion and latent space reasoning, a different inductive bias from a vanilla transformer
- ML engineers adopting the published Docker image (`sapientai/hrm-text:latest`) and the companion `sapientinc/data_io` pipeline (clean, tokenize, stratified-sample) for a turnkey pretraining path
- Multi-node training teams who want FSDP2 + FlashAttention 3 + PrefixLM sequence packing out of the box (the framework ships these as the canonical training path)
- Teams evaluating the HRM (Hierarchical Reasoning Model) architecture against vanilla transformers on text-only tasks (the L=0.6B reference run publishes the per-task numbers their port can be compared against)
- Organizations that want a small, cheap, fully open-source pretraining framework they can run on their own H100 cluster (no managed service, no per-token API cost)
- Researchers who want to reproduce the published reference runs before betting a research direction on the architecture (the per-benchmark numbers in the README are the validation surface)
- Engineering teams who need a checkpoint conversion path from the training framework to a deployable inference format (the framework ships the conversion tooling)
- Community-supported research work: the team runs a 1200+ developer Discord for support, the maintainer is responsive, and the paper is peer-publishable (arXiv 2605.20613)
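The dollar figures quoted for the two reference runs follow from simple arithmetic: GPU count times wall-clock hours times the hourly rate. The sketch below reproduces them; the function name `pretraining_cost` is illustrative and not part of HRM-Text, and the hour counts are the approximate values quoted above.

```python
def pretraining_cost(num_gpus: int, hours: float, usd_per_gpu_hour: float = 2.0) -> float:
    """Rental cost in USD: GPUs x wall-clock hours x hourly rate per GPU."""
    return num_gpus * hours * usd_per_gpu_hour


# Approximate run lengths quoted on this page, at the quoted $2 per H100-hour.
l_cost = pretraining_cost(num_gpus=8, hours=50)    # L = 0.6B on 8xH100
xl_cost = pretraining_cost(num_gpus=16, hours=46)  # XL = 1B on 16xH100

print(f"L:  ${l_cost:,.0f}")   # L:  $800
print(f"XL: ${xl_cost:,.0f}")  # XL: $1,472
```

Passing a different `usd_per_gpu_hour` gives a rough estimate for another cloud rate, under the assumption that run length stays the same.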

Who should skip it

Hold off on sapientinc/HRM-Text if the setup requirements exceed what your current workflow or team can support without dedicated engineering time.

About this signal

sapientinc/HRM-Text is tracked by RepoRadar as an open 1b hrm-architecture text pr in the Apache-2.0 1B text generation model + full pretr section. It was first seen on 2026-06-25 and last updated on 2026-06-25. The current verdict is 'try now' with a Gold tier and hard setup difficulty. sapientinc/HRM-Text leads on workflow potential (9.3) and novelty (9.0); its lowest signal is setup ease (4.2), so factor that in before investing setup time. This page summarizes the evidence RepoRadar has captured from source metadata. The score, tier, risk label, and verdict on this page are never influenced by sponsorship, ads, or tips; they reflect only the usefulness, popularity, novelty, momentum, maturity, and evidence signals described in the RepoRadar methodology.

How this item is evaluated

RepoRadar assigned sapientinc/HRM-Text a composite score of 8.2 out of 10, placing it in the Gold tier. This score combines weighted sub-signals: usefulness (35%), novelty (18%), momentum (14%), maturity (10%), open-source/build quality (7%), evidence quality (6%), workflow potential (6%), and setup ease (4%). Popularity is tracked separately at 1552.0 and never affects the composite score or tier. The risk label of 'low' reflects inherent user-impacting hazards, not generic novelty. Items with no risk flag may still require normal code review before production use.
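As a rough illustration of how such a weighted composite could be computed, the sketch below applies the stated weights to the sub-scores in the score breakdown. It assumes a plain weighted sum; the dictionary keys and the `composite` function are illustrative, and RepoRadar's actual formula is not shown on this page.

```python
# Weights as stated on this page (they sum to 100%).
WEIGHTS = {
    "usefulness": 0.35,
    "novelty": 0.18,
    "momentum": 0.14,
    "maturity": 0.10,
    "open_source_build": 0.07,
    "evidence": 0.06,
    "workflow_potential": 0.06,
    "setup_ease": 0.04,
}

# Sub-scores from the score breakdown for sapientinc/HRM-Text.
SCORES = {
    "usefulness": 8.0,
    "novelty": 9.0,
    "momentum": 8.0,
    "maturity": 8.5,
    "open_source_build": 8.4,
    "evidence": 7.2,
    "workflow_potential": 9.3,
    "setup_ease": 4.2,
}


def composite(scores: dict, weights: dict) -> float:
    """Plain weighted sum of sub-scores (an assumed formula, not a confirmed one)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return sum(scores[name] * weight for name, weight in weights.items())


score = composite(SCORES, WEIGHTS)
print(f"{score:.2f}")
```

Under this assumption the sum comes to about 8.14, slightly below the displayed 8.2, so the published composite presumably involves rounding or adjustments beyond a plain weighted sum.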

Risk explanation

**Hopper-class GPUs are the expected training target because the attention path depends on FlashAttention 3.** The README states this explicitly. Adopters running on Ampere (A100) or older GPUs will not get the published compute numbers; the L reference run on 8xH100 is the canonical surface. Before betting a long run on a different cluster, verify your GPU class against `docker/Dockerfile`, which documents the tested versions.

**Published reference numbers are the maintainer's own measurements on the maintainer's own benchmark suite.** The published benchmark table (GSM8k 77.6% / 84.7%, MATH 51.2% / 56.5%, DROP 78.6% / 82.3%, MMLU 56.6% / 60.7%, ARC-C 75.9% / 81.9%, HellaSwag 52.7% / 63.4%, Winogrande 67.6% / 72.4%, BoolQ 85.0% / 86.2%) was also produced on the maintainer's own data pipeline. Adopters who want to claim the architecture delivers the published lift on their domain should first reproduce the L reference run on their own cluster, then port the architecture and data pipeline and compare against the published numbers.

**Multi-node FSDP2 checkpointing needs shared storage; each node saves only its own shard.** The README is explicit that 'each node only saves its own shard, so we recommend mounting a shared storage.' Adopters running multi-node XL on cluster-local storage (NFS not configured, parallel filesystem not mounted) will end up with per-node partial checkpoints that cannot be reloaded as a single model. Verify that the shared-storage layout (`/shared/HRM-Text/checkpoints/`) is mounted on every pretraining node before launching a long run.
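The GPU-class and shared-storage checks described above could be scripted as a small pre-flight step. The following is a minimal sketch and not tooling shipped with HRM-Text: all function names and the marker-file convention are hypothetical, and it assumes "Hopper-class" means CUDA compute capability 9.x (H100 reports 9.0, A100 reports 8.0). On a real node the capability tuple would typically come from `torch.cuda.get_device_capability()`.

```python
import tempfile
from pathlib import Path


def is_hopper_class(capability: tuple) -> bool:
    """True when a (major, minor) compute capability is 9.x (assumed Hopper)."""
    major, _minor = capability
    return major == 9


def write_node_marker(ckpt_dir, node_rank: int) -> Path:
    """Each node writes a small marker file into the checkpoint directory."""
    ckpt_dir = Path(ckpt_dir)
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    marker = ckpt_dir / f".preflight_node_{node_rank}"
    marker.write_text("ok")
    return marker


def missing_node_markers(ckpt_dir, num_nodes: int) -> list:
    """Ranks whose marker is not visible from this node.

    A non-empty result suggests the directory is node-local, not shared.
    """
    ckpt_dir = Path(ckpt_dir)
    return [
        rank
        for rank in range(num_nodes)
        if not (ckpt_dir / f".preflight_node_{rank}").exists()
    ]


# Local demo: a temporary directory stands in for the shared mount.
with tempfile.TemporaryDirectory() as shared_dir:
    write_node_marker(shared_dir, node_rank=0)
    only_first_node = missing_node_markers(shared_dir, num_nodes=2)
    write_node_marker(shared_dir, node_rank=1)
    both_nodes = missing_node_markers(shared_dir, num_nodes=2)

print(is_hopper_class((9, 0)), is_hopper_class((8, 0)))  # True False
print(only_first_node, both_nodes)                        # [1] []
```

In a real multi-node launch, every node would call `write_node_marker` on the shared path (for example `/shared/HRM-Text/checkpoints/`), wait for the others, and then confirm that `missing_node_markers` returns an empty list before training starts.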

Evidence links

Closest alternatives / related signals

hrm-text, hrm, sapient, sapientinc, hierarchical-reasoning-model, hierarchical-recurrent, hrm-architecture, hrm-1b