Item detail

0xSero/glm-5.2-sm120

glm-5.2-sm120 is a source-visible Docker and vLLM serving recipe for 0xSero's GLM-5.2-NVFP4-REAP-469B on 4× RTX PRO 6000 Blackwell GPUs, with a published 250k-context configuration, fp8 KV cache guidance, DeepSeek sparse-attention tuning, MTP speculative decode, and smoke-tested launch scripts.

Score: 8.1
Popularity: 57.0
Risk: medium
Tier: Silver

Score breakdown

Usefulness: 8.0
Novelty: 8.0
Momentum: 6.0
Maturity: 7.1
Open-source/build: 7.4
Evidence: 7.2
Workflow potential: 8.5
Setup ease: 4.2

Popularity is tracked separately. Support, ads, sponsorships, and tips never affect these signals.

Why it matters

Useful for advanced inference teams evaluating whether giant open reasoning models can be served on prosumer Blackwell boxes without reinventing all of the kernel, KV-cache, and long-context tuning work themselves.

Who should use it

Inference engineers, local AI power users with high-end GPUs, teams evaluating Blackwell workstation serving, and advanced open-model infrastructure builders.

Who should skip it

Skip or sandbox it if you cannot review permissions, data access, and failure modes before use.

Risk explanation

The repo does not declare a software license, so treat it as reference material until the maintainer adds explicit reuse terms. The documented path also depends on 4× RTX PRO 6000 Blackwell GPUs, a third-party container image, and a 300+ GB model download, so verify trust, cost, and hardware fit before planning around it.
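
The hardware-fit part of that check can be roughed out before downloading anything. The sketch below is not part of the repo. It assumes `nvidia-smi` is on the PATH, that the cards report a name containing "RTX PRO 6000", and that 300 GB is a usable lower bound for free disk space (the listing only says "300+ GB"). The function names and thresholds are hypothetical.

```python
import shutil
import subprocess

REQUIRED_GPUS = 4        # from the listing: 4x RTX PRO 6000 Blackwell
REQUIRED_FREE_GB = 300   # assumption: "300+ GB" treated as a lower bound


def parse_gpu_names(nvidia_smi_csv: str) -> list:
    """Return one GPU name per non-empty line of
    `nvidia-smi --query-gpu=name --format=csv,noheader` output."""
    return [line.strip() for line in nvidia_smi_csv.splitlines() if line.strip()]


def check_fit(gpu_names, free_bytes, name_hint="RTX PRO 6000"):
    """Return a list of human-readable problems; an empty list means the
    box meets the two thresholds above. `name_hint` is an assumed substring."""
    problems = []
    matching = [name for name in gpu_names if name_hint in name]
    if len(matching) < REQUIRED_GPUS:
        problems.append(
            f"found {len(matching)} matching GPU(s), need {REQUIRED_GPUS}"
        )
    free_gb = free_bytes / 1e9
    if free_gb < REQUIRED_FREE_GB:
        problems.append(
            f"only {free_gb:.0f} GB free, need at least {REQUIRED_FREE_GB} GB"
        )
    return problems


if __name__ == "__main__":
    try:
        smi_output = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        smi_output = ""  # no driver or no GPUs visible: counts as zero GPUs
    free = shutil.disk_usage(".").free  # free space where the model would land
    for problem in check_fit(parse_gpu_names(smi_output), free):
        print("preflight:", problem)
```

This only covers GPU count and disk space. The license gap and the provenance of the third-party container image still need a manual review.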

Closest alternatives / related signals

glm-5.2, vllm, blackwell, inference, long-context