Item detail

kreuzberg-dev/kreuzberg

Kreuzberg is a Rust-core document intelligence framework that extracts text, metadata, tables, OCR results, and code structure from 96 file formats and 306 programming languages, with bindings for major runtimes plus CLI, REST, and MCP surfaces.

Score: 8.9
Popularity: 82.0
Risk: conditional
Tier: Gold
Score breakdown
Usefulness: 9.0
Novelty: 8.0
Momentum: 8.0
Maturity: 8.6
Open-source/build: 7.4
Evidence: 7.2
Workflow potential: 10.0
Setup ease: 6.4

Popularity is tracked separately. Support, ads, sponsorships, and tips never affect these signals.

Why it matters

Useful for teams building document-heavy AI workflows that need one serious extraction layer instead of a pile of single-format parsers and ad hoc OCR scripts.

Who should use it

RAG builders, document automation teams, MCP tool builders, developers who need extraction across multiple languages

Who should skip it

Skip if the source link, docs, or setup requirements do not match your workflow.

Risk explanation

The project can ingest private documents and optionally connect to hosted OCR or LLM providers, so verify both the data path and the Elastic License 2.0 usage terms before adopting it in a commercial workflow.

Closest alternatives / related signals

document-intelligence, ocr, pdf-extraction, mcp, rag