Orchestration

MAUL

MAUL is proprietary. This page is an overview of how it works, not a license to the code or weights.

MAUL (Multi-Agentic Unified LLM) sits between you and Hydrogen. It figures out what you meant, sends the question to the right specialist, and comes back with one answer.

The idea

The hypothesis: five specialists, each doing one thing well, beat one model doing everything. Math goes to a solver. Code goes to a sandbox. Each specialist handles what it's built for. MAUL is how we test that hypothesis.

Long-term plan: train separate specialist models, not just specialist prompts, and route between the trained models.

Versions

MAUL-1 with Hydrogen-1: 95.68% on GSM8K in our paired test run.

MAUL-2 with Hydrogen-2: 95.90% on GSM8K. This is the pairing we use for Hydrogen-2 today.

How routing works

Your message goes in. MAUL reads what you actually meant, using a semantic parse rather than keyword matching. Then it picks a path. Hard math goes to SymPy (a real symbolic solver, not an LLM guess), code runs in a sandbox, and conversation goes to MicroSpecialist. The outputs are merged, and you get one reply.
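
The parse-then-dispatch-then-merge flow can be sketched in a few lines of Python. This is an illustrative sketch only, not MAUL's implementation: the names Intent, Part, route, and fake_parse are hypothetical, the parser is a hard-coded stand-in for the semantic parse, and the specialists are stubs that only label their input.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Intent(Enum):
    MATH = "math"
    CODE = "code"
    CHAT = "chat"


@dataclass
class Part:
    """One piece of the user's message, tagged with what it asks for."""
    intent: Intent
    text: str


# Hypothetical types: a parser turns a message into tagged parts,
# and a specialist turns one part's text into an answer.
Parser = Callable[[str], List[Part]]
Specialist = Callable[[str], str]


def route(message: str, parse: Parser, specialists: Dict[Intent, Specialist]) -> str:
    """Parse first, then dispatch each part, then merge into one reply."""
    parts = parse(message)  # intent is decided before any specialist runs
    answers = [specialists[part.intent](part.text) for part in parts]
    return "\n".join(answers)  # simplest possible merge: join in order


def fake_parse(message: str) -> List[Part]:
    # Stand-in for the semantic parse (assumption): returns a fixed result
    # for the demo message instead of analysing the text.
    return [
        Part(Intent.CHAT, "greeting"),
        Part(Intent.MATH, "x**2 = 9"),
    ]


# Stub specialists (assumption): each one only labels its input.
specialists: Dict[Intent, Specialist] = {
    Intent.MATH: lambda text: f"[solver] {text}",
    Intent.CODE: lambda text: f"[sandbox] {text}",
    Intent.CHAT: lambda text: f"[chat] {text}",
}

reply = route("Hi! Can you solve x**2 = 9?", fake_parse, specialists)
print(reply)
```

The parser is passed in as an argument so that the dispatch logic does not depend on how intent is determined. Swapping the stub for a real semantic parser would not change route.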

What MAUL-2 does differently

Math routes to SymPy, not an LLM prediction. Code runs in an actual sandbox. Routing parses what you meant before picking a specialist, not after.
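
To make "code runs instead of being predicted" concrete, here is a minimal sketch that executes a snippet in a separate Python interpreter with a timeout and returns its real output. It rests on assumptions: run_snippet is a hypothetical helper, and a child process with a timeout provides only basic isolation. The source does not describe how MAUL's sandbox is built, and a production sandbox would need stronger limits (filesystem, network, memory).

```python
import subprocess
import sys


def run_snippet(code: str, timeout_s: float = 5.0) -> subprocess.CompletedProcess:
    """Run `code` in a fresh Python interpreter and capture its output."""
    return subprocess.run(
        # -I: isolated mode, ignores PYTHON* env vars and the user site directory
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired if exceeded
    )


result = run_snippet("print(sum(range(10)))")
print(result.returncode, result.stdout.strip())
```

The reply is built from result.stdout, so the answer reflects what the code actually printed rather than what a model expects it to print.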
