Model Evaluation

Ornstein-Hermes-3.6-27B — SABER · Q5_K_M

eval by Kyle Hessling · model by @DJLougen at GestaltLabs

Agentic reasoning, production-grade front-end design, and canvas / WebGL creative coding. 14 runs, 71,587 completion tokens, self-hosted on a single RTX 5090. Refusal-shaped (SABER) Hermes 3.6 finetune over Qwen 3.5 27B.


56.7 avg tok/s · 14 runs · 71,587 completion tokens · ~24 GB VRAM used · 16K context (f16 KV)

Web design

SaaS landing page: Prism — AI observability

34.5 KB · 9762 tok · 172 s

Analytics dashboard: light theme, emerald accent

23.4 KB · 7707 tok · 144 s

Designer portfolio: Maya Chen — kinetic typography

9.3 KB · 5862 tok · 103 s

Pricing page: 3 tiers + animated toggle + FAQ

20.3 KB · 5534 tok · 97 s

Mobile app marketing: Stillwater — CSS-only iPhone mock

14.1 KB · 8728 tok · 154 s

Canvas / WebGL · creative coding

Particle attractor: cursor-attracted particle swarm

5.5 KB · 2282 tok · 40 s

Generative flow field: Perlin-driven agent motion

10.9 KB · 5649 tok · 99 s

Three.js crystal scene: transmissive glass + bloom

11.8 KB · 6839 tok · 120 s

Audio-reactive visualizer: mic + oscillator fallback

10.7 KB · 6352 tok · 112 s

Agentic reasoning · text output

Code debug (4 bugs): k-th smallest element

no-think: 1039 tok · 18 s

Multi-step planning: URL shortener deploy plan

no-think: 8000 tok · 140 s

Self-critique loop: Palindrome · O(n³) → O(n²)

no-think: 1521 tok · 27 s

Structured JSON extraction: calendar + roster from prose

no-think: 1200 tok · 21 s

Tool-use planning: weather + flights + hotel

no-think: 1112 tok · 20 s
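
As a sanity check, the headline totals can be recomputed from the per-run figures listed above. The following is a minimal sketch, not the evaluator's own tooling: the `RUNS` list and variable names are hypothetical, and the token and second values are copied from the run cards, where the seconds appear to be rounded to whole numbers.

```python
# (label, completion tokens, wall-clock seconds), copied from the run cards above.
# The list name and layout are illustrative, not part of the original report.
RUNS = [
    ("SaaS landing page", 9762, 172),
    ("Analytics dashboard", 7707, 144),
    ("Designer portfolio", 5862, 103),
    ("Pricing page", 5534, 97),
    ("Mobile app marketing", 8728, 154),
    ("Particle attractor", 2282, 40),
    ("Generative flow field", 5649, 99),
    ("Three.js crystal scene", 6839, 120),
    ("Audio-reactive visualizer", 6352, 112),
    ("Code debug", 1039, 18),
    ("Multi-step planning", 8000, 140),
    ("Self-critique loop", 1521, 27),
    ("Structured JSON extraction", 1200, 21),
    ("Tool-use planning", 1112, 20),
]

total_tokens = sum(tokens for _, tokens, _ in RUNS)
total_seconds = sum(seconds for _, _, seconds in RUNS)

# Two reasonable definitions of "average tok/s":
# 1) aggregate throughput: all tokens divided by all seconds
aggregate = total_tokens / total_seconds
# 2) unweighted mean of the per-run rates
per_run = [tokens / seconds for _, tokens, seconds in RUNS]
mean_of_rates = sum(per_run) / len(per_run)

if __name__ == "__main__":
    print(f"runs:           {len(RUNS)}")
    print(f"total tokens:   {total_tokens}")
    print(f"total seconds:  {total_seconds}")
    print(f"aggregate:      {aggregate:.1f} tok/s")
    print(f"mean of rates:  {mean_of_rates:.1f} tok/s")
```

The run count (14) and token total (71,587) match the headline figures exactly. Both throughput definitions come out at roughly 56.5 to 56.6 tok/s, slightly under the reported 56.7; the difference is plausibly due to the rounded per-run seconds, though the page does not say how its average was computed.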