BREP Fidelity
Tests whether the agent emits a clean boundary representation (named faces, a coherent edge graph, exact NURBS surfaces) rather than a tessellated approximation. Each output is round-tripped through AP242 STEP.
STEP Round-trip Chamfer (mm, ↓) · Edge-Manifoldness (ratio, ↑) · Euler-Poincaré Compliance (boolean, ↑) · Feature Recall (ratio, ↑)
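The page does not define how these metrics are computed. As an illustration only, here is a minimal Python sketch of one common reading of Edge-Manifoldness and Euler-Poincaré Compliance. It assumes a toy input format (a dict mapping face names to closed vertex loops) and hypothetical helper names (`loop_edges`, `edge_use_counts`, `edge_manifoldness`, `euler_poincare_ok`); none of these come from the benchmark. Edge-manifoldness is taken to be the fraction of edges shared by exactly two faces, and compliance is checked with the standard Euler-Poincaré relation V - E + F - (L - F) - 2(S - G) = 0, where L is the loop count, S the shell count, and G the genus.

```python
from collections import Counter


def loop_edges(loop):
    """Undirected edges of a closed vertex loop, e.g. [0, 1, 3, 2]."""
    return [frozenset(pair) for pair in zip(loop, loop[1:] + loop[:1])]


def edge_use_counts(faces):
    """Count how many faces use each edge; `faces` maps name -> vertex loop."""
    return Counter(edge for loop in faces.values() for edge in loop_edges(loop))


def edge_manifoldness(faces):
    """Fraction of distinct edges that are shared by exactly two faces."""
    counts = edge_use_counts(faces)
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n == 2) / len(counts)


def euler_poincare_ok(faces, shells=1, genus=0):
    """Check V - E + F - (L - F) - 2(S - G) == 0."""
    v = len({vertex for loop in faces.values() for vertex in loop})
    e = len(edge_use_counts(faces))
    f = len(faces)
    loops = f  # assumption: one boundary loop per face, no inner rings
    return v - e + f - (loops - f) - 2 * (shells - genus) == 0


# Unit cube; vertex id = x + 2*y + 4*z, one quad loop per face.
CUBE = {
    "z0": [0, 1, 3, 2],
    "z1": [4, 5, 7, 6],
    "y0": [0, 1, 5, 4],
    "y1": [2, 3, 7, 6],
    "x0": [0, 2, 6, 4],
    "x1": [1, 3, 7, 5],
}
# Same cube with the top face removed, so four edges border only one face.
OPEN_BOX = {name: loop for name, loop in CUBE.items() if name != "z1"}

print(edge_manifoldness(CUBE), euler_poincare_ok(CUBE))
print(edge_manifoldness(OPEN_BOX), euler_poincare_ok(OPEN_BOX))
```

Under these assumptions the closed cube has every edge shared by two faces and satisfies the relation, while the open box has only 8 of its 12 edges manifold and fails the check. A real harness would presumably read the topology from the STEP file through a CAD kernel and would need to handle faces with inner loops.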
RANKED AGENTS · 95 % CI
| # | Agent | Score [95 % CI] · n |
|---|---|---|
| 1 | Human Baseline (Mech-E) | 95.9 [94.7, 96.9] · n=4 |
| 2 | Zoo Text-to-CAD | 91.7 [90.8, 92.5] · n=4 |
| 3 | CAD-Coder R1 | 88.2 [87.4, 89.1] · n=4 |
| 4 | Adam (CADcrush) | 83.3 [70.7, 89.8] · n=4 |
| 5 | OpenAI o4 (reasoning) → CadQuery | 77.5 [65.6, 89.5] · n=4 |
| 6 | Claude Sonnet 4.6 → CadQuery | 75.0 [62.2, 87.4] · n=4 |
| 7 | DeepSeek R1 (reasoning) → CadQuery | 75.0 [61.9, 88.1] · n=4 |
| 8 | Claude Opus 4.7 → CadQuery | 71.6 [48.3, 89.7] · n=5 |
| 9 | DeepCAD | 70.6 [57.8, 83.5] · n=4 |
| 10 | Qwen3 Coder → CadQuery | 65.2 [21.7, 87.4] · n=4 |
| 11 | Claude Haiku 4.5 → CadQuery | 57.6 [56.9, 58.1] · n=4 |
| 12 | Gemini 2.5 Flash → CadQuery | 51.1 [15.0, 78.6] · n=4 |
| 13 | GPT-5 → CadQuery | 47.0 [15.9, 63.5] · n=4 |
| 14 | Llama 3.3 70B → OpenSCAD | 41.9 [39.4, 44.4] · n=4 |
| 15 | GPT-5 Mini → OpenSCAD | 41.7 [38.7, 47.5] · n=4 |
| 16 | Gemini 2.5 Pro → OpenSCAD | 34.5 [33.7, 35.1] · n=4 |
| 17 | Hunyuan3D-2 | 33.5 [33.2, 33.7] · n=4 |
| 18 | Trellis 3D | 26.3 [26.1, 26.5] · n=4 |
| 19 | Claude Opus 4.7 → OpenSCAD | 26.1 [8.7, 35.3] · n=4 |
| 20 | Spline AI | 23.5 [23.4, 23.5] · n=4 |