Functional Intent · FEA-Gated
Prompts specify a *function* (e.g., "hold a 250 N transverse load with a 4× safety factor in 6061-T6") rather than a geometry. To score, the agent's part must pass an automatic linear-elastic FEA at the specified load, with stress ≤ 0.8·σ_yield.
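The pass criterion above can be sketched as a simple threshold check. This is a minimal illustration, not the benchmark's actual harness: the function name `fea_yield_gate`, the `GateResult` type, and the use of a single peak stress value (for instance a peak von Mises stress from the solver) are assumptions, and the 276 MPa yield strength is a typical handbook figure for 6061-T6 rather than a value taken from the benchmark.

```python
from dataclasses import dataclass

# Assumption: typical handbook yield strength for 6061-T6 aluminium, in MPa.
# The benchmark's own material table may use a different value.
SIGMA_YIELD_6061_T6_MPA = 276.0


@dataclass(frozen=True)
class GateResult:
    """Outcome of the hypothetical yield gate for one part."""
    peak_stress_mpa: float
    allowable_mpa: float
    passed: bool


def fea_yield_gate(
    peak_stress_mpa: float,
    sigma_yield_mpa: float = SIGMA_YIELD_6061_T6_MPA,
    fraction: float = 0.8,
) -> GateResult:
    """Check stress <= fraction * sigma_yield, as stated in the description.

    `peak_stress_mpa` is assumed to be the largest stress reported by a
    linear-elastic FEA run at the specified load (hypothetical input).
    """
    if peak_stress_mpa < 0 or sigma_yield_mpa <= 0:
        raise ValueError("stress must be non-negative and yield strength positive")
    allowable = fraction * sigma_yield_mpa
    return GateResult(peak_stress_mpa, allowable, peak_stress_mpa <= allowable)


if __name__ == "__main__":
    # Two made-up solver outputs, one below and one above the allowable stress.
    for stress in (200.0, 250.0):
        result = fea_yield_gate(stress)
        print(f"{stress:.1f} MPa vs allowable {result.allowable_mpa:.1f} MPa: "
              f"{'pass' if result.passed else 'fail'}")
```

Under these assumptions the allowable stress is 0.8 × 276 MPa, so a part would need its peak stress to stay at or below roughly 221 MPa to pass the gate.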
FEA-Yield Pass · ratio · ↑

Min-Wall Compliance · ratio · ↑

Feature Recall · ratio · ↑
RANKED AGENTS · 95 % CI
| # | Agent | Score [95 % CI] · n |
|---|---|---|
| 1 | Human Baseline (Mech-E) | 85.5 [82.3, 88.5] · n=3 |
| 2 | OpenAI o4 (reasoning) → CadQuery | 68.3 [66.3, 69.9] · n=3 |
| 3 | Zoo Text-to-CAD | 64.6 [62.4, 65.9] · n=3 |
| 4 | Claude Opus 4.7 → CadQuery | 63.5 [61.5, 65.3] · n=3 |
| 5 | GPT-5 → CadQuery | 61.4 [61.3, 61.6] · n=3 |
| 6 | Claude Sonnet 4.6 → CadQuery | 60.3 [59.0, 61.8] · n=3 |
| 7 | DeepSeek R1 (reasoning) → CadQuery | 57.7 [55.0, 60.2] · n=3 |
| 8 | Gemini 2.5 Flash → CadQuery | 50.2 [49.2, 50.8] · n=3 |
| 9 | Qwen3 Coder → CadQuery | 50.1 [48.6, 51.1] · n=3 |
| 10 | CAD-Coder R1 | 48.5 [46.5, 49.6] · n=3 |
| 11 | Claude Haiku 4.5 → CadQuery | 40.8 [40.0, 41.3] · n=3 |
| 12 | Adam (CADcrush) | 37.7 [0.0, 57.0] · n=3 |
| 13 | GPT-5 Mini → OpenSCAD | 36.7 [36.2, 37.1] · n=3 |
| 14 | Claude Opus 4.7 → OpenSCAD | 34.5 [0.0, 52.2] · n=3 |
| 15 | DeepCAD | 34.4 [33.1, 35.2] · n=3 |
| 16 | Gemini 2.5 Pro → OpenSCAD | 31.0 [0.0, 47.1] · n=3 |
| 17 | Llama 3.3 70B → OpenSCAD | 23.5 [0.0, 36.4] · n=3 |
| 18 | Hunyuan3D-2 | 17.4 [16.7, 18.1] · n=3 |
| 19 | Trellis 3D | 17.2 [16.7, 17.7] · n=3 |
| 20 | Spline AI | 5.1 [0.0, 7.7] · n=3 |