Reverse Engineering
Inputs are multi-view orthographic drawings (front/top/side at 1:1, fully dimensioned) and product photos; the agent must reproduce the part as a CAD model. Tasks are adapted from the ABC dataset and a held-out subset of GrabCAD test parts.
Metrics: Volumetric IoU (ratio, ↑) · Feature Recall (ratio, ↑) · Named-Dimension RMSE (mm, ↓)
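As a rough illustration of what the three metrics measure, here is a minimal sketch. It is not the benchmark's scoring code. It assumes that both parts have already been voxelized onto the same boolean grid, that features are compared as sets of string labels, and that named dimensions arrive as name-to-millimetre dictionaries. The function names, the labels, and the handling of empty inputs are all hypothetical.

```python
import math

import numpy as np


def volumetric_iou(pred_voxels, ref_voxels):
    """Intersection over union of two boolean occupancy grids (same shape)."""
    pred = np.asarray(pred_voxels, dtype=bool)
    ref = np.asarray(ref_voxels, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("grids must share the same shape")
    union = np.logical_or(pred, ref).sum()
    if union == 0:
        return 1.0  # assumption: two empty grids count as a perfect match
    return float(np.logical_and(pred, ref).sum() / union)


def feature_recall(pred_features, ref_features):
    """Fraction of reference features that also appear in the prediction."""
    ref = set(ref_features)
    if not ref:
        return 1.0  # assumption: nothing to recall counts as full recall
    return len(ref & set(pred_features)) / len(ref)


def named_dimension_rmse(pred_dims, ref_dims):
    """RMSE in mm over the dimensions named in the reference.

    Assumes every reference name is present in pred_dims (KeyError otherwise).
    """
    if not ref_dims:
        raise ValueError("reference has no named dimensions")
    squared = [(pred_dims[name] - ref_dims[name]) ** 2 for name in ref_dims]
    return math.sqrt(sum(squared) / len(squared))


# Toy example: the reference is a 2x2x2 block, the prediction is 2x2x3.
ref_grid = np.zeros((4, 4, 4), dtype=bool)
ref_grid[0:2, 0:2, 0:2] = True   # 8 occupied voxels
pred_grid = np.zeros((4, 4, 4), dtype=bool)
pred_grid[0:2, 0:2, 0:3] = True  # 12 occupied voxels, 8 shared with ref

iou = volumetric_iou(pred_grid, ref_grid)  # 8 / 12
recall = feature_recall(
    {"hole_1", "slot_1", "boss_1", "chamfer_1"},
    {"hole_1", "slot_1", "boss_1", "fillet_1"},
)  # 3 of 4 reference features found
rmse = named_dimension_rmse(
    {"length": 40.5, "width": 19.5},
    {"length": 40.0, "width": 20.0},
)  # both dimensions are off by 0.5 mm
```

Higher is better for the two ratio metrics and lower is better for RMSE, matching the arrows in the metric list. A real pipeline would also need to align the two parts and choose a voxel resolution before computing IoU; both steps are outside this sketch.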
RANKED AGENTS · 95 % CI
| # | Agent | Score [95 % CI] · n |
|---|---|---|
| 1 | OpenAI o4 (reasoning) → CadQuery | 69.8 [65.7, 73.1] · n=3 |
| 2 | Claude Opus 4.7 → CadQuery | 68.4 [65.5, 70.2] · n=3 |
| 3 | Zoo Text-to-CAD | 66.5 [64.3, 69.7] · n=3 |
| 4 | GPT-5 → CadQuery | 63.1 [59.5, 67.6] · n=3 |
| 5 | DeepSeek R1 (reasoning) → CadQuery | 59.5 [57.7, 61.5] · n=3 |
| 6 | Adam (CADcrush) | 59.3 [57.4, 62.2] · n=3 |
| 7 | Gemini 2.5 Pro → OpenSCAD | 58.2 [57.6, 59.1] · n=3 |
| 8 | Qwen3 Coder → CadQuery | 57.6 [55.2, 62.1] · n=3 |
| 9 | CAD-Coder R1 | 56.2 [55.2, 57.2] · n=3 |
| 10 | Claude Opus 4.7 → OpenSCAD | 55.4 [53.1, 56.7] · n=3 |
| 11 | Human Baseline (Mech-E) | 55.2 [0.0, 83.6] · n=3 |
| 12 | Claude Haiku 4.5 → CadQuery | 48.1 [44.9, 51.0] · n=3 |
| 13 | GPT-5 Mini → OpenSCAD | 45.5 [43.9, 47.2] · n=3 |
| 14 | Claude Sonnet 4.6 → CadQuery | 43.7 [0.0, 67.8] · n=3 |
| 15 | DeepCAD | 43.0 [39.7, 46.5] · n=3 |
| 16 | Trellis 3D | 41.5 [38.9, 45.6] · n=3 |
| 17 | Hunyuan3D-2 | 41.3 [39.6, 43.1] · n=3 |
| 18 | Gemini 2.5 Flash → CadQuery | 37.5 [0.0, 58.1] · n=3 |
| 19 | Llama 3.3 70B → OpenSCAD | 29.4 [0.0, 44.6] · n=3 |
| 20 | Spline AI | 24.8 [24.4, 25.6] · n=3 |