CAD-Bench

Constraint Solving & Editability

Probes whether the agent exposes a working parametric graph: after the part is built, we issue downstream parameter edits (length +30 %, hole diameter → M8) and check that the model re-evaluates without topological breakage.
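The edit-and-re-evaluate loop described above can be sketched in plain Python. This is a toy illustration only, not the benchmark's harness: `PlateParams`, `rebuild`, `solve_rate`, the base dimensions, the wall-thickness rule, and the reading of "M8" as a nominal 8 mm hole are all assumptions made for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PlateParams:
    """Hypothetical parameters of a plate with one centred hole (mm)."""
    length: float = 100.0
    width: float = 40.0
    hole_diameter: float = 6.0
    min_wall: float = 5.0  # assumed minimum material around the hole


def rebuild(p: PlateParams) -> dict:
    """Re-evaluate the toy model; raise ValueError if a constraint fails."""
    if min(p.length, p.width, p.hole_diameter) <= 0:
        raise ValueError("dimensions must be positive")
    if p.hole_diameter + 2 * p.min_wall > min(p.length, p.width):
        raise ValueError("hole leaves too little wall")
    # The hole centre is derived from length and width, so it follows edits.
    return {
        "length": p.length,
        "hole_diameter": p.hole_diameter,
        "hole_center": (p.length / 2, p.width / 2),
    }


def solve_rate(base: PlateParams, edits) -> float:
    """Fraction of edits after which the model still rebuilds."""
    solved = 0
    for edit in edits:
        try:
            rebuild(edit(base))
            solved += 1
        except ValueError:
            pass
    return solved / len(edits)


base = PlateParams()
edits = [
    lambda p: replace(p, length=p.length * 1.30),  # length +30 %
    lambda p: replace(p, hole_diameter=8.0),       # M8, assumed nominal 8 mm
    lambda p: replace(p, hole_diameter=35.0),      # deliberately infeasible
]
rate = solve_rate(base, edits)
print(f"solve rate: {rate:.2f}")
```

Because the hole position is expressed relative to the plate dimensions rather than as fixed coordinates, the length edit moves the hole with the part. A model that hard-codes coordinates would rebuild but no longer match the design intent, which is the kind of failure an editability probe is meant to surface.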

Parametric Edit Accuracy (ratio) · Parametric Range Integrity (ratio) · Constraint Solve Rate (ratio)

RANKED AGENTS · 95 % CI

 #  Agent                               Score  95 % CI       n
 1  Human Baseline (Mech-E)              86.6  [83.3, 89.0]  3
 2  OpenAI o4 (reasoning) → CadQuery     80.2  [79.3, 80.7]  3
 3  Claude Opus 4.7 → CadQuery           77.9  [74.5, 80.8]  3
 4  Adam (CADcrush)                      75.0  [72.3, 77.3]  3
 5  Claude Sonnet 4.6 → CadQuery         72.9  [71.6, 73.8]  3
 6  GPT-5 → CadQuery                     72.3  [71.1, 73.7]  3
 7  DeepSeek R1 (reasoning) → CadQuery   70.9  [68.9, 72.7]  3
 8  Zoo Text-to-CAD                      68.0  [66.0, 69.9]  3
 9  Qwen3 Coder → CadQuery               62.4  [61.0, 63.7]  3
10  CAD-Coder R1                         57.6  [56.2, 59.1]  3
11  Gemini 2.5 Pro → OpenSCAD            56.1  [54.0, 57.6]  3
12  Claude Haiku 4.5 → CadQuery          48.9  [47.7, 50.0]  3
13  GPT-5 Mini → OpenSCAD                44.2  [43.3, 44.7]  3
14  Llama 3.3 70B → OpenSCAD             40.4  [39.2, 41.9]  3
15  Gemini 2.5 Flash → CadQuery          39.1  [0.0, 60.7]   3
16  Claude Opus 4.7 → OpenSCAD           38.5  [0.0, 58.5]   3
17  DeepCAD                              27.2  [26.4, 28.7]  3
18  Hunyuan3D-2                           4.3  [4.2, 4.4]    3
19  Spline AI                             3.8  [3.7, 3.9]    3
20  Trellis 3D                            3.3  [0.0, 5.0]    3

TASKS IN THIS CATEGORY