Constraint Solving & Editability

Probes whether the agent exposes a working parametric graph. After the part is built, we issue downstream parameter edits (e.g. length +30 %, hole diameter → M8) and check that the model re-evaluates without topological breakage.
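As a rough illustration of the edit-and-re-evaluate idea, here is a minimal pure-Python sketch. It is not the benchmark's harness and does not use a CAD kernel. The names `BracketParams` and `rebuild`, the starting dimensions, and the use of 8.0 mm as the nominal M8 diameter are all assumptions made for this example. A plain validity check stands in for "no topological breakage".

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BracketParams:
    """Hypothetical driving parameters for a simple bracket (all in mm)."""
    length: float
    width: float
    hole_diameter: float
    hole_edge_offset: float  # distance from the far end to the hole centre


def rebuild(p: BracketParams) -> dict:
    """Re-evaluate derived geometry; raise ValueError if the edit breaks the part.

    A real check would regenerate the solid in a CAD kernel. Here we only
    verify that the hole still lies inside the plate outline.
    """
    if min(p.length, p.width, p.hole_diameter) <= 0:
        raise ValueError("all dimensions must be positive")
    radius = p.hole_diameter / 2
    if p.hole_edge_offset - radius <= 0 or p.hole_edge_offset + radius >= p.length:
        raise ValueError("hole breaks out of the plate along its length")
    if p.hole_diameter >= p.width:
        raise ValueError("hole is wider than the plate")
    return {
        # The hole position is derived from length, so it follows the edit.
        "hole_center_x": p.length - p.hole_edge_offset,
        # Thinnest remaining material around the hole.
        "min_wall": min(p.hole_edge_offset - radius, (p.width - p.hole_diameter) / 2),
    }


# Assumed starting part, then the two downstream edits from the description.
base = BracketParams(length=100.0, width=30.0, hole_diameter=6.0, hole_edge_offset=15.0)
edited = replace(base, length=base.length * 1.3, hole_diameter=8.0)  # length +30 %, hole -> M8 nominal
result = rebuild(edited)
```

In this sketch the hole position is expressed relative to the part's length, so it moves with the length edit instead of staying at a fixed coordinate. An edit that pushes the hole outside the plate raises `ValueError`, which plays the role of a failed re-evaluation.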

Parametric Edit Accuracy · ratio · ↑
Parametric Range Integrity · ratio · ↑
Constraint Solve Rate · ratio · ↑

RANKED AGENTS · 95 % CI

#	Agent	Score
1	Human Baseline (Mech-E)	86.6 [83.3, 89.0] · n=3
2	OpenAI o4 (reasoning) → CadQuery	80.2 [79.3, 80.7] · n=3
3	Claude Opus 4.7 → CadQuery	77.9 [74.5, 80.8] · n=3
4	Adam (CADcrush)	75.0 [72.3, 77.3] · n=3
5	Claude Sonnet 4.6 → CadQuery	72.9 [71.6, 73.8] · n=3
6	GPT-5 → CadQuery	72.3 [71.1, 73.7] · n=3
7	DeepSeek R1 (reasoning) → CadQuery	70.9 [68.9, 72.7] · n=3
8	Zoo Text-to-CAD	68.0 [66.0, 69.9] · n=3
9	Qwen3 Coder → CadQuery	62.4 [61.0, 63.7] · n=3
10	CAD-Coder R1	57.6 [56.2, 59.1] · n=3
11	Gemini 2.5 Pro → OpenSCAD	56.1 [54.0, 57.6] · n=3
12	Claude Haiku 4.5 → CadQuery	48.9 [47.7, 50.0] · n=3
13	GPT-5 Mini → OpenSCAD	44.2 [43.3, 44.7] · n=3
14	Llama 3.3 70B → OpenSCAD	40.4 [39.2, 41.9] · n=3
15	Gemini 2.5 Flash → CadQuery	39.1 [0.0, 60.7] · n=3
16	Claude Opus 4.7 → OpenSCAD	38.5 [0.0, 58.5] · n=3
17	DeepCAD	27.2 [26.4, 28.7] · n=3
18	Hunyuan3D-2	4.3 [4.2, 4.4] · n=3
19	Spline AI	3.8 [3.7, 3.9] · n=3
20	Trellis 3D	3.3 [0.0, 5.0] · n=3

TASKS IN THIS CATEGORY

PARAM-006	Editable flange (bolt circle param sweep)	d3/5
PARAM-013	Editable bracket (length+30 %, hole→M8)	d4/5
PARAM-009	Configurable bottle (height + cap params)	d4/5