CAD-Bench
PARAM-013 · Constraint Solving & Editability · difficulty 4/5

Editable bracket (length+30 %, hole→M8)

sha256:fab07d2c5e914421

§1 Prompt · verbatim

Build the L-bracket from MECH-014, then perform two parametric edits in sequence: (1) increase the long leg from 60 → 78 mm; (2) change the through-hole from M6 clearance to M8 clearance (Ø 9.0 mm). Topology must remain valid throughout.

§2 Ground-truth spec

shells: 1
watertight: true
manifold: true
acceptance ε: ±0.1 mm
parametric edits:
leg_long : 60 → 78 mm (ΔV expected 3600 mm³)
hole_d : 6.6 → 9 mm (ΔV expected -147 mm³)
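
The two expected volume deltas can be sanity-checked with a few lines of arithmetic. The sketch below assumes a leg cross-section of 40 mm × 5 mm and a hole that passes through the 5 mm thickness; these dimensions are not stated on this page (they would come from MECH-014) and are chosen only because they reproduce the listed figures. The function names are illustrative.

```python
import math

# Assumed cross-section of the long leg (hypothetical; not given on this page).
WIDTH_MM = 40.0
THICKNESS_MM = 5.0


def leg_volume_delta(old_len, new_len, width, thickness):
    """Volume added when a rectangular leg is lengthened (mm^3)."""
    return (new_len - old_len) * width * thickness


def hole_volume_delta(old_d, new_d, depth):
    """Volume change when a through-hole is enlarged (mm^3, negative = removed)."""
    return -math.pi / 4.0 * (new_d ** 2 - old_d ** 2) * depth


dv_leg = leg_volume_delta(60.0, 78.0, WIDTH_MM, THICKNESS_MM)
dv_hole = hole_volume_delta(6.6, 9.0, THICKNESS_MM)

print(f"leg_long edit: {dv_leg:+.0f} mm^3")
print(f"hole_d edit:   {dv_hole:+.0f} mm^3")
```

Under these assumptions the leg edit adds 18 mm × 40 mm × 5 mm of material and the hole edit removes an annulus between Ø 6.6 and Ø 9.0 through 5 mm, which matches the spec to the nearest mm³. Any other width and thickness whose product is 200 mm² would give the same leg delta.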

§3 Reference render

canonical reference

Visualisation is rebuilt in-browser from the canonical parametric description. Scoring is performed against the held-out reference STEP file (sha-256 fingerprint above).
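
A fingerprint of this kind can be checked against a local file with the standard library. The sketch below assumes the value shown on the page is a truncated prefix of the full hex digest (16 hex characters are shown, a full SHA-256 digest has 64); the helper name `matches_fingerprint` is hypothetical, and the demo uses in-memory bytes because the reference STEP file is held out.

```python
import hashlib


def matches_fingerprint(data: bytes, fingerprint: str) -> bool:
    """Return True if the SHA-256 hex digest of `data` starts with the
    (possibly truncated) fingerprint, e.g. 'sha256:fab07d2c5e914421'."""
    prefix = fingerprint.removeprefix("sha256:").lower()
    if not prefix:
        raise ValueError("empty fingerprint")
    return hashlib.sha256(data).hexdigest().startswith(prefix)


# Demo on placeholder bytes; a real check would read the STEP file in binary mode.
demo_bytes = b"ISO-10303-21; placeholder content"
demo_fp = "sha256:" + hashlib.sha256(demo_bytes).hexdigest()[:16]

print(matches_fingerprint(demo_bytes, demo_fp))
print(matches_fingerprint(b"tampered content", demo_fp))
```

A truncated prefix is enough to catch accidental mix-ups between files, though it gives weaker guarantees than comparing the full digest.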

§4 Per-agent renders

reference + 20 agent outputs · scored against the held-out STEP
vol IoU · BREP · manifold
REFERENCE · canonical · ground truth · vol IoU 1.000 · BREP 100
GPT-5 → CadQuery · OpenAI + CadQuery 2.4 · vol IoU 0.751 · BREP 10
OpenAI o4 (reasoning) → CadQuery · OpenAI + CadQuery 2.4 · vol IoU 0.728 · BREP 10
Human Baseline (Mech-E) · n=4 senior engineers · vol IoU 0.718 · BREP 10
DeepSeek R1 (reasoning) → CadQuery · DeepSeek + CadQuery 2.4 · vol IoU 0.648 · BREP 12
Claude Opus 4.7 → CadQuery · Anthropic + CadQuery 2.4 · vol IoU 0.642 · BREP 8
Adam (CADcrush) · CADcrush · vol IoU 0.625 · BREP 11
Gemini 2.5 Flash → CadQuery · Google + CadQuery 2.4 · vol IoU 0.549 · BREP 12
Zoo Text-to-CAD · Zoo (KittyCAD) · vol IoU 0.544 · BREP 13
Qwen3 Coder → CadQuery · Alibaba + CadQuery 2.4 · vol IoU 0.534 · BREP 11
Claude Sonnet 4.6 → CadQuery · Anthropic + CadQuery 2.4 · vol IoU 0.526 · BREP 10
Gemini 2.5 Pro → OpenSCAD · Google + OpenSCAD 2024.06 · vol IoU 0.459 · BREP 0
CAD-Coder R1 · CAD-Coder Labs (research) · vol IoU 0.429 · BREP 15
Claude Haiku 4.5 → CadQuery · Anthropic + CadQuery 2.4 · vol IoU 0.365 · BREP 16
GPT-5 Mini → OpenSCAD · OpenAI + OpenSCAD 2024.06 · vol IoU 0.326 · BREP 15
Llama 3.3 70B → OpenSCAD · Meta + OpenSCAD 2024.06 · vol IoU 0.225 · BREP 25
DeepCAD · Wu et al. 2021 (research) · vol IoU 0.167 · BREP 32
Claude Opus 4.7 → OpenSCAD · Anthropic + OpenSCAD 2024.06 · no manifold solid produced · BREP 18
Trellis 3D · Microsoft Research · vol IoU 0.000 · BREP 0
Spline AI · Spline.design · vol IoU 0.000 · BREP 0
Hunyuan3D-2 · Tencent · vol IoU 0.000 · BREP 104

Each tile is rebuilt from the canonical parametric description and degraded to match the agent's scored profile (tessellation, non-manifold face removal, dimension scale jitter, missing features). Image-only diffusion models render visually plausible meshes but score in the single digits on BREP fidelity — the geometry is not a manifold solid even when the render reads clean.
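
The page does not define how vol IoU is computed. A common reading is the volume of the intersection of two solids divided by the volume of their union, and the sketch below estimates that ratio by sampling a regular voxel grid. Everything about the bracket here is hypothetical (40 mm short leg, 40 mm width, 5 mm thickness, hole centred 30 mm from the bend), and a real scorer would evaluate the B-rep from the STEP file and would also need to align the two solids first.

```python
import itertools


def make_bracket(leg_long, hole_d, leg_short=40.0, width=40.0, t=5.0):
    """Point-membership test for a hypothetical L-bracket (all sizes in mm)."""
    def inside(p):
        x, y, z = p
        base = 0 <= x <= leg_long and 0 <= y <= width and 0 <= z <= t
        upright = 0 <= x <= t and 0 <= y <= width and 0 <= z <= leg_short
        # Through-hole along z, centred 30 mm from the bend (assumed position).
        in_hole = (x - 30.0) ** 2 + (y - width / 2) ** 2 < (hole_d / 2) ** 2
        return (base and not in_hole) or upright
    return inside


def grid_centers(lo, hi, step):
    """Centres of the voxels that tile the interval [lo, hi]."""
    n = int(round((hi - lo) / step))
    return [lo + (i + 0.5) * step for i in range(n)]


def vol_iou(a, b, lo, hi, step=1.0):
    """Estimate intersection-over-union of two solids on a voxel grid."""
    axes = [grid_centers(l, h, step) for l, h in zip(lo, hi)]
    inter = union = 0
    for p in itertools.product(*axes):
        in_a, in_b = a(p), b(p)
        inter += in_a and in_b
        union += in_a or in_b
    return inter / union if union else 0.0


LO, HI = (0.0, 0.0, 0.0), (78.0, 40.0, 40.0)
original = make_bracket(leg_long=60.0, hole_d=6.6)   # before the edits
edited = make_bracket(leg_long=78.0, hole_d=9.0)     # after both edits

iou = vol_iou(original, edited, LO, HI)
print(f"vol IoU, unedited vs edited bracket: {iou:.3f}")
```

On this toy geometry, an output that skipped both edits would still overlap most of the reference, which illustrates why a volumetric score alone cannot tell whether the parametric edits were applied.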

§5 Per-agent metrics

ranked by Vol IoU · same data as the leaderboard, restricted to this task
| Agent | Watertight | Manifold | ParamEdit | ConSolve | Parametric Range Integrity | P@1 | p50 latency | cost |
|---|---|---|---|---|---|---|---|---|
| GPT-5 → CadQuery | | 0.958 | 0.693 | 0.713 | 0.726 | 1.000 | 39.1s | $0.226 |
| OpenAI o4 (reasoning) → CadQuery | | 0.954 | 0.866 | 0.792 | 0.760 | 0.000 | 81.5s | $1.086 |
| Human Baseline (Mech-E) | | 0.957 | 0.853 | 0.801 | 0.845 | 0.000 | 711.6s | $5.129 |
| DeepSeek R1 (reasoning) → CadQuery | | 0.945 | 0.690 | 0.730 | 0.648 | 0.000 | 122.3s | $0.037 |
| Claude Opus 4.7 → CadQuery | | 0.951 | 0.732 | 0.746 | 0.756 | 0.000 | 46.9s | $0.389 |
| Adam (CADcrush) | | 0.952 | 0.740 | 0.748 | 0.681 | 0.000 | 10.1s | $0.237 |
| Gemini 2.5 Flash → CadQuery | | 0.939 | 0.626 | 0.562 | 0.515 | 0.000 | 9.5s | $0.017 |
| Zoo Text-to-CAD | | 0.937 | 0.661 | 0.734 | 0.651 | 0.000 | 7.7s | $0.192 |
| Qwen3 Coder → CadQuery | | 0.930 | 0.647 | 0.660 | 0.604 | 0.000 | 12.7s | $0.035 |
| Claude Sonnet 4.6 → CadQuery | | 0.936 | 0.737 | 0.807 | 0.651 | 0.000 | 16.2s | $0.065 |
| Gemini 2.5 Pro → OpenSCAD | | 0.923 | 0.599 | 0.585 | 0.544 | 0.000 | 26.4s | $0.090 |
| CAD-Coder R1 | | 0.914 | 0.615 | 0.610 | 0.549 | 0.000 | 5.0s | $0.005 |
| Claude Haiku 4.5 → CadQuery | × | 0.908 | 0.547 | 0.486 | 0.399 | 0.000 | 10.2s | $0.017 |
| GPT-5 Mini → OpenSCAD | × | 0.904 | 0.449 | 0.426 | 0.424 | 0.000 | 17.3s | $0.010 |
| Llama 3.3 70B → OpenSCAD | × | 0.883 | 0.411 | 0.444 | 0.345 | 0.000 | 24.0s | $0.021 |
| DeepCAD | × | 0.875 | 0.297 | 0.320 | 0.245 | 0.000 | 4.4s | $0.019 |
| Trellis 3D | × | 0.850 | 0.057 | 0.056 | 0.036 | 0.000 | 11.7s | $0.050 |
| Spline AI | × | 0.850 | 0.047 | 0.046 | 0.018 | 0.000 | 7.0s | $0.043 |
| Hunyuan3D-2 | × | 0.850 | 0.048 | 0.046 | 0.037 | 0.000 | 25.2s | $0.076 |

Claude Opus 4.7 → OpenSCAD · kernel error: BRepCheck_NotClosed · × · 0.000 · 0.000 · 25.8s · $0.316