CAD-Bench
FUNC-001 · Functional Intent · FEA-Gated · difficulty 5/5

Cantilever bracket — 250 N tip load, 6061-T6, SF≥4

sha256:33a91efbac0c1110

§1 Prompt verbatim

Design a cantilever bracket that bolts to a wall via two M6 holes 50 mm apart and supports a 250 N transverse tip load 80 mm from the wall, in 6061-T6 aluminium, with a static safety factor ≥ 4 against yield (σ_y = 276 MPa). Mass should be ≤ 60 g. Output STEP.
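
As a rough sanity check on the prompt's numbers, the following sketch works through a hand calculation. It assumes simple Euler-Bernoulli bending of a beam with the peak stress at the wall, which is a simplification and not the benchmark's FEA gate. The density of 2.70 g/cm³ is an assumed nominal value for 6061-T6 (the prompt does not state one), and `rect_height_for_modulus` is a hypothetical helper for a solid rectangular section.

```python
# Hand calculation from the prompt's figures (beam-theory sketch, not the benchmark's FEA).
F_N = 250.0             # transverse tip load, N (from the prompt)
ARM_MM = 80.0           # load offset from the wall, mm (from the prompt)
SIGMA_Y_MPA = 276.0     # yield strength, MPa (from the prompt)
SAFETY_FACTOR = 4.0     # required static safety factor (from the prompt)
MASS_LIMIT_G = 60.0     # mass budget, g (from the prompt)
RHO_G_PER_CM3 = 2.70    # ASSUMPTION: nominal density of 6061-T6, not given in the prompt

# Allowable stress: 276 / 4 = 69 MPa (1 MPa = 1 N/mm^2).
sigma_allow_mpa = SIGMA_Y_MPA / SAFETY_FACTOR

# Bending moment at the wall: 250 N * 80 mm = 20 000 N*mm.
moment_nmm = F_N * ARM_MM

# Minimum section modulus at the wall: Z >= M / sigma_allow, in mm^3.
z_min_mm3 = moment_nmm / sigma_allow_mpa

# Volume budget implied by the mass limit, converted from cm^3 to mm^3.
vol_max_mm3 = MASS_LIMIT_G / RHO_G_PER_CM3 * 1000.0


def rect_height_for_modulus(z_mm3: float, width_mm: float) -> float:
    """Height of a solid rectangle with section modulus Z = b * h^2 / 6."""
    return (6.0 * z_mm3 / width_mm) ** 0.5


if __name__ == "__main__":
    print(f"allowable stress     : {sigma_allow_mpa:.1f} MPa")
    print(f"wall bending moment  : {moment_nmm:.0f} N*mm")
    print(f"min section modulus  : {z_min_mm3:.1f} mm^3")
    print(f"max material volume  : {vol_max_mm3:.0f} mm^3")
    print(f"height at 10 mm width: {rect_height_for_modulus(z_min_mm3, 10.0):.2f} mm")
```

Under these assumptions, a solid section 10 mm wide would need a height of roughly 13.2 mm at the wall. Stress concentrations around the M6 holes and the load pad are ignored, so a real design would need margin beyond this estimate.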

§2 Ground-truth spec

shells: 1
watertight: true
manifold: true
acceptance ε: ±0.1 mm
features: M6_clearance_x2, load_pad
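
To make the spec fields concrete, here is a minimal sketch of how the shell count, the closed/manifold condition, and the ±0.1 mm tolerance could be checked on a triangle mesh. This is an illustration only: the benchmark evidently checks the BREP through a CAD kernel, and every function name below is hypothetical. The test used is the standard one for triangle meshes, where a closed, edge-manifold surface has every edge shared by exactly two faces.

```python
from collections import defaultdict


def edge_use_counts(faces):
    """Count how many triangles use each undirected edge."""
    counts = defaultdict(int)
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1
    return counts


def is_closed_edge_manifold(faces):
    """True if every edge is shared by exactly two triangles."""
    counts = edge_use_counts(faces)
    return bool(counts) and all(n == 2 for n in counts.values())


def count_shells(faces):
    """Number of connected components, linking faces through shared vertices."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, c in faces:
        root = find(a)
        parent[find(b)] = root
        parent[find(c)] = root
    return len({find(v) for v in parent})


def within_tolerance(measured_mm, nominal_mm, eps_mm=0.1):
    """Acceptance check for one dimension, e.g. the 50 mm hole spacing."""
    return abs(measured_mm - nominal_mm) <= eps_mm


# A tetrahedron is the smallest closed triangle mesh.
TETRA = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]

if __name__ == "__main__":
    print(is_closed_edge_manifold(TETRA))       # closed surface
    print(is_closed_edge_manifold(TETRA[:-1]))  # one face removed leaves open edges
    print(count_shells(TETRA))
    print(within_tolerance(50.08, 50.0))
```

The edge-count test does not catch every defect (self-intersections and inconsistent face orientation pass it), which is one reason a kernel-level check is the stronger gate.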

§3 Reference render

Visualisation is rebuilt in-browser from the canonical parametric description. Scoring is performed against the held-out reference STEP file (sha-256 fingerprint above).
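
For readers who want to confirm they hold the same reference file, a fingerprint comparison could look like the sketch below. The fingerprint shown on this page has 16 hex characters, so the sketch assumes it is the leading portion of the full SHA-256 digest; that is an assumption, not something the page states. The function name `matches_fingerprint` is hypothetical.

```python
import hashlib


def matches_fingerprint(data: bytes, fingerprint: str) -> bool:
    """Compare file bytes against a (possibly truncated) 'sha256:<hex>' fingerprint.

    ASSUMPTION: a short fingerprint is a prefix of the full hex digest.
    """
    expected = fingerprint.split(":", 1)[-1].strip().lower()
    digest = hashlib.sha256(data).hexdigest()
    # An empty fingerprint would trivially match, so reject it explicitly.
    return bool(expected) and digest.startswith(expected)


if __name__ == "__main__":
    # Hypothetical usage with a local copy of the reference file:
    # with open("reference.step", "rb") as fh:
    #     print(matches_fingerprint(fh.read(), "sha256:33a91efbac0c1110"))
    print(matches_fingerprint(b"", "sha256:e3b0c44298fc1c14"))
```

A 16-character prefix carries 64 bits, which is enough to catch accidental mismatches but is weaker than comparing the full digest.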

§4 Per-agent renders

reference + 10 agent outputs · scored against the held-out STEP
vol IoU · BREP · manifold
REFERENCE · canonical · ground truth · 1.000 · 100
Human Baseline (Mech-E) · n=4 senior engineers · 0.652 · 11
OpenAI o4 (reasoning) → CadQuery · OpenAI + CadQuery 2.4 · 0.574 · 8
Zoo Text-to-CAD · Zoo (KittyCAD) · 0.467 · 12
DeepSeek R1 (reasoning) → CadQuery · DeepSeek + CadQuery 2.4 · 0.437 · 10
Claude Sonnet 4.6 → CadQuery · Anthropic + CadQuery 2.4 · 0.404 · 12
GPT-5 → CadQuery · OpenAI + CadQuery 2.4 · 0.402 · 16
Gemini 2.5 Flash → CadQuery · Google + CadQuery 2.4 · 0.347 · 18
Claude Opus 4.7 → CadQuery · Anthropic + CadQuery 2.4 · 0.339 · 17
GPT-5 Mini → OpenSCAD · OpenAI + OpenSCAD 2024.06 · 0.201 · 26
Adam (CADcrush) · CADcrush · 0.188 · 25
Qwen3 Coder → CadQuery · Alibaba + CadQuery 2.4 · 0.188 · 27
CAD-Coder R1 · CAD-Coder Labs (research) · 0.187 · 28
Llama 3.3 70B → OpenSCAD · Meta + OpenSCAD 2024.06 · 0.075 · 63
Claude Haiku 4.5 → CadQuery · Anthropic + CadQuery 2.4 · 0.057 · 81
Gemini 2.5 Pro → OpenSCAD · Google + OpenSCAD 2024.06 · 18 · no manifold solid produced
Claude Opus 4.7 → OpenSCAD · Anthropic + OpenSCAD 2024.06 · 18 · no manifold solid produced
DeepCAD · Wu et al. 2021 (research) · 0.000 · 103
Trellis 3D · Microsoft Research · 0.000 · 0
Spline AI · Spline.design · 2 · no manifold solid produced
Hunyuan3D-2 · Tencent · 0.000 · 103

Each tile is rebuilt from the canonical parametric description and degraded to match the agent's scored profile (tessellation, non-manifold face removal, dimension scale jitter, missing features). Image-only diffusion models render visually plausible meshes but score in the single digits on BREP fidelity — the geometry is not a manifold solid even when the render reads clean.
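
The vol IoU figure on each tile is a volumetric intersection-over-union against the reference. The page does not say how it is computed, so the sketch below assumes a voxel-based approach: both solids are sampled onto the same grid and the overlap is counted. The helper names `box_voxels` and `volume_iou` are hypothetical.

```python
def box_voxels(x0, y0, z0, x1, y1, z1):
    """Set of unit voxels filling an axis-aligned box with integer bounds."""
    return {
        (x, y, z)
        for x in range(x0, x1)
        for y in range(y0, y1)
        for z in range(z0, z1)
    }


def volume_iou(a, b):
    """Intersection-over-union of two voxel sets; 0.0 when both are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


if __name__ == "__main__":
    reference = box_voxels(0, 0, 0, 8, 4, 2)   # 64 voxels
    candidate = box_voxels(2, 0, 0, 10, 4, 2)  # same box shifted 2 voxels in x
    # Overlap is 6*4*2 = 48 voxels, union is 64 + 64 - 48 = 80, so IoU = 0.6.
    print(volume_iou(reference, candidate))
    print(volume_iou(reference, reference))
```

The example shows why IoU penalises placement errors heavily: a part with exactly the right shape, shifted by a quarter of its length, already drops to 0.6.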

§5 Per-agent metrics

ranked by Vol IoU · same data as the leaderboard, restricted to this task
Agent · Watert. · Manif. · FeatRec · Min-Wall Compliance · FEA-Yield Pass · P@1 · p50 latency · cost
Human Baseline (Mech-E) · 0.948 · 0.912 · 0.828 · 0.729 · 0.000 · 700.5s · $5.421
OpenAI o4 (reasoning) → CadQuery · 0.940 · 0.761 · 0.626 · 0.709 · 0.000 · 98.6s · $1.007
Zoo Text-to-CAD · 0.918 · 0.702 · 0.684 · 0.486 · 0.000 · 4.5s · $0.172
DeepSeek R1 (reasoning) → CadQuery · 0.919 · 0.689 · 0.554 · 0.494 · 0.000 · 112.1s · $0.040
Claude Sonnet 4.6 → CadQuery · × · 0.912 · 0.681 · 0.614 · 0.513 · 0.000 · 17.4s · $0.083
GPT-5 → CadQuery · × · 0.910 · 0.685 · 0.639 · 0.523 · 0.000 · 53.2s · $0.244
Gemini 2.5 Flash → CadQuery · × · 0.899 · 0.617 · 0.538 · 0.361 · 0.000 · 12.6s · $0.023
Claude Opus 4.7 → CadQuery · × · 0.903 · 0.723 · 0.666 · 0.569 · 0.000 · 46.6s · $0.366
GPT-5 Mini → OpenSCAD · × · 0.879 · 0.397 · 0.471 · 0.245 · 0.000 · 13.8s · $0.010
Adam (CADcrush) · × · 0.879 · 0.729 · 0.596 · 0.385 · 0.000 · 10.2s · $0.270
Qwen3 Coder → CadQuery · × · 0.880 · 0.570 · 0.521 · 0.366 · 0.000 · 18.3s · $0.031
CAD-Coder R1 · × · 0.878 · 0.677 · 0.542 · 0.269 · 0.000 · 7.0s · $0.005
Llama 3.3 70B → OpenSCAD · × · 0.862 · 0.427 · 0.423 · 0.243 · 0.000 · 18.6s · $0.022
Claude Haiku 4.5 → CadQuery · × · 0.859 · 0.485 · 0.465 · 0.282 · 0.000 · 6.0s · $0.017
Gemini 2.5 Pro → OpenSCAD (kernel error: BRepCheck_NotClosed) · × · 0.000 · 0.000 · 29.6s · $0.084
Claude Opus 4.7 → OpenSCAD (kernel error: BRepCheck_NotClosed) · × · 0.000 · 0.000 · 34.8s · $0.353
DeepCAD · × · 0.850 · 0.486 · 0.381 · 0.188 · 0.000 · 6.0s · $0.021
Trellis 3D · × · 0.850 · 0.212 · 0.216 · 0.086 · 0.000 · 11.4s · $0.056
Spline AI (kernel error: BRepCheck_NotClosed) · × · 0.000 · 0.000 · 9.1s · $0.046
Hunyuan3D-2 · × · 0.850 · 0.208 · 0.221 · 0.093 · 0.000 · 45.1s · $0.064