CAD-Bench
SURF-007 · Free-form Surfaces · difficulty 5/5

Mouse top-shell (Class-A)

sha256:1a2b03c0fde901ee

§1 Prompt verbatim

Computer-mouse top shell: 110 × 65 mm footprint, 38 mm peak height, two scroll-wheel cutouts 20 × 6 mm symmetric about the centerline 30 mm from the back. Class-A: G2 across the entire shell, max curvature deviation < 1 mm⁻¹.

§2 Ground-truth spec

shells: 1
watertight: true
manifold: true
acceptance ε: ±0.1 mm
features: g2_class_a, scroll_cutout_x2
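
As an illustration of how the spec fields above might be applied, here is a minimal sketch of a pass/fail check. It assumes the candidate's bounding box should match the 110 × 65 × 38 mm envelope from the prompt to within the acceptance ε. The `Candidate` class, its field names, and `check_spec` are hypothetical and are not the benchmark's actual schema or scorer.

```python
from dataclasses import dataclass

# Nominal values come from the prompt; the tolerance is the spec's acceptance ε.
NOMINAL_MM = {"length": 110.0, "width": 65.0, "peak_height": 38.0}
EPSILON_MM = 0.1


@dataclass
class Candidate:
    """Hypothetical summary of a submitted solid (not the benchmark's real schema)."""
    shells: int
    watertight: bool
    manifold: bool
    bbox_mm: dict  # measured bounding-box dimensions, same keys as NOMINAL_MM


def check_spec(c: Candidate) -> list[str]:
    """Return a list of human-readable failures; an empty list means pass."""
    failures = []
    if c.shells != 1:
        failures.append(f"expected 1 shell, got {c.shells}")
    if not c.watertight:
        failures.append("not watertight")
    if not c.manifold:
        failures.append("not manifold")
    for name, nominal in NOMINAL_MM.items():
        measured = c.bbox_mm[name]
        if abs(measured - nominal) > EPSILON_MM:
            failures.append(
                f"{name}: {measured} mm is outside {nominal} ± {EPSILON_MM} mm"
            )
    return failures
```

The feature flags (`g2_class_a`, `scroll_cutout_x2`) are left out of this sketch because verifying them needs a CAD kernel rather than bounding-box arithmetic.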

§3 Reference render

Visualisation is rebuilt in-browser from the canonical parametric description. Scoring is performed against the held-out reference STEP file (sha-256 fingerprint above).
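
A fingerprint like the one shown above could be checked against a local file along these lines. This is a sketch only: it assumes the displayed value is a prefix of the full SHA-256 hex digest (the page shows 16 hex characters), and the helper names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_hex(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 and return the full hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_fingerprint(path: Path, fingerprint: str) -> bool:
    """Compare a file against a 'sha256:<hex>' fingerprint.

    Assumption: a shortened fingerprint is a prefix of the full digest.
    """
    algo, _, expected = fingerprint.partition(":")
    if algo != "sha256" or not expected:
        raise ValueError(f"unsupported fingerprint: {fingerprint!r}")
    return sha256_hex(path).startswith(expected.lower())
```

Since the reference STEP file is held out, a check like this only becomes useful if the file is ever released. A 16-character prefix is also a weaker guarantee than the full 64-character digest.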

§4 Per-agent renders

reference + 10 agent outputs · scored against the held-out STEP
vol IoU · BREP · manifold
| Agent | Provider / toolchain | vol IoU | BREP | Note |
|---|---|---|---|---|
| REFERENCE | canonical · ground truth | 1.000 | 100 | |
| Hunyuan3D-2 | Tencent | 0.744 | 11 | |
| Human Baseline (Mech-E) | n=4 senior engineers | 0.744 | 8 | |
| Claude Opus 4.7 → CadQuery | Anthropic + CadQuery 2.4 | 0.499 | 13 | |
| OpenAI o4 (reasoning) → CadQuery | OpenAI + CadQuery 2.4 | 0.490 | 10 | |
| Trellis 3D | Microsoft Research | 0.486 | 0 | |
| Zoo Text-to-CAD | Zoo (KittyCAD) | 0.443 | 10 | |
| Claude Opus 4.7 → OpenSCAD | Anthropic + OpenSCAD 2024.06 | 0.410 | 0 | |
| GPT-5 → CadQuery | OpenAI + CadQuery 2.4 | 0.387 | 12 | |
| DeepSeek R1 (reasoning) → CadQuery | DeepSeek + CadQuery 2.4 | 0.357 | 17 | |
| Gemini 2.5 Flash → CadQuery | Google + CadQuery 2.4 | 0.347 | 19 | |
| Claude Sonnet 4.6 → CadQuery | Anthropic + CadQuery 2.4 | 0.327 | 17 | |
| Adam (CADcrush) | CADcrush | 0.306 | 15 | |
| Gemini 2.5 Pro → OpenSCAD | Google + OpenSCAD 2024.06 | 0.289 | 0 | |
| Llama 3.3 70B → OpenSCAD | Meta + OpenSCAD 2024.06 | 0.268 | 18 | |
| Qwen3 Coder → CadQuery | Alibaba + CadQuery 2.4 | 0.249 | 22 | |
| Claude Haiku 4.5 → CadQuery | Anthropic + CadQuery 2.4 | 0.180 | 28 | |
| GPT-5 Mini → OpenSCAD | OpenAI + OpenSCAD 2024.06 | 0.161 | 35 | |
| CAD-Coder R1 | CAD-Coder Labs (research) | 0.136 | 32 | |
| DeepCAD | Wu et al. 2021 (research) | 0.046 | 99 | |
| Spline AI | Spline.design | | 2 | no manifold solid produced |

Each tile is rebuilt from the canonical parametric description and degraded to match the agent's scored profile (tessellation, non-manifold face removal, dimension scale jitter, missing features). Image-only diffusion models render visually plausible meshes but score in the single digits on BREP fidelity: the geometry is not a manifold solid even when the render looks clean.
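
To make concrete why a mesh can look clean and still fail as a solid, the sketch below tests a triangle mesh for the basic closed, edge-manifold property: every edge is shared by exactly two triangles. It is a simplified stand-in, not the kernel-level check the benchmark runs. The function names are illustrative, and triangles are assumed to be given as triples of vertex indices.

```python
from collections import Counter


def edge_counts(triangles):
    """Count how many triangles use each undirected edge."""
    counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1
    return counts


def is_closed_edge_manifold(triangles) -> bool:
    """True when every edge is shared by exactly two triangles.

    An edge used once is an open boundary (the mesh is not watertight);
    an edge used three or more times is non-manifold. Vertex-manifoldness
    and consistent face orientation are not checked here.
    """
    counts = edge_counts(triangles)
    return bool(counts) and all(n == 2 for n in counts.values())
```

A tetrahedron passes this test. Removing one of its faces leaves three boundary edges and fails it, even though a shaded render from most viewpoints would look the same.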

§5 Per-agent metrics

ranked by Vol IoU · same data as the leaderboard, restricted to this task
| Agent | Bidirectional Chamfer | Hausdorff p95 | NormCons | Watert. | Manif. | P@1 | p50 latency | cost |
|---|---|---|---|---|---|---|---|---|
| Hunyuan3D-2 | 0.133 | 0.574 | 0.903 | | 0.957 | 0.000 | 28.0s | $0.083 |
| Human Baseline (Mech-E) | 0.145 | 0.559 | 0.919 | | 0.966 | 1.000 | 753.3s | $6.621 |
| Claude Opus 4.7 → CadQuery | 0.184 | 0.754 | 0.823 | | 0.925 | 0.000 | 48.7s | $0.289 |
| OpenAI o4 (reasoning) → CadQuery | 0.151 | 0.839 | 0.819 | | 0.928 | 0.000 | 128.1s | $1.062 |
| Trellis 3D | 0.188 | 0.857 | 0.807 | | 0.924 | 0.000 | 8.7s | $0.056 |
| Zoo Text-to-CAD | 0.214 | 0.999 | 0.799 | | 0.918 | 0.000 | 4.7s | $0.179 |
| Claude Opus 4.7 → OpenSCAD | 0.198 | 0.961 | 0.779 | × | 0.909 | 0.000 | 40.5s | $0.249 |
| GPT-5 → CadQuery | 0.231 | 1.128 | 0.767 | × | 0.910 | 0.000 | 35.9s | $0.195 |
| DeepSeek R1 (reasoning) → CadQuery | 0.247 | 1.141 | 0.748 | × | 0.902 | 0.000 | 106.8s | $0.039 |
| Gemini 2.5 Flash → CadQuery | 0.263 | 1.251 | 0.752 | × | 0.901 | 0.000 | 13.0s | $0.017 |
| Claude Sonnet 4.6 → CadQuery | 0.242 | 1.161 | 0.735 | × | 0.898 | 0.000 | 20.9s | $0.061 |
| Adam (CADcrush) | 0.236 | 1.207 | 0.753 | × | 0.900 | 0.000 | 8.2s | $0.266 |
| Gemini 2.5 Pro → OpenSCAD | 0.238 | 1.200 | 0.746 | × | 0.897 | 0.000 | 23.2s | $0.101 |
| Llama 3.3 70B → OpenSCAD | 0.325 | 1.457 | 0.710 | × | 0.889 | 0.000 | 15.8s | $0.018 |
| Qwen3 Coder → CadQuery | 0.321 | 1.395 | 0.729 | × | 0.891 | 0.000 | 14.3s | $0.031 |
| Claude Haiku 4.5 → CadQuery | 0.458 | 2.266 | 0.691 | × | 0.877 | 0.000 | 7.4s | $0.020 |
| GPT-5 Mini → OpenSCAD | 0.511 | 2.631 | 0.676 | × | 0.873 | 0.000 | 14.4s | $0.011 |
| CAD-Coder R1 | 0.528 | 2.548 | 0.674 | × | 0.872 | 0.000 | 7.4s | $0.005 |
| DeepCAD | 1.547 | 7.812 | 0.623 | × | 0.857 | 0.000 | 5.8s | $0.019 |
| Spline AI | kernel error: BRepCheck_NotClosed | | | × | 0.000 | 0.000 | 9.1s | $0.033 |
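
The page does not define the distance metrics, so the following is a hedged sketch of one common reading rather than the benchmark's scorer. It assumes both shapes are sampled as 3D point lists. Chamfer is taken as the sum of the two directed mean nearest-neighbour distances (unsquared). Hausdorff p95 is taken as the nearest-rank 95th percentile of the pooled nearest-neighbour distances. Vol IoU is computed on sets of occupied voxel indices. All function names are illustrative.

```python
import math


def _nearest_distances(src, dst):
    """For each point in src, the Euclidean distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def bidirectional_chamfer(a, b):
    """Mean nearest-neighbour distance from a to b, plus the same from b to a."""
    d_ab = _nearest_distances(a, b)
    d_ba = _nearest_distances(b, a)
    return sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba)


def hausdorff_p95(a, b):
    """95th percentile (nearest-rank) of the pooled nearest-neighbour distances."""
    d = sorted(_nearest_distances(a, b) + _nearest_distances(b, a))
    rank = max(1, math.ceil(0.95 * len(d)))
    return d[rank - 1]


def volume_iou(voxels_a, voxels_b):
    """Intersection over union of two sets of occupied voxel indices."""
    a, b = set(voxels_a), set(voxels_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

The nearest-neighbour search here is brute force, O(n·m), which is fine for a few thousand samples. A spatial index such as a k-d tree would be the usual choice for dense samplings.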