CAD-Bench
REVENG-009 · Reverse Engineering · difficulty 5/5

Three-view ortho → housing with cores

sha256:60189cae3b771acc

§1 Prompt verbatim

Reproduce the 80 × 60 × 40 mm housing from the supplied multi-view drawing including all M4 tapped holes, draft, and ribs. Drawing follows ASME Y14.5-2018 third-angle convention.

§2 Ground-truth spec

shells: 1
watertight: true
manifold: true
acceptance ε: ±0.1 mm
features: thread_M4_x6, rib_x4, draft_1deg
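
The acceptance band can be read as a per-dimension check. The sketch below is a minimal illustration, assuming ε = ±0.1 mm applies independently to each named dimension (the page does not say how ε is applied). The function `within_tolerance` and the names `length`, `width`, `height` for the 80 × 60 × 40 mm envelope are hypothetical.

```python
# Hypothetical illustration: nominal envelope taken from the prompt,
# tolerance taken from the ground-truth spec. Names are assumptions.
NOMINAL_MM = {"length": 80.0, "width": 60.0, "height": 40.0}
EPSILON_MM = 0.1


def within_tolerance(measured_mm, nominal_mm=NOMINAL_MM, eps=EPSILON_MM):
    """Return the names of dimensions that are missing or deviate by more than eps.

    An empty list means every nominal dimension was reproduced within the band.
    """
    failures = []
    for name, nominal in nominal_mm.items():
        if name not in measured_mm or abs(measured_mm[name] - nominal) > eps:
            failures.append(name)
    return failures


if __name__ == "__main__":
    candidate = {"length": 80.05, "width": 60.0, "height": 39.7}
    print(within_tolerance(candidate))
```

An empty return value is treated as a pass here; a real scorer would presumably also check the listed features (threads, ribs, draft), which this sketch does not attempt.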

§3 Reference render


Visualisation is rebuilt in-browser from the canonical parametric description. Scoring is performed against the held-out reference STEP file (sha-256 fingerprint above).
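
A fingerprint of this kind can be checked locally with the standard library. The sketch below assumes the 16-character value shown on this page is the leading part of the full SHA-256 hex digest of the STEP file; the page does not confirm that, and the helper names are hypothetical.

```python
import hashlib

# Truncated fingerprint as displayed on this page (assumed to be a digest prefix).
EXPECTED_PREFIX = "60189cae3b771acc"


def sha256_hex(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the full hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_fingerprint(path, expected_prefix=EXPECTED_PREFIX):
    """True if the file's digest starts with the expected (possibly truncated) prefix."""
    return sha256_hex(path).startswith(expected_prefix.lower())
```

Reading in chunks keeps memory use flat for large STEP files. Since the reference file is held out, this is only useful to someone who already has access to it.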

§4 Per-agent renders

reference + 10 agent outputs · scored against the held-out STEP
vol IoU · BREP · manifold
REFERENCE · canonical · ground truth · 1.000 · 100
Human Baseline (Mech-E) · n=4 senior engineers · 0.737 · 8
Trellis 3D · Microsoft Research · 0.623 · 0
Claude Opus 4.7 → CadQuery · Anthropic + CadQuery 2.4 · 0.579 · 13
Hunyuan3D-2 · Tencent · 0.470 · 11
GPT-5 → CadQuery · OpenAI + CadQuery 2.4 · 0.441 · 10
OpenAI o4 (reasoning) → CadQuery · OpenAI + CadQuery 2.4 · 0.424 · 10
Gemini 2.5 Pro → OpenSCAD · Google + OpenSCAD 2024.06 · 0.421 · 0
Claude Opus 4.7 → OpenSCAD · Anthropic + OpenSCAD 2024.06 · 0.400 · 0
Zoo Text-to-CAD · Zoo (KittyCAD) · 0.377 · 16
DeepSeek R1 (reasoning) → CadQuery · DeepSeek + CadQuery 2.4 · 0.359 · 14
CAD-Coder R1 · CAD-Coder Labs (research) · 0.339 · 18
Qwen3 Coder → CadQuery · Alibaba + CadQuery 2.4 · 0.296 · 16
GPT-5 Mini → OpenSCAD · OpenAI + OpenSCAD 2024.06 · 0.274 · 21
Adam (CADcrush) · CADcrush · 0.272 · 20
Spline AI · Spline.design · 0.191 · 0
Claude Haiku 4.5 → CadQuery · Anthropic + CadQuery 2.4 · 0.186 · 29
DeepCAD · Wu et al. 2021 (research) · 0.154 · 31
Claude Sonnet 4.6 → CadQuery · Anthropic + CadQuery 2.4 · no manifold solid produced · 66
Gemini 2.5 Flash → CadQuery · Google + CadQuery 2.4 · no manifold solid produced · 59
Llama 3.3 70B → OpenSCAD · Meta + OpenSCAD 2024.06 · no manifold solid produced · 14

Each tile is rebuilt from the canonical parametric description and degraded to match the agent's scored profile (tessellation, non-manifold face removal, dimension scale jitter, missing features). Image-only diffusion models render visually plausible meshes but score in the single digits on BREP fidelity — the geometry is not a manifold solid even when the render reads clean.
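
Why a clean-looking render can still fail can be illustrated on a triangle mesh. The sketch below uses a common necessary condition for a closed manifold surface: every edge is shared by exactly two triangles. It is not the benchmark's check (the kernel errors reported on this page come from a B-rep kernel, not a mesh test), and it ignores vertex-manifoldness and face orientation. Function names are hypothetical.

```python
from collections import Counter


def edge_face_counts(triangles):
    """Count how many triangles use each undirected edge.

    `triangles` is an iterable of (i, j, k) vertex-index triples.
    """
    counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1
    return counts


def is_closed_edge_manifold(triangles):
    """True if the mesh is non-empty and every edge borders exactly two triangles."""
    counts = edge_face_counts(triangles)
    return bool(counts) and all(n == 2 for n in counts.values())


if __name__ == "__main__":
    tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
    print(is_closed_edge_manifold(tetrahedron))       # closed surface
    print(is_closed_edge_manifold(tetrahedron[:-1]))  # one face missing: open boundary
```

A mesh with a single missing face renders almost identically from most viewpoints, yet its boundary edges are used only once, so it encloses no volume.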

§5 Per-agent metrics

ranked by Vol IoU · same data as the leaderboard, restricted to this task
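| Agent | Vol IoU | Watert. | Manif. | Named-Dimension RMSE | FeatRec | P@1 | p50 latency | cost |
|---|---|---|---|---|---|---|---|---|
| Human Baseline (Mech-E) | 0.737 | | 0.957 | 0.068 | 0.838 | 0.000 | 905.3s | $6.088 |
| Trellis 3D | 0.623 | | 0.951 | 0.463 | 0.208 | 0.000 | 9.5s | $0.059 |
| Claude Opus 4.7 → CadQuery | 0.579 | | 0.937 | 0.184 | 0.710 | 0.000 | 44.2s | $0.316 |
| Hunyuan3D-2 | 0.470 | | 0.923 | 0.477 | 0.195 | 0.000 | 44.8s | $0.072 |
| GPT-5 → CadQuery | 0.441 | | 0.922 | 0.287 | 0.632 | 0.000 | 41.1s | $0.199 |
| OpenAI o4 (reasoning) → CadQuery | 0.424 | | 0.919 | 0.173 | 0.720 | 0.000 | 89.6s | $1.031 |
| Gemini 2.5 Pro → OpenSCAD | 0.421 | | 0.915 | 0.231 | 0.539 | 0.000 | 21.8s | $0.095 |
| Claude Opus 4.7 → OpenSCAD | 0.400 | × | 0.908 | 0.226 | 0.527 | 0.000 | 24.1s | $0.273 |
| Zoo Text-to-CAD | 0.377 | × | 0.910 | 0.182 | 0.769 | 0.000 | 6.3s | $0.209 |
| DeepSeek R1 (reasoning) → CadQuery | 0.359 | × | 0.901 | 0.259 | 0.683 | 0.000 | 105.9s | $0.033 |
| CAD-Coder R1 | 0.339 | × | 0.902 | 0.318 | 0.636 | 0.000 | 7.2s | $0.005 |
| Qwen3 Coder → CadQuery | 0.296 | × | 0.898 | 0.247 | 0.618 | 0.000 | 17.4s | $0.032 |
| GPT-5 Mini → OpenSCAD | 0.274 | × | 0.892 | 0.330 | 0.416 | 0.000 | 12.9s | $0.011 |
| Adam (CADcrush) | 0.272 | × | 0.888 | 0.226 | 0.704 | 0.000 | 8.0s | $0.263 |
| Spline AI | 0.191 | × | 0.881 | 0.544 | 0.089 | 0.000 | 6.2s | $0.033 |
| Claude Haiku 4.5 → CadQuery | 0.186 | × | 0.877 | 0.380 | 0.542 | 0.000 | 6.8s | $0.020 |
| DeepCAD | 0.154 | × | 0.873 | 0.369 | 0.500 | 0.000 | 5.1s | $0.016 |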
Claude Sonnet 4.6 → CadQuery · kernel error: BRepCheck_NotClosed · 0.000 · × · 0.000 · 0.000 · 16.8s · $0.063
Gemini 2.5 Flash → CadQuery · kernel error: BRepCheck_NotClosed · 0.000 · × · 0.000 · 0.000 · 9.2s · $0.018
Llama 3.3 70B → OpenSCAD · kernel error: BRepCheck_NotClosed · 0.000 · × · 0.000 · 0.000 · 18.8s · $0.022
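
For orientation, the two headline geometry metrics can be sketched in a few lines. This is an illustrative reading only: it assumes Vol IoU is intersection over union of occupied voxels and that Named-Dimension RMSE is a root-mean-square error over dimensions matched by name. The page does not state the voxel resolution, alignment, or any normalisation used by the actual scorer, and all function names are hypothetical.

```python
import math


def box_voxels(nx, ny, nz):
    """Occupied voxel indices of an axis-aligned box anchored at the origin."""
    return {(x, y, z) for x in range(nx) for y in range(ny) for z in range(nz)}


def volumetric_iou(voxels_a, voxels_b):
    """Intersection over union of two sets of occupied voxel indices."""
    a, b = set(voxels_a), set(voxels_b)
    union = a | b
    if not union:
        return 0.0  # two empty solids: defined as 0 here by convention
    return len(a & b) / len(union)


def named_dimension_rmse(reference, candidate):
    """RMSE over the dimensions named in `reference`.

    Raises KeyError if the candidate lacks a named dimension.
    """
    errors = [candidate[name] - value for name, value in reference.items()]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


if __name__ == "__main__":
    reference = box_voxels(4, 4, 4)
    half_height = box_voxels(4, 4, 2)
    print(volumetric_iou(reference, half_height))
    print(named_dimension_rmse({"a": 80.0, "b": 60.0}, {"a": 83.0, "b": 64.0}))
```

Under this reading, a candidate can score a moderate Vol IoU while missing small features entirely, because threads and ribs contribute little volume.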