CAD-Bench
MECH-031 · Parametric Mechanical Parts · difficulty 5/5

Threaded cap with diamond knurl

sha256:7caf01eb22ddc041

§1 Prompt verbatim

Cylindrical cap Ø 35 × 18 mm tall, internal M30 × 1.5 thread depth 14 mm, exterior diamond knurl pitch 0.8 mm, knurl height 0.3 mm, covering the central 12 mm of the height. Top face flat, bottom face open. ISO 261 thread tolerance 6H.
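The prompt implies several dimensions it does not state outright. The sketch below derives them from the stated values using the ISO 68-1 basic thread profile relationships. The function and key names are hypothetical, and three items are assumptions rather than part of the task: that the knurl pitch is measured along the circumference at the nominal Ø 35, that "central" means centred on mid-height, and that the bore ends at the thread depth.

```python
import math

# Values taken directly from the prompt (all in mm).
OUTER_D = 35.0
HEIGHT = 18.0
THREAD_D = 30.0        # M30 nominal (major) diameter
THREAD_PITCH = 1.5
THREAD_DEPTH = 14.0
KNURL_PITCH = 0.8
KNURL_BAND = 12.0


def derived_dimensions():
    """Return dimensions implied by the prompt (hypothetical helper)."""
    # ISO 68-1 basic profile: H is the height of the fundamental triangle.
    h = math.sqrt(3) / 2 * THREAD_PITCH
    minor_d = THREAD_D - 2 * (5 / 8) * h   # D1 = D - 1.0825 * P
    pitch_d = THREAD_D - 2 * (3 / 8) * h   # D2 = D - 0.6495 * P
    band_start = (HEIGHT - KNURL_BAND) / 2  # assumption: band is centred
    return {
        "minor_d": minor_d,
        "pitch_d": pitch_d,
        "wall_at_major_d": (OUTER_D - THREAD_D) / 2,
        # Assumption: the bore stops exactly at the thread depth.
        "top_thickness": HEIGHT - THREAD_DEPTH,
        "knurl_z_range": (band_start, band_start + KNURL_BAND),
        # Assumption: pitch is measured around the nominal circumference.
        "knurl_count": round(math.pi * OUTER_D / KNURL_PITCH),
    }
```

Under these assumptions the wall outside the thread major diameter is only 2.5 mm, so the 0.3 mm knurl height is a meaningful fraction of the remaining material.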

§2 Ground-truth spec

shells: 1
watertight: true
manifold: true
acceptance ε: ±0.05 mm
features: thread_M30x1.5, knurl_diamond_0.8
Knurls are the canonical 'looks easy, isn't' surface — most LLMs emit a flat texture map, not actual geometric ridges.
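The watertight, manifold, and ε fields can be illustrated on a triangle mesh. This is a minimal sketch of one common edge-counting test, not the benchmark's actual checker, which this page does not describe. All function names are hypothetical, and applying ε as a symmetric absolute bound on a single dimension is an assumption.

```python
from collections import Counter


def edge_counts(triangles):
    """Count how many triangles use each undirected edge."""
    counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1
    return counts


def is_watertight(triangles):
    # Closed surface: every edge is shared by exactly two triangles.
    counts = edge_counts(triangles)
    return bool(counts) and all(n == 2 for n in counts.values())


def is_edge_manifold(triangles):
    # No edge is shared by more than two triangles (open boundaries allowed).
    return all(n <= 2 for n in edge_counts(triangles).values())


def within_tolerance(measured, nominal, eps=0.05):
    # Assumption: epsilon is a symmetric absolute bound in mm.
    return abs(measured - nominal) <= eps


# A tetrahedron is the smallest closed mesh; dropping a face opens it.
TETRAHEDRON = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
```

A flat texture map passes neither test in any useful sense: it adds no faces, so the knurl ridges are simply absent from the solid being measured.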

§3 Reference render

canonical reference

Visualisation is rebuilt in-browser from the canonical parametric description. Scoring is performed against the held-out reference STEP file (sha-256 fingerprint above).
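A fingerprint of this kind could be checked as sketched below. The page shows only 16 hex digits, so the sketch assumes the fingerprint is the leading 16 hex digits of the full SHA-256 digest of the file bytes; the page does not confirm this, and both function names are hypothetical.

```python
import hashlib


def short_fingerprint(data: bytes, hex_chars: int = 16) -> str:
    """Return 'sha256:' plus the leading hex digits of the digest."""
    # Assumption: the displayed fingerprint is a truncated SHA-256 digest.
    return "sha256:" + hashlib.sha256(data).hexdigest()[:hex_chars]


def matches_fingerprint(data: bytes, expected: str) -> bool:
    """Compare data against a fingerprint such as 'sha256:7caf01eb22ddc041'."""
    prefix = expected.split(":", 1)[1]
    return short_fingerprint(data, len(prefix)) == expected
```

In use, `data` would be the raw bytes of a STEP file read in binary mode.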

§4 Per-agent renders

reference + 20 agent outputs · scored against the held-out STEP
vol IoU · BREP · manifold
REFERENCE · canonical · ground truth · 1.000 · 100
Human Baseline (Mech-E) · n=4 senior engineers · 0.740 · 7
Claude Opus 4.7 → CadQuery · Anthropic + CadQuery 2.4 · 0.646 · 8
CAD-Coder R1 · CAD-Coder Labs (research) · 0.646 · 9
DeepSeek R1 (reasoning) → CadQuery · DeepSeek + CadQuery 2.4 · 0.644 · 9
Adam (CADcrush) · CADcrush · 0.629 · 8
Zoo Text-to-CAD · Zoo (KittyCAD) · 0.611 · 12
OpenAI o4 (reasoning) → CadQuery · OpenAI + CadQuery 2.4 · 0.603 · 9
Claude Opus 4.7 → OpenSCAD · Anthropic + OpenSCAD 2024.06 · 0.555 · 0
Claude Sonnet 4.6 → CadQuery · Anthropic + CadQuery 2.4 · 0.516 · 13
Claude Haiku 4.5 → CadQuery · Anthropic + CadQuery 2.4 · 0.448 · 12
Qwen3 Coder → CadQuery · Alibaba + CadQuery 2.4 · 0.426 · 13
Gemini 2.5 Flash → CadQuery · Google + CadQuery 2.4 · 0.415 · 12
Gemini 2.5 Pro → OpenSCAD · Google + OpenSCAD 2024.06 · 0.394 · 0
GPT-5 → CadQuery · OpenAI + CadQuery 2.4 · 0.384 · 15
DeepCAD · Wu et al. 2021 (research) · 0.340 · 14
Llama 3.3 70B → OpenSCAD · Meta + OpenSCAD 2024.06 · 0.301 · 18
GPT-5 Mini → OpenSCAD · OpenAI + OpenSCAD 2024.06 · 0.268 · 22
Hunyuan3D-2 · Tencent · 0.095 · 54
Trellis 3D · Microsoft Research · 0.000 · 0
Spline AI · Spline.design · 0.000 · 0

Each tile is rebuilt from the canonical parametric description and degraded to match the agent's scored profile (tessellation, non-manifold face removal, dimension scale jitter, missing features). Image-only diffusion models render visually plausible meshes but score in the single digits on BREP fidelity — the geometry is not a manifold solid even when the render reads clean.
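The tiles are ranked by volumetric IoU, which this page does not define in detail. One common way to estimate it is to sample both solids on a shared grid and divide the intersection count by the union count. The sketch below does that for two cylinders standing in for the cap's outer envelope. The helper names, the grid resolution, and the 5% undersized candidate are all illustrative assumptions, not the benchmark's scoring code.

```python
def cylinder(radius, height):
    """Return a point-membership test for a cylinder on the z axis."""
    def inside(x, y, z):
        return x * x + y * y <= radius * radius and 0.0 <= z <= height
    return inside


def volume_iou(inside_a, inside_b, bounds, n=60):
    """Estimate IoU by testing the centres of an n*n*n grid of cells."""
    (x0, x1), (y0, y1), (z0, z1) = bounds
    inter = union = 0
    for i in range(n):
        x = x0 + (i + 0.5) * (x1 - x0) / n
        for j in range(n):
            y = y0 + (j + 0.5) * (y1 - y0) / n
            for k in range(n):
                z = z0 + (k + 0.5) * (z1 - z0) / n
                a, b = inside_a(x, y, z), inside_b(x, y, z)
                if a and b:
                    inter += 1
                if a or b:
                    union += 1
    return inter / union if union else 0.0


# Bounding box (mm) that encloses the Ø 35 × 18 envelope from the prompt.
BOUNDS = ((-18.0, 18.0), (-18.0, 18.0), (0.0, 18.0))
reference = cylinder(17.5, 18.0)
undersized = cylinder(17.5 * 0.95, 18.0)   # hypothetical 5% scale error
```

For nested cylinders of equal height the exact IoU is the squared radius ratio, 0.95² = 0.9025, so `volume_iou(reference, undersized, BOUNDS)` should land close to that value. A modest dimension scale error alone therefore costs roughly ten points of IoU before any missing features are counted.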

§5 Per-agent metrics

ranked by Vol IoU · same data as the leaderboard, restricted to this task
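| Agent | Watertight | Manifold | Named-Dimension RMSE | GD&T Compliance | FeatRec | P@1 | p50 latency | cost |
|---|---|---|---|---|---|---|---|---|
| Human Baseline (Mech-E) | | 0.968 | 0.138 | 0.827 | 0.945 | 0.000 | 790.9s | $6.933 |
| Claude Opus 4.7 → CadQuery | | 0.956 | 0.200 | 0.616 | 0.698 | 0.000 | 49.3s | $0.334 |
| CAD-Coder R1 | | 0.949 | 0.268 | 0.468 | 0.706 | 0.000 | 6.4s | $0.005 |
| DeepSeek R1 (reasoning) → CadQuery | | 0.947 | 0.259 | 0.506 | 0.624 | 0.000 | 87.9s | $0.038 |
| Adam (CADcrush) | | 0.942 | 0.222 | 0.677 | 0.653 | 0.000 | 10.0s | $0.319 |
| Zoo Text-to-CAD | | 0.942 | 0.231 | 0.692 | 0.804 | 0.000 | 8.0s | $0.154 |
| OpenAI o4 (reasoning) → CadQuery | | 0.948 | 0.195 | 0.655 | 0.725 | 0.000 | 103.1s | $1.049 |
| Claude Opus 4.7 → OpenSCAD | | 0.935 | 0.223 | 0.434 | 0.600 | 0.000 | 33.7s | $0.289 |
| Claude Sonnet 4.6 → CadQuery | | 0.928 | 0.205 | 0.620 | 0.692 | 0.000 | 19.3s | $0.066 |
| Claude Haiku 4.5 → CadQuery | | 0.915 | 0.361 | 0.354 | 0.535 | 0.000 | 7.9s | $0.019 |
| Qwen3 Coder → CadQuery | | 0.915 | 0.334 | 0.393 | 0.562 | 0.000 | 16.8s | $0.035 |
| Gemini 2.5 Flash → CadQuery | × | 0.913 | 0.238 | 0.444 | 0.610 | 0.000 | 9.5s | $0.019 |
| Gemini 2.5 Pro → OpenSCAD | × | 0.909 | 0.304 | 0.427 | 0.508 | 0.000 | 31.3s | $0.107 |
| GPT-5 → CadQuery | × | 0.912 | 0.220 | 0.540 | 0.673 | 0.000 | 28.7s | $0.185 |
| DeepCAD | × | 0.901 | 0.372 | 0.384 | 0.477 | 0.000 | 5.3s | $0.021 |
| Llama 3.3 70B → OpenSCAD | × | 0.893 | 0.396 | 0.268 | 0.449 | 0.000 | 27.7s | $0.017 |
| GPT-5 Mini → OpenSCAD | × | 0.889 | 0.307 | 0.267 | 0.409 | 0.000 | 10.3s | $0.009 |
| Hunyuan3D-2 | × | 0.864 | 0.541 | 0.055 | 0.213 | 0.000 | 29.7s | $0.075 |
| Trellis 3D | × | 0.850 | 0.473 | 0.055 | 0.211 | 0.000 | 10.2s | $0.059 |
| Spline AI | × | 0.850 | 0.514 | 0.027 | 0.091 | 0.000 | 7.8s | $0.034 |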
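The Named-Dimension RMSE column is not defined on this page, including its units and any normalisation. As a rough illustration of the idea, the sketch below computes a plain root-mean-square error over dimensions matched by name. The function name, the dimension keys, and the candidate values are hypothetical and do not reproduce any figure in the table.

```python
import math


def named_dimension_rmse(reference, candidate):
    """RMSE over the dimension names present in the reference.

    Assumption: a dimension missing from the candidate raises KeyError;
    the benchmark may penalise missing dimensions differently.
    """
    errors = [candidate[name] - value for name, value in reference.items()]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Reference values come from the task prompt (mm); the candidate is made up.
REFERENCE_DIMS = {"outer_d": 35.0, "height": 18.0, "thread_depth": 14.0}
CANDIDATE_DIMS = {"outer_d": 35.3, "height": 18.0, "thread_depth": 13.6}
```

Here the errors are 0.3, 0.0, and -0.4 mm, giving an RMSE of sqrt(0.25 / 3), roughly 0.289 mm, which is well outside the ±0.05 mm acceptance band even though two of the three dimensions are close.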