CAD-Bench
SURF-002 · Free-form Surfaces · difficulty 5/5

Compressor blade (NACA 65-(12)10)

sha256:8de14b209c01ac72

§1 Prompt verbatim

Single compressor blade: 60 mm chord, 80 mm span, 12° twist root-to-tip, NACA-65-(12)10 thickness distribution along the camber line. G2 continuous suction and pressure surfaces, sharp trailing edge at 0.3 mm.
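The twist requirement in the prompt can be illustrated with a short sketch. It assumes the 12° twist varies linearly from root to tip and that each section rotates about its own origin (the stacking axis); neither is stated in the prompt. The names `twist_at` and `place_section` are hypothetical, and the NACA 65-series thickness table is not reproduced here, so a plain chord line stands in for the real section.

```python
import math

CHORD_MM = 60.0   # from the prompt
SPAN_MM = 80.0    # from the prompt
TWIST_DEG = 12.0  # total root-to-tip twist, from the prompt


def twist_at(z_mm: float) -> float:
    """Twist in degrees at spanwise height z (assumption: linear law)."""
    return TWIST_DEG * (z_mm / SPAN_MM)


def place_section(points_2d, z_mm):
    """Rotate a 2D section about the origin by the local twist, lift to z."""
    angle = math.radians(twist_at(z_mm))
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s, x * s + y * c, z_mm) for x, y in points_2d]


# Placeholder section: the chord line from leading edge to trailing edge.
chord_line = [(0.0, 0.0), (CHORD_MM, 0.0)]
root = place_section(chord_line, 0.0)
tip = place_section(chord_line, SPAN_MM)
```

A real solution would replace `chord_line` with the cambered, thickened profile and loft through the placed sections in a CAD kernel.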

§2 Ground-truth spec

shells: 1
watertight: true
manifold: true
acceptance ε: ±0.05 mm
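As a rough illustration of what the watertight and manifold flags mean, the sketch below checks the analogous properties on a triangle mesh by counting how many triangles share each edge. This is an assumption for illustration only: the benchmark scores BREP solids, and its actual checks are not described on this page. The function names are hypothetical.

```python
from collections import Counter


def edge_counts(triangles):
    """Count how many triangles use each undirected edge."""
    counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1
    return counts


def is_watertight(triangles):
    """Closed surface: every edge is shared by exactly two triangles."""
    return all(n == 2 for n in edge_counts(triangles).values())


def is_edge_manifold(triangles):
    """Edge-manifold: no edge is shared by more than two triangles."""
    return all(n <= 2 for n in edge_counts(triangles).values())


# A tetrahedron is closed; dropping one face leaves boundary edges.
tetra = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
open_tetra = tetra[:-1]
```

Here `is_watertight(tetra)` holds, while `open_tetra` is still edge-manifold but no longer watertight.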

§3 Reference render

Figure: canonical reference (interactive 3D view)

Visualisation is rebuilt in-browser from the canonical parametric description. Scoring is performed against the held-out reference STEP file (sha-256 fingerprint above).
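A fingerprint of this kind can be computed with Python's standard `hashlib`. The sketch assumes the 16-hex-character value shown on this page is a truncated prefix of the full SHA-256 digest; the page does not say so explicitly. The helper names are hypothetical.

```python
import hashlib


def fingerprint(data: bytes, hex_chars: int = 16) -> str:
    """Return 'sha256:' plus the first hex_chars characters of the digest."""
    return "sha256:" + hashlib.sha256(data).hexdigest()[:hex_chars]


def fingerprint_file(path: str, hex_chars: int = 16) -> str:
    """Hash a file in chunks so large STEP files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()[:hex_chars]
```

Comparing the result for a local STEP file against the published fingerprint would confirm that the same reference geometry is being scored.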

§4 Per-agent renders

reference + 10 agent outputs · scored against the held-out STEP
| Agent | Provider / toolchain | Vol IoU | BREP | Manifold |
|---|---|---|---|---|
| REFERENCE | canonical · ground truth | 1.000 | 100 | |
| Human Baseline (Mech-E) | n=4 senior engineers | 0.630 | 10 | |
| Trellis 3D | Microsoft Research | 0.583 | 0 | |
| Hunyuan3D-2 | Tencent | 0.579 | 11 | |
| Zoo Text-to-CAD | Zoo (KittyCAD) | 0.455 | 11 | |
| Claude Opus 4.7 → CadQuery | Anthropic + CadQuery 2.4 | 0.444 | 15 | |
| Gemini 2.5 Flash → CadQuery | Google + CadQuery 2.4 | 0.432 | 11 | |
| Adam (CADcrush) | CADcrush | 0.394 | 14 | |
| OpenAI o4 (reasoning) → CadQuery | OpenAI + CadQuery 2.4 | 0.382 | 14 | |
| GPT-5 Mini → OpenSCAD | OpenAI + OpenSCAD 2024.06 | 0.367 | 16 | |
| Spline AI | Spline.design | 0.354 | 0 | |
| Claude Opus 4.7 → OpenSCAD | Anthropic + OpenSCAD 2024.06 | 0.330 | 0 | |
| Claude Haiku 4.5 → CadQuery | Anthropic + CadQuery 2.4 | 0.281 | 19 | |
| DeepSeek R1 (reasoning) → CadQuery | DeepSeek + CadQuery 2.4 | 0.227 | 20 | |
| Claude Sonnet 4.6 → CadQuery | Anthropic + CadQuery 2.4 | 0.225 | 22 | |
| CAD-Coder R1 | CAD-Coder Labs (research) | 0.221 | 26 | |
| Gemini 2.5 Pro → OpenSCAD | Google + OpenSCAD 2024.06 | 0.219 | 0 | |
| Qwen3 Coder → CadQuery | Alibaba + CadQuery 2.4 | 0.155 | 30 | |
| DeepCAD | Wu et al. 2021 (research) | 0.092 | 51 | |
| Llama 3.3 70B → OpenSCAD | Meta + OpenSCAD 2024.06 | 0.086 | 55 | |
| GPT-5 → CadQuery | OpenAI + CadQuery 2.4 | | 66 | no manifold solid produced |

Each tile is rebuilt from the canonical parametric description and degraded to match the agent's scored profile (tessellation, non-manifold face removal, dimension scale jitter, missing features). Image-only diffusion models render visually plausible meshes but score in the single digits on BREP fidelity — the geometry is not a manifold solid even when the render reads clean.
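To make the effect of dimension scale jitter on volume IoU concrete, the sketch below estimates IoU by sampling a regular grid of points and testing each against two solids. It is an illustrative approximation under stated assumptions: the benchmark's actual IoU procedure is not described here, the solids are simple boxes rather than blades, and `box` and `volume_iou` are hypothetical names.

```python
def box(sx, sy, sz):
    """Point-membership test for an axis-aligned box with one corner at the origin."""
    return lambda x, y, z: 0.0 <= x <= sx and 0.0 <= y <= sy and 0.0 <= z <= sz


def volume_iou(inside_a, inside_b, bounds, n=40):
    """Estimate volume IoU from n^3 cell-centre samples inside bounds."""
    (x0, x1), (y0, y1), (z0, z1) = bounds
    inter = union = 0
    for i in range(n):
        x = x0 + (i + 0.5) * (x1 - x0) / n
        for j in range(n):
            y = y0 + (j + 0.5) * (y1 - y0) / n
            for k in range(n):
                z = z0 + (k + 0.5) * (z1 - z0) / n
                a, b = inside_a(x, y, z), inside_b(x, y, z)
                inter += int(a and b)
                union += int(a or b)
    return inter / union if union else 0.0


# A 60 x 6 x 80 mm stand-in solid versus a copy stretched 5% along the chord.
reference = box(60.0, 6.0, 80.0)
jittered = box(63.0, 6.0, 80.0)
bounds = ((0.0, 63.0), (0.0, 6.0), (0.0, 80.0))
iou = volume_iou(reference, jittered, bounds)
```

For nested boxes the exact value is 60/63, so even a 5% scale error along one axis costs several points of IoU; the sampled estimate approaches that value as `n` grows.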

§5 Per-agent metrics

ranked by Vol IoU · same data as the leaderboard, restricted to this task
| Agent | Bidirectional Chamfer | Hausdorff p95 | NormCons | Watert. | Manif. | P@1 | p50 latency | cost |
|---|---|---|---|---|---|---|---|---|
| Human Baseline (Mech-E) | 0.143 | 0.714 | 0.887 | | 0.951 | 0.000 | 721.2s | $6.144 |
| Trellis 3D | 0.172 | 0.741 | 0.854 | | 0.941 | 0.000 | 14.9s | $0.059 |
| Hunyuan3D-2 | 0.133 | 0.795 | 0.859 | | 0.941 | 0.000 | 25.3s | $0.070 |
| Zoo Text-to-CAD | 0.173 | 0.854 | 0.806 | | 0.921 | 0.000 | 6.6s | $0.202 |
| Claude Opus 4.7 → CadQuery | 0.206 | 0.901 | 0.783 | | 0.913 | 0.000 | 40.8s | $0.363 |
| Gemini 2.5 Flash → CadQuery | 0.195 | 0.996 | 0.789 | | 0.915 | 0.000 | 15.1s | $0.020 |
| Adam (CADcrush) | 0.215 | 1.129 | 0.756 | × | 0.907 | 0.000 | 8.6s | $0.286 |
| OpenAI o4 (reasoning) → CadQuery | 0.211 | 1.070 | 0.759 | × | 0.904 | 0.000 | 131.1s | $1.172 |
| GPT-5 Mini → OpenSCAD | 0.250 | 1.091 | 0.752 | × | 0.904 | 0.000 | 17.5s | $0.010 |
| Spline AI | 0.202 | 1.033 | 0.765 | × | 0.907 | 0.000 | 7.8s | $0.038 |
| Claude Opus 4.7 → OpenSCAD | 0.278 | 1.355 | 0.750 | × | 0.898 | 0.000 | 39.8s | $0.326 |
| Claude Haiku 4.5 → CadQuery | 0.292 | 1.313 | 0.736 | × | 0.895 | 0.000 | 6.6s | $0.023 |
| DeepSeek R1 (reasoning) → CadQuery | 0.319 | 1.608 | 0.708 | × | 0.885 | 0.000 | 118.4s | $0.039 |
| Claude Sonnet 4.6 → CadQuery | 0.371 | 1.824 | 0.696 | × | 0.882 | 0.000 | 17.6s | $0.077 |
| CAD-Coder R1 | 0.384 | 1.852 | 0.704 | × | 0.882 | 0.000 | 6.7s | $0.005 |
| Gemini 2.5 Pro → OpenSCAD | 0.357 | 1.699 | 0.697 | × | 0.884 | 0.000 | 29.2s | $0.108 |
| Qwen3 Coder → CadQuery | 0.465 | 2.376 | 0.673 | × | 0.875 | 0.000 | 16.9s | $0.025 |
| DeepCAD | 0.773 | 3.741 | 0.648 | × | 0.865 | 0.000 | 3.5s | $0.016 |
| Llama 3.3 70B → OpenSCAD | 0.863 | 4.334 | 0.650 | × | 0.863 | 0.000 | 18.0s | $0.018 |
| GPT-5 → CadQuery | kernel error: BRepCheck_NotClosed | | | × | 0.000 | 0.000 | 44.2s | $0.213 |
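The Bidirectional Chamfer and Hausdorff p95 columns can be illustrated on small point sets. The sketch below uses one common convention and is not necessarily the benchmark's definition: Chamfer is taken as the average of the two directed mean nearest-neighbour distances, and Hausdorff p95 as the larger of the two directed 95th percentiles (nearest-rank method). The brute-force search is only suitable for small samples, and all function names are hypothetical.

```python
import math


def nearest_distances(src, dst):
    """Distance from each point in src to its nearest neighbour in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def chamfer_bidirectional(a, b):
    """Average of the two directed mean nearest-neighbour distances."""
    d_ab = nearest_distances(a, b)
    d_ba = nearest_distances(b, a)
    return 0.5 * (sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba))


def hausdorff_p95(a, b):
    """Larger of the two directed 95th-percentile nearest-neighbour distances."""
    return max(percentile(nearest_distances(a, b), 95),
               percentile(nearest_distances(b, a), 95))


# Two samples of the same edge, offset by 0.5 along z.
ref_pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
out_pts = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)]
```

With these inputs both metrics evaluate to the 0.5 offset. Using a percentile instead of the maximum makes the Hausdorff figure less sensitive to a few stray points.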