# MechForge Rendering And Simulation Stack
Date: 2026-04-24
## The Confusion To Resolve
For MechForge there are four separate jobs:
1. Generate or modify a design.
2. Render the design so humans can inspect it.
3. Simulate or verify the design.
4. Export the design to real CAD/manufacturing formats.
One tool does not need to do all four.
## Recommended MVP Stack
| Layer | MVP choice | Why |
|---|---|---|
| Design representation | Structured parametric JSON | Easy for LLMs, easy to validate, easy to convert. |
| Browser renderer | Three.js | Fast, visual, interactive, works inside a web demo. |
| Fast verifier | Custom beam/truss-style solver | Good enough for reward curves and RL feedback. |
| Export | STL from Three.js mesh | Immediate tangible artifact. |
| Future CAD backend | CadQuery first, OpenSCAD second | CadQuery is Python-native and more flexible for OpenEnv. |
| Future simulation backend | Simplified FEM, FEniCSx, or a specialized solver | Swap in after the environment loop works. |
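To make the first row concrete, here is a minimal sketch of what the structured parametric JSON might look like, expressed as a TypeScript type plus an example instance. The field names (`plate`, `ribs`, `holes`, `loadCase`) are illustrative assumptions that mirror the features the experiment app renders, not a committed schema:

```ts
// Hypothetical design schema. All names here are illustrative assumptions.
interface BracketDesign {
  units: "mm";
  plate: { width: number; height: number; thickness: number };
  ribs: Array<{ x: number; height: number; thickness: number }>;
  holes: Array<{ x: number; y: number; diameter: number; purpose: "fixture" | "lightening" }>;
  loadCase: { forceN: number; direction: [number, number, number] };
}

// Example instance an LLM might return for a lightweight-bracket prompt.
const example: BracketDesign = {
  units: "mm",
  plate: { width: 120, height: 60, thickness: 4 },
  ribs: [{ x: 30, height: 12, thickness: 3 }, { x: 90, height: 12, thickness: 3 }],
  holes: [
    { x: 10, y: 10, diameter: 6, purpose: "fixture" },
    { x: 60, y: 30, diameter: 20, purpose: "lightening" },
  ],
  loadCase: { forceN: 500, direction: [0, -1, 0] },
};
```

Because the design is a closed schema rather than free-form geometry, it can be validated before rendering and fails fast on hallucinated fields.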
## Why Not OpenSCAD First?
OpenSCAD is good for deterministic programmatic CAD. It is available on macOS and can generate real geometry, but it is not the fastest path for a live web app.
Use OpenSCAD later if we want:
- scriptable constructive solid geometry,
- reproducible `.scad` artifacts,
- STL export through the OpenSCAD CLI (sketched below),
- simple parts made from unions/differences.
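If we pick OpenSCAD up later, the CLI export path is easy to drive from Node. A minimal sketch, assuming `openscad` is on the PATH; the `.scad` content is a trivial difference-of-primitives placeholder:

```ts
import { execFile } from "node:child_process";
import { writeFile } from "node:fs/promises";

// Write a reproducible .scad artifact, then let the OpenSCAD CLI mesh it
// to STL via its standard -o output flag.
async function exportStl(scadSource: string, outPath: string): Promise<void> {
  await writeFile("part.scad", scadSource);
  await new Promise<void>((resolve, reject) => {
    execFile("openscad", ["-o", outPath, "part.scad"], (err) =>
      err ? reject(err) : resolve()
    );
  });
}

exportStl(
  "difference() { cube([40, 20, 4]); translate([20, 10, -1]) cylinder(h = 6, r = 3); }",
  "part.stl"
).catch(console.error);
```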
For the first experiment, Three.js is better because it gives immediate visual feedback in the browser.
## Why Not Full FEA First?
Full FEA is the wrong first milestone. It risks spending the hackathon on meshing, solver stability, and packaging instead of the OpenEnv loop.
Better:
1. Start with a simplified verifier that produces a reward.
2. Show that LLM behavior improves under that reward.
3. Add higher-fidelity simulation only after the loop is stable.
The judges care most that the environment trains meaningful behavior and shows improvement. A simple but coherent verifier is acceptable if we explain the limitations honestly.
## Benchmark Plan
Before committing to the full environment, run GPT-5.4 through a small prompt-to-design benchmark:
- Prompt asks for a lightweight bracket under a load case.
- Model returns structured design JSON.
- Renderer shows the part.
- Verifier scores mass, stress proxy, deflection proxy, safety factor, and manufacturability (see the reward sketch below).
- We inspect whether the model uses real design patterns (ribs, load paths, holes in low-stress areas) and avoids invalid geometry.
This tells us whether current frontier models already solve the task or whether there is room for RL improvement.
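To ground what "beam-style estimate" means here, a minimal reward sketch, assuming a cantilever-plate approximation (bending stress σ = M·c/I, tip deflection δ = FL³/3EI). The material constants and scoring weights are illustrative assumptions, not calibrated values:

```ts
// Crude cantilever proxy: treat the plate as a beam fixed at the mounting
// holes with the load at the free end. Numbers are illustrative assumptions.
interface Proxy { massKg: number; stressMPa: number; deflectionMm: number }

function beamProxy(widthMm: number, thicknessMm: number, lengthMm: number, forceN: number): Proxy {
  const E = 69_000;        // MPa, roughly aluminum
  const density = 2.7e-6;  // kg/mm^3
  const I = (widthMm * thicknessMm ** 3) / 12;                  // mm^4, rectangular section
  const M = forceN * lengthMm;                                  // N*mm, moment at fixed end
  const stressMPa = (M * (thicknessMm / 2)) / I;                // sigma = M*c/I
  const deflectionMm = (forceN * lengthMm ** 3) / (3 * E * I);  // delta = F*L^3 / (3*E*I)
  const massKg = density * widthMm * thicknessMm * lengthMm;
  return { massKg, stressMPa, deflectionMm };
}

// Scalar reward: zero if the part fails the load case, otherwise reward
// safety factor while penalizing mass.
function reward(p: Proxy, yieldMPa = 270): number {
  const safety = yieldMPa / p.stressMPa;
  if (safety < 1 || p.deflectionMm > 1.0) return 0;
  return Math.min(safety / 2, 1) * (1 - Math.min(p.massKg / 0.5, 1));
}
```

The point is not fidelity; it is a cheap signal with the right monotonic behavior (thicker plate lowers stress but raises mass) so RL has a gradient to climb.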
## What The Experiment App Does
The app in `experiment-mechanical-idea/` implements this benchmark:
- Frontend: Vite + Three.js.
- Backend: Express + OpenAI Responses API.
- Input: natural-language mechanical design prompt.
- Output: structured parametric design JSON.
- Render: plate, ribs, holes, bosses, fixed holes, load arrow.
- Verifier: fast beam-style estimate.
- Export: STL from the rendered mesh.
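The backend is essentially one endpoint: prompt in, design JSON out. A minimal sketch, assuming the `openai` Node SDK's Responses API and the `BracketDesign` shape sketched earlier; the model name and error handling are placeholders:

```ts
import express from "express";
import OpenAI from "openai";

const app = express();
app.use(express.json());
const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// POST /design { prompt: string } -> structured design JSON.
app.post("/design", async (req, res) => {
  const response = await client.responses.create({
    model: "gpt-5.4", // placeholder; the benchmark model named in this doc
    input: `Return only JSON matching the bracket design schema.\n${req.body.prompt}`,
  });
  try {
    res.json(JSON.parse(response.output_text)); // validate against the schema in practice
  } catch {
    res.status(422).json({ error: "model returned non-JSON output" });
  }
});

app.listen(3000);
```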
## Final Recommendation
For the OpenEnv version:
1. Keep the agent action space constrained (see the sketch after this list).
2. Use Three.js for the judge-facing demo.
3. Use Python/CadQuery later for real CAD export.
4. Keep simulation/verifier independent from the renderer.
5. Do not let the LLM generate arbitrary meshes in the first version.
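For point 1, "constrained action space" means a closed set of typed edit operations rather than raw geometry. A minimal sketch, with operation names that are illustrative assumptions:

```ts
// Closed, typed action set: the agent may only emit these operations,
// never arbitrary meshes. Names are illustrative assumptions.
type Action =
  | { op: "set_plate"; width: number; height: number; thickness: number }
  | { op: "add_rib"; x: number; height: number; thickness: number }
  | { op: "add_hole"; x: number; y: number; diameter: number }
  | { op: "remove_feature"; id: string };

// Validate each action before applying it, so invalid geometry is rejected
// at the action level rather than discovered at render time.
function isValid(a: Action): boolean {
  switch (a.op) {
    case "set_plate": return a.width > 0 && a.height > 0 && a.thickness > 0;
    case "add_rib": return a.height > 0 && a.thickness > 0;
    case "add_hole": return a.diameter > 0;
    case "remove_feature": return a.id.length > 0;
  }
}
```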