# MechForge Rendering And Simulation Stack
Date: 2026-04-24
## The Confusion To Resolve
For MechForge there are four separate jobs:
1. Generate or modify a design.
2. Render the design so humans can inspect it.
3. Simulate or verify the design.
4. Export the design to real CAD/manufacturing formats.
One tool does not need to do all four.
## Recommended MVP Stack
| Layer | MVP choice | Why |
|---|---|---|
| Design representation | Structured parametric JSON | Easy for LLMs, easy to validate, easy to convert. |
| Browser renderer | Three.js | Fast, visual, interactive, works inside a web demo. |
| Fast verifier | Custom beam/truss-style solver | Good enough for reward curves and RL feedback. |
| Export | STL from Three.js mesh | Immediate tangible artifact. |
| Future CAD backend | CadQuery first, OpenSCAD second | CadQuery is Python-native and more flexible for OpenEnv. |
| Future simulation backend | Simplified FEM, FEniCSx, or a specialized solver | Swap in after the environment loop works. |
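To make the first row concrete, here is a minimal sketch of what the structured parametric JSON might look like, expressed as a TypeScript type plus an example instance. The field names (`plate`, `ribs`, `holes`, `loadCase`) are illustrative assumptions that mirror the features the experiment app renders, not a committed schema:

```ts
// Hypothetical design schema. All names here are illustrative assumptions.
interface BracketDesign {
  units: "mm";
  plate: { width: number; height: number; thickness: number };
  ribs: Array<{ x: number; height: number; thickness: number }>;
  holes: Array<{ x: number; y: number; diameter: number; purpose: "fixture" | "lightening" }>;
  loadCase: { forceN: number; direction: [number, number, number] };
}

// Example instance an LLM might return for a lightweight-bracket prompt.
const example: BracketDesign = {
  units: "mm",
  plate: { width: 120, height: 60, thickness: 4 },
  ribs: [{ x: 30, height: 12, thickness: 3 }, { x: 90, height: 12, thickness: 3 }],
  holes: [
    { x: 10, y: 10, diameter: 6, purpose: "fixture" },
    { x: 60, y: 30, diameter: 20, purpose: "lightening" },
  ],
  loadCase: { forceN: 500, direction: [0, -1, 0] },
};
```

Because the design is a closed schema rather than free-form geometry, it can be validated before rendering and fails fast on hallucinated fields.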
## Why Not OpenSCAD First?
OpenSCAD is good for deterministic programmatic CAD. It is available on macOS and can generate real geometry, but it is not the fastest path for a live web app.
Use OpenSCAD later if we want:
- scriptable constructive solid geometry,
- reproducible `.scad` artifacts,
- STL export through the OpenSCAD CLI (sketched below),
- simple parts made from unions/differences.
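If we pick OpenSCAD up later, the CLI export path is easy to drive from Node. A minimal sketch, assuming `openscad` is on the PATH; the `.scad` content is a trivial difference-of-primitives placeholder:

```ts
import { execFile } from "node:child_process";
import { writeFile } from "node:fs/promises";

// Write a reproducible .scad artifact, then let the OpenSCAD CLI mesh it
// to STL via its standard -o output flag.
async function exportStl(scadSource: string, outPath: string): Promise<void> {
  await writeFile("part.scad", scadSource);
  await new Promise<void>((resolve, reject) => {
    execFile("openscad", ["-o", outPath, "part.scad"], (err) =>
      err ? reject(err) : resolve()
    );
  });
}

exportStl(
  "difference() { cube([40, 20, 4]); translate([20, 10, -1]) cylinder(h = 6, r = 3); }",
  "part.stl"
).catch(console.error);
```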
For the first experiment, Three.js is better because it gives immediate visual feedback in the browser.
## Why Not Full FEA First?
Full FEA is the wrong first milestone. It risks spending the hackathon on meshing, solver stability, and packaging instead of the OpenEnv loop.
Better:
1. Start with a simplified verifier that produces a reward.
2. Show that LLM behavior improves under that reward.
3. Add higher-fidelity simulation only after the loop is stable.
The judges care most that the environment trains meaningful behavior and shows improvement. A simple but coherent verifier is acceptable if we explain the limitations honestly.
## Benchmark Plan
Before committing to the full environment, run GPT-5.4 through a small prompt-to-design benchmark:
- Prompt asks for a lightweight bracket under a load case.
- Model returns structured design JSON.
- Renderer shows the part.
- Verifier scores mass, stress proxy, deflection proxy, safety factor, and manufacturability (see the reward sketch below).
- We inspect whether the model uses real design patterns (ribs, load paths, holes in low-stress areas) and avoids invalid geometry.
This tells us whether current frontier models already solve the task or whether there is room for RL improvement.
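To ground what "beam-style estimate" means here, a minimal reward sketch, assuming a cantilever-plate approximation (bending stress σ = M·c/I, tip deflection δ = FL³/3EI). The material constants and scoring weights are illustrative assumptions, not calibrated values:

```ts
// Crude cantilever proxy: treat the plate as a beam fixed at the mounting
// holes with the load at the free end. Numbers are illustrative assumptions.
interface Proxy { massKg: number; stressMPa: number; deflectionMm: number }

function beamProxy(widthMm: number, thicknessMm: number, lengthMm: number, forceN: number): Proxy {
  const E = 69_000;        // MPa, roughly aluminum
  const density = 2.7e-6;  // kg/mm^3
  const I = (widthMm * thicknessMm ** 3) / 12;                  // mm^4, rectangular section
  const M = forceN * lengthMm;                                  // N*mm, moment at fixed end
  const stressMPa = (M * (thicknessMm / 2)) / I;                // sigma = M*c/I
  const deflectionMm = (forceN * lengthMm ** 3) / (3 * E * I);  // delta = F*L^3 / (3*E*I)
  const massKg = density * widthMm * thicknessMm * lengthMm;
  return { massKg, stressMPa, deflectionMm };
}

// Scalar reward: zero if the part fails the load case, otherwise reward
// safety factor while penalizing mass.
function reward(p: Proxy, yieldMPa = 270): number {
  const safety = yieldMPa / p.stressMPa;
  if (safety < 1 || p.deflectionMm > 1.0) return 0;
  return Math.min(safety / 2, 1) * (1 - Math.min(p.massKg / 0.5, 1));
}
```

The point is not fidelity; it is a cheap signal with the right monotonic behavior (thicker plate lowers stress but raises mass) so RL has a gradient to climb.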
## What The Experiment App Does
The app in `experiment-mechanical-idea/` implements this benchmark:
- Frontend: Vite + Three.js.
- Backend: Express + OpenAI Responses API.
- Input: natural-language mechanical design prompt.
- Output: structured parametric design JSON.
- Render: plate, ribs, holes, bosses, fixed holes, load arrow.
- Verifier: fast beam-style estimate.
- Export: STL from the rendered mesh.
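The backend is essentially one endpoint: prompt in, design JSON out. A minimal sketch, assuming the `openai` Node SDK's Responses API and the `BracketDesign` shape sketched earlier; the model name and error handling are placeholders:

```ts
import express from "express";
import OpenAI from "openai";

const app = express();
app.use(express.json());
const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// POST /design { prompt: string } -> structured design JSON.
app.post("/design", async (req, res) => {
  const response = await client.responses.create({
    model: "gpt-5.4", // placeholder; the benchmark model named in this doc
    input: `Return only JSON matching the bracket design schema.\n${req.body.prompt}`,
  });
  try {
    res.json(JSON.parse(response.output_text)); // validate against the schema in practice
  } catch {
    res.status(422).json({ error: "model returned non-JSON output" });
  }
});

app.listen(3000);
```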
## Final Recommendation
For the OpenEnv version:
1. Keep the agent action space constrained (see the sketch after this list).
2. Use Three.js for the judge-facing demo.
3. Use Python/CadQuery later for real CAD export.
4. Keep simulation/verifier independent from the renderer.
5. Do not let the LLM generate arbitrary meshes in the first version.
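For point 1, "constrained action space" means a closed set of typed edit operations rather than raw geometry. A minimal sketch, with operation names that are illustrative assumptions:

```ts
// Closed, typed action set: the agent may only emit these operations,
// never arbitrary meshes. Names are illustrative assumptions.
type Action =
  | { op: "set_plate"; width: number; height: number; thickness: number }
  | { op: "add_rib"; x: number; height: number; thickness: number }
  | { op: "add_hole"; x: number; y: number; diameter: number }
  | { op: "remove_feature"; id: string };

// Validate each action before applying it, so invalid geometry is rejected
// at the action level rather than discovered at render time.
function isValid(a: Action): boolean {
  switch (a.op) {
    case "set_plate": return a.width > 0 && a.height > 0 && a.thickness > 0;
    case "add_rib": return a.height > 0 && a.thickness > 0;
    case "add_hole": return a.diameter > 0;
    case "remove_feature": return a.id.length > 0;
  }
}
```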