DT-Explorer / src / interpretability

Commit History

feat: implement interactive circuit surgery engine, dashboard integration, and Neuronpedia export functionality
33a0021

sadhumitha-s committed on

feat: implement safety auditing tools for steering and deceptive alignment detection
5ccbe34

sadhumitha-s committed on

optimize DT context handling, debug UI
4aa19e7

sadhumitha-s committed on

feat: implement NLA explainer and universality probe and refactor path patching engine
8577352

sadhumitha-s committed on

clarify comments
fa350cc

sadhumitha-s committed on

feat: implement path-causal microscopy
11dbbc6

sadhumitha-s committed on

refactor: added logic as comments
731ae64

sadhumitha-s committed on

feat: implement SAE manager for latent decomposition and steering library for contrastive activation addition
0346604

sadhumitha-s committed on