# -*- coding: utf-8 -*-
"""Evaluation suite for SENTINEL oversight architecture.

Modules:
    - weak_to_strong: OpenAI-style Weak-to-Strong generalization testing
    - transcript_export: METR MALT-style labeled transcript dataset generation
"""