---
license: agpl-3.0
tags:
- learning-augmented-algorithms
- automated-algorithm-design
- crispo
- code-generation
- ai-code-generation
---

# Crispo: Autonomous Co-Design of ML Predictors and Learning-Augmented Algorithms

Crispo is a production-ready, research-grade system for the automated co-design of Learning-Augmented Algorithms (LAA). It transforms high-level objectives into complete, two-part "Solution Packages" containing a machine learning predictor and a specialized algorithm that consumes its predictions.

## 🎯 System Overview

The core innovation of Crispo is its ability to bridge the gap between machine learning and classical algorithm design. For a given online problem (e.g., ski rental), it generates:

1. **A Predictor Script:** An ML model (e.g., ARIMA) that learns from historical data to predict future values and quantifies its own uncertainty.
2. **An Algorithm Script:** A Learning-Augmented Algorithm that takes the ML prediction as input and intelligently balances it against a robust worst-case strategy using a trust parameter (λ).

The entire system is designed to be autonomous, optimizing its own components and learning from past performance to improve future solutions.

## 🏗️ Core Architecture

Crispo is built on a three-tier optimization stack, ensuring a clear separation of concerns:

1. **Genetic Algorithm (Strategic):** The `GAOptimizer` evolves high-level parameters for code generation, searching for the best overall strategy. It now features **adaptive population sizing** for improved efficiency.
2. **Reinforcement Learning (Tactical):** The `RLAgent` fine-tunes the parameters for a specific layer, using a Q-table to learn optimal, context-aware adjustments. The Q-table is now **pruned** to prevent unbounded memory growth.
3. **Attention Mechanism (Coordination):** The `AttentionRouter` allows different layers of the generated pipeline to share information, ensuring a cohesive and well-coordinated final output.

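To make the Q-table pruning in the tactical tier more concrete, here is a minimal sketch of one way a tabular Q-learner could cap its memory. It is illustrative only: the `BoundedQTable` class, the least-recently-updated eviction policy, and the `max_states`, `alpha`, and `gamma` defaults are assumptions made for this example and are not taken from Crispo's `RLAgent`.

```python
from collections import OrderedDict
from typing import Dict, Hashable


class BoundedQTable:
    """Tabular Q-values with a cap on the number of stored states.

    Hypothetical sketch: the class name, the eviction policy, and the
    default hyperparameters are assumptions, not Crispo's actual code.
    """

    def __init__(self, max_states: int = 1000, alpha: float = 0.1,
                 gamma: float = 0.9) -> None:
        self.max_states = max_states
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor
        # Maps state -> {action: Q-value}; insertion order tracks recency.
        self._table: "OrderedDict[Hashable, Dict[str, float]]" = OrderedDict()

    def q(self, state: Hashable, action: str) -> float:
        """Return the stored Q-value, or 0.0 for unseen state/action pairs."""
        return self._table.get(state, {}).get(action, 0.0)

    def update(self, state: Hashable, action: str, reward: float,
               next_state: Hashable) -> None:
        """Apply one standard Q-learning update for (state, action)."""
        best_next = max(self._table.get(next_state, {}).values(), default=0.0)
        old = self.q(state, action)
        new = old + self.alpha * (reward + self.gamma * best_next - old)
        self._table.setdefault(state, {})[action] = new
        self._table.move_to_end(state)  # mark as most recently updated

    def prune(self) -> int:
        """Evict least-recently-updated states until the cap is respected."""
        removed = 0
        while len(self._table) > self.max_states:
            self._table.popitem(last=False)  # drop the oldest entry
            removed += 1
        return removed

    def __len__(self) -> int:
        return len(self._table)
```

Calling `prune()` at the end of each training episode keeps the table at or below `max_states` entries. Other eviction criteria, such as dropping states with the smallest absolute Q-values, would fit the same interface.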

This stack feeds into an intent-driven `CodeGenerator` that selects and parameterizes code templates based on the user's objective.

## ✨ Key Features & Innovations

### 1. Learning-Augmented Algorithm (LAA) Co-Design

Crispo's primary feature is its end-to-end framework for generating and evaluating LAAs. The system automatically co-designs a predictor and an algorithm that work in tandem.

### 2. Two-Stage "Live" Evaluation

To ensure solutions are robust, the `Verifier` performs a rigorous, two-stage evaluation that simulates a real-world deployment:

* **Stage 1: Prediction:** The generated predictor is run on historical data to produce a "live" prediction interval.
* **Stage 2: Execution:** The generated algorithm is run with the live prediction, and its performance (e.g., `competitive_ratio`) is measured.

This methodology is novel and provides a much more realistic assessment than mock evaluations.

### 3. Solution Registry

Verified solutions are automatically versioned and saved to the `solution_registry/` directory. This creates a persistent, queryable knowledge base of high-quality solutions.

**Example Query:**

```bash
python3 crispo.py --query-registry "competitive_ratio:1.2"
```

### 4. Meta-Learning with UCB1

The `MetaLearner` allows Crispo to learn from its own performance. It has been upgraded from a simple epsilon-greedy strategy to an **Upper Confidence Bound (UCB1)** algorithm, which provides a more principled and efficient balance between exploring new strategies and exploiting known good ones.

## ⚙️ Component Analysis & Recent Improvements

- **`GAOptimizer`:** Now uses **adaptive population sizing** to scale its search space based on problem complexity, improving performance. It also evaluates fitness in parallel using a `ProcessPoolExecutor`.
- **`RLAgent`:** The Q-table is now **pruned** after each training episode to prevent memory exhaustion in long-running sessions.
- **`Verifier`:** Now includes a `PredictorEvaluator` that calculates **Uncertainty Quantification (UQ) metrics** (`coverage_rate` and `interval_sharpness`) for the generated predictor, providing a more complete picture of the solution's quality.
- **Security:** Subprocess execution is now sandboxed with **resource limits** to prevent runaway processes, and file writes are validated to prevent **directory traversal attacks**.

## 🚀 Advanced Features

### Bayesian Neural Architecture Search (NAS)

The NAS pipeline has been upgraded from a random search to a **Bayesian Optimization** strategy, using Gaussian Processes to intelligently search for optimal neural network architectures. This results in a ~10x speedup in finding near-optimal architectures.

### Federated Optimizer

The placeholder `FederatedOptimizer` has been replaced with a functional **Federated Averaging (FedAvg)** implementation, enabling true federated learning across multiple clients.

### Transfer Learning

A production-ready, three-step transfer learning pipeline (`load_model`, `apply_model`, `log_to_registry`) is available to transfer knowledge from previously trained models.

## Usage

Crispo is a command-line tool. The main entry point is `crispo.py`.

### Basic Example

```bash
python3 crispo.py --project "MyDataPipeline" --objective "Fetch data from an API, process it with pandas, and analyze with numpy"
```

### LAA Co-Design Example

To generate a Learning-Augmented Algorithm for the ski rental problem:

```bash
python3 crispo.py --project "SkiRentalLAA" --objective "Generate a learning-augmented algorithm for the ski rental problem" --trust-parameter 0.7
```

*Note: This requires a `ski_rental_history.csv` file in the root directory.*

### Enabling Advanced Features

```bash
python3 crispo.py --objective "Optimize a deep learning model" \
  --enable-nas \
  --enable-transfer-learning \
  --enable-federated-optimization
```

### Saving and Loading Meta-Knowledge

You can persist the `MetaLearner`'s state across runs:

```bash
# Save the learned state
python3 crispo.py --objective "My first run" --save-metaknowledge knowledge.pkl

# Load the state for a new run
python3 crispo.py --objective "My second run, building on the first" --load-metaknowledge knowledge.pkl
```

## Licensing

`crispo` is licensed under the **GNU Affero General Public License v3.0 (AGPLv3)**.

This means you are free to use, modify, and distribute this software for any open-source project that is also licensed under the AGPLv3.

For use in a closed-source, proprietary, or commercial application, a separate commercial license is required. Please contact us at `your.email@example.com` to inquire about obtaining a commercial license.

## Testing

The project uses the built-in `unittest` framework. To run the full test suite:

```bash
python3 -m unittest test_crispo.py
```