| # BitTransformerLM Scripts |
|
|
| This directory contains organized scripts for BitTransformerLM development, training, and evaluation. |
|
|
| ## Directory Structure |
|
|
| ``` |
| scripts/ |
| βββ training/ # Training scripts and experiments |
| βββ examples/ # Example usage and demonstrations |
| βββ testing/ # Test scripts and validation |
| βββ benchmarks/ # Performance benchmarks |
| βββ tools/ # Utility scripts and data processing |
| ``` |
|
|
| ## Training Scripts (`training/`) |
|
|
| - **basic_training.py** - Simple training setup for small models |
| - **breakthrough_training.py** - Advanced training with breakthrough techniques |
| - **cpu_edge_training.py** - CPU-optimized training for edge deployment |
| - **final_breakthrough_training.py** - Production training pipeline |
| - **full_attention_training.py** - Full attention mechanism training |
| - **full_bits_train.py** - Complete bit-level training |
| - **production_training.py** - Production-ready training script |
| - **progressive_scaleup.py** - Progressive model scaling |
| - **quick_training_run.py** - Fast training for development |
|
|
| ## Example Scripts (`examples/`) |
|
|
| - **example.py** - Basic usage example |
| - **better_sampling.py** - Advanced sampling techniques |
| - **debug_generation.py** - Generation debugging utilities |
| - **raw_generation.py** - Low-level generation examples |
| - **simple_test.py** - Simple model testing |
|
|
| ## Testing Scripts (`testing/`) |
|
|
| - **code_test.py** - Code functionality testing |
| - **diffusion_tests.py** - Diffusion mode testing |
| - **enhanced_generation_test.py** - Advanced generation testing |
| - **full_attention_inference_test.py** - Attention mechanism tests |
| - **test_conversation.py** - Conversational AI testing |
|
|
| ## Benchmark Scripts (`benchmarks/`) |
|
|
| - **wikitext_benchmark.py** - WikiText dataset benchmarking |
| - **wikitext_schedule.py** - WikiText training schedule |
|
|
| ## Utility Tools (`tools/`) |
|
|
| - **build_full_bits.py** - Bit sequence construction |
| - **create_dataset.py** - Dataset creation utilities |
| - **enhanced_checkpoint_system.py** - Advanced checkpointing |
| - **integration_flow.py** - Integration workflow |
| - **integration_schedule.py** - Integration scheduling |
| - **sync_to_hf.py** - HuggingFace synchronization |
| - **unified_workflow.py** - Unified training workflow |
| - **watcher.py** - File system monitoring |
|
|
| ## Usage |
|
|
| All scripts support the standardized CLI interface provided by `bit_transformer.cli_standards`. Use `--help` with any script to see available options. |
|
|
| ### Quick Start |
|
|
| ```bash |
| # Train a small model |
| python scripts/training/basic_training.py --model-size small --epochs 5 |
| |
| # Run a simple test |
| python scripts/examples/simple_test.py --d-model 64 |
| |
| # Benchmark on WikiText |
| python scripts/benchmarks/wikitext_benchmark.py --dataset-name wikitext-2 |
| ``` |
|
|
| ### Environment Variables |
|
|
Scripts can also be configured via environment variables using the `BT_` prefix; when both are given, command-line flags take precedence:
|
|
| ```bash |
| export BT_D_MODEL=128 |
| export BT_NUM_LAYERS=4 |
| export BT_BATCH_SIZE=16 |
| python scripts/training/basic_training.py |
| ``` |
|
|
| ## Development Guidelines |
|
|
| - All scripts should use `bit_transformer.cli_standards` for argument parsing |
| - Include proper logging and error handling |
| - Support both CPU and GPU execution |
| - Follow the naming conventions established in existing scripts |
| - Add documentation for any new hyperparameters or features |
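A new script following these guidelines might look like the skeleton below. Since the exact API of `bit_transformer.cli_standards` is not shown here, this sketch uses plain `argparse` as a stand-in to illustrate the same pattern (standardized flags, logging, and a `--device` option for CPU/GPU support); the flag names mirror the examples above but are otherwise assumptions:

```python
import argparse
import logging


def build_parser() -> argparse.ArgumentParser:
    # Stand-in for the parser that bit_transformer.cli_standards would provide.
    parser = argparse.ArgumentParser(description="Training script skeleton")
    parser.add_argument("--d-model", type=int, default=64,
                        help="Model embedding dimension")
    parser.add_argument("--num-layers", type=int, default=2,
                        help="Number of transformer layers")
    parser.add_argument("--device", choices=["cpu", "cuda"], default="cpu",
                        help="Execution device (CPU and GPU both supported)")
    return parser


def main(argv=None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger(__name__)
    log.info("Starting run: d_model=%d, layers=%d, device=%s",
             args.d_model, args.num_layers, args.device)
    # ... training logic goes here ...
    return args


if __name__ == "__main__":
    main()
```

Returning the parsed namespace from `main()` keeps the script importable and easy to test without spawning a subprocess.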