feat: add GRPO GPU fallback support, warmstart tagging in the training script, and a configurable learning-rate parameter
feat: add a new SFT tokenization method, implement a custom trainer class for loss computation, and document the GRPO launcher for Unsloth LoRA integration in the README
fix: document the SFT training configuration in the README, disable assistant-only loss and packing in the Modal training scripts for compatibility, and update the test assertions accordingly
feat: add synthetic SFT dataset generation instructions to the README, improve dataset verification and pushing to the Hugging Face Hub, and add default curriculum and GPU-fallback configurations to the Modal training scripts
feat: introduce reward-ablation configurations, implement YAML loading with `extends` support, and track the reward variant in the training scripts
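The `extends`-style YAML loading mentioned above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `load_config`, `deep_merge`, and the sample configs are hypothetical, and an in-memory mapping stands in for `yaml.safe_load` on real files.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(name: str, load_raw) -> dict:
    """Load a config, resolving an `extends` key by merging the base config first."""
    raw = dict(load_raw(name))
    base_name = raw.pop("extends", None)
    if base_name is None:
        return raw
    return deep_merge(load_config(base_name, load_raw), raw)


# In-memory stand-ins for YAML files (real code would yaml.safe_load from disk).
CONFIGS = {
    "grpo_base": {"lr": 1e-5, "rewards": {"format": 1.0, "accuracy": 1.0}},
    "ablate_format": {"extends": "grpo_base", "rewards": {"format": 0.0}},
}

cfg = load_config("ablate_format", CONFIGS.__getitem__)
# cfg now holds the base values with only the ablated reward overridden
```

An ablation config then only states its deltas, while everything else is inherited from the base.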
feat: add logging functions for reward-configuration management, document parallel Modal training guidelines, and make reward-config hashing deterministic
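Deterministic config hashing is usually achieved by canonicalizing the config before hashing, so that key order and whitespace cannot change the digest. A minimal sketch (the function name and example keys are hypothetical, not taken from the repository):

```python
import hashlib
import json


def reward_config_hash(config: dict) -> str:
    """Hash a reward config deterministically via canonical JSON."""
    # sort_keys + fixed separators make the serialization order-independent
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = reward_config_hash({"format": 1.0, "accuracy": 0.5})
b = reward_config_hash({"accuracy": 0.5, "format": 1.0})
assert a == b  # key order no longer affects the hash
```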
feat: add GPU-utilization tuning instructions to the README, add a run-name parameter to the Modal training script, and update the GRPO configuration for trace logging and vLLM settings
feat: improve the training-image setup and dependency installation, add a startup notice for Modal execution, and implement a training heartbeat for monitoring
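A training heartbeat of the kind mentioned above can be implemented as a daemon thread that periodically reports liveness, which is useful for spotting stalled remote runs. A hedged sketch only: `start_heartbeat` and its log format are invented here, not the repository's actual implementation.

```python
import threading
import time


def start_heartbeat(interval_s: float = 30.0) -> threading.Event:
    """Emit periodic liveness messages; set the returned event to stop."""
    stop = threading.Event()
    start = time.monotonic()

    def beat():
        # Event.wait doubles as a sleep that exits early when stop is set
        while not stop.wait(interval_s):
            elapsed = time.monotonic() - start
            print(f"[heartbeat] training alive after {elapsed:.0f}s", flush=True)

    threading.Thread(target=beat, daemon=True, name="train-heartbeat").start()
    return stop
```

The daemon flag keeps the thread from blocking interpreter shutdown if the stop event is never set.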