Jashan887 committed on
Commit 08d3bdc · verified · 1 Parent(s): 973c753

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ adapters/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ bugtraceai-core-pro.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,110 @@
---
language:
- en
license: apache-2.0
base_model: unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit
tags:
- cybersecurity
- application-security
- pentesting
- bug-bounty
- security-reporting
- gguf
---

# BugTraceAI-CORE-Pro (12B)

A higher-capacity security engineering model from BugTraceAI, tuned for deeper analysis, professional reporting, exploit-chain review, and long-context investigation across agentic web pentesting workflows.

## Model Overview

| Field | Value |
| --- | --- |
| Organization | BugTraceAI |
| Framework | BugTraceAI agentic web pentesting framework |
| Variant | BugTraceAI-CORE-Pro |
| Parameter Scale | 12B |
| Architecture | Mistral Nemo |
| Intended Domain | Application security and authorized security research |
| Primary Delivery Format | GGUF |

## Intended Use

- End-to-end analysis of web application findings in authorized environments.
- Drafting professional vulnerability reports and remediation guidance.
- Reasoning over larger technical contexts such as logs, source code, and findings bundles.

## Out-of-Scope Use

- Autonomous offensive operation against unauthorized targets.
- Replacing human validation for severity, exploitability, or business impact.
- Guaranteeing exploit reliability across target-specific environments.

## Training Data Summary

This model was tuned for security engineering workflows using a curated mix of public, security-focused material. The training mix is described at a high level below:

- Public vulnerability writeups and disclosed security reports used to improve structure, reasoning, and reporting quality.
- Security methodology material used to improve triage, reproduction planning, and remediation-oriented analysis.
- Domain examples covering common web application security patterns, defensive controls, and scanner-style findings.

This card intentionally describes the data at a summary level. It should not be read as a guarantee of exact coverage for any individual product, CVE, target stack, or technique.

## Prompting Guidance

Recommended prompting style:

- State the environment and authorization context clearly.
- Provide concrete evidence: request, response, stack details, logs, code snippets, or scan output.
- Ask for one task at a time: triage, reproduction planning, impact analysis, remediation, or reporting.

Example tasks that fit this model:

- Summarize why this finding is likely valid and what evidence is missing.
- Rewrite this scanner output into a concise engineering ticket.
- Draft remediation steps for this authorization bug or input validation issue.
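The prompting checklist above can be sketched as a small helper that assembles one task, the authorization context, and concrete evidence into a single structured prompt. The field names and layout here are illustrative, not a format the model requires:

```python
def build_triage_prompt(task: str, environment: str, authorization: str,
                        evidence: dict[str, str]) -> str:
    """Assemble one task, the testing context, and concrete evidence
    into a single structured prompt (illustrative layout only)."""
    lines = [
        f"Task: {task}",
        f"Environment: {environment}",
        f"Authorization: {authorization}",
        "Evidence:",
    ]
    for label, value in evidence.items():
        # Each piece of evidence gets its own labeled block.
        lines.append(f"- {label}:\n{value}")
    return "\n".join(lines)

prompt = build_triage_prompt(
    task="Triage this finding and list missing evidence.",
    environment="Staging instance of the target web app",
    authorization="Written scope approval on file",
    evidence={
        "Request": "GET /api/orders/42 HTTP/1.1",
        "Response": "HTTP/1.1 200 OK (order belongs to another user)",
    },
)
print(prompt.splitlines()[0])  # → Task: Triage this finding and list missing evidence.
```

Keeping one task per prompt, as the guidance recommends, makes it easy to review the model's output against the exact evidence it was given.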

### Ollama Example

```dockerfile
FROM hf.co/BugTraceAI/BugTraceAI-CORE-Pro

SYSTEM """
You are BugTraceAI-CORE-Pro, a security engineering assistant for authorized testing,
triage, and remediation support. Prefer precise technical analysis, state assumptions,
and separate confirmed evidence from hypotheses.
"""

PARAMETER temperature 0.1
PARAMETER top_p 0.9
```

Create the local model with:

```bash
ollama create bugtrace-pro -f Modelfile
```

## Strengths

- Better long-context reasoning and report quality than the Fast variant.
- More suitable for multi-step analysis and vulnerability writeups.
- Stronger at connecting findings, evidence, and remediation paths.

## Limitations

- Higher latency and resource requirements than the Fast model.
- Still requires human review for high-risk decisions and disclosure quality.
- Performance depends on prompt quality and the evidence provided.

## Evaluation Status

This release is currently documented with qualitative positioning rather than a public benchmark suite. If you rely on the model for production workflows, validate it against your own prompt set, evidence format, and report quality bar.

## Safety and Responsible Use

This model is intended for authorized security work, defensive research, education, and engineering support. Users are responsible for ensuring legal authorization, validating outputs, and applying human review before acting on model-generated analysis.

## License

Apache-2.0.
adapters/README.md ADDED
@@ -0,0 +1,71 @@
---
language:
- en
- es
license: apache-2.0
base_model: unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit
tags:
- bug-bounty
- security
- pentesting
- exploit-generation
- waf-bypass
- cybersecurity
- hacking
model-index:
- name: BugTraceAI-CORE-v1
  results: []
---

# 🛡️ BugTraceAI-CORE v1.0

BugTraceAI-CORE is a specialized Large Language Model (LLM) fine-tuned for high-performance, private, and local cybersecurity operations. Built specifically for bug hunters, pentesters, and security researchers, it bridges the gap between general-purpose coding assistants and offensive security expertise.

## 🚀 Key Features

- **Offensive Security Expertise:** Fine-tuned on real-world exploit chains, WAF bypasses, and security methodologies.
- **Local-First Architecture:** Designed to run on consumer-grade GPUs (RTX 3060+) with a high-availability fallback for dual-Xeon CPU environments.
- **2025/2026 Ready:** Trained on recent vulnerability write-ups and disclosed reports to stay relevant against modern defense systems.
- **Zero-Downtime MLOps:** Integrated with a secondary CPU fallback using `llama.cpp` for 24/7 availability during re-training cycles.

## 🧠 Training & Methodology

The model was built using the **Unsloth** library for optimized QLoRA training on a single RTX 3060 (12GB VRAM).

### Datasets (The Hacker's Brain)

- **WAF Evasion & Injection:** Trained on `darkknight25/WAF_DETECTION_DATASET` for generating payloads that bypass modern Web Application Firewalls.
- **Security Methodology:** Trained on `AYI-NEDJIMI/bug-bounty-pentest-en` to master the logical structure of pentesting logs and methodology.
- **Real-World Experience:** Augmented with **HackerOne Disclosed Reports** (scraped from Hacktivity) and curated **GitHub Writeups (2025-2026)** to learn successful exploit chains.
- **Architectural Foundation:** Follows the implementation principles of _Sebastian Raschka's "LLMs from scratch"_.

### Technical Specs

- **Base Model:** Qwen2.5-Coder-7B-Instruct
- **Fine-Tuning:** QLoRA (Rank 64, Alpha 64)
- **Context Window:** 4096 Tokens
- **Precision:** bfloat16 (Optimized for NVIDIA Ampere architecture)
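For a rough sense of what rank 64 means in practice: LoRA replaces each targeted weight update with two low-rank matrices, A (r × d_in) and B (d_out × r), so the adapter adds r·(d_in + d_out) trainable parameters per targeted layer. A minimal sketch, where the layer shapes are illustrative placeholders rather than values taken from this card:

```python
def lora_param_count(rank: int, layer_shapes: list[tuple[int, int]]) -> int:
    """Extra trainable parameters LoRA adds: for each targeted (d_out, d_in)
    weight matrix, A contributes rank * d_in entries and B contributes
    d_out * rank entries."""
    return sum(rank * d_in + d_out * rank for d_out, d_in in layer_shapes)

# Hypothetical example: one 4096x4096 attention projection at rank 64.
print(lora_param_count(64, [(4096, 4096)]))  # → 524288
```

This is why QLoRA fits on a 12GB card: only these low-rank matrices are trained while the quantized base weights stay frozen.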

## 🛠️ Usage (BugTraceAI-CLI Integration)

BugTraceAI-CORE is designed to work as a plug-and-play replacement for external APIs.

```bash
# Example environment configuration
export OPENROUTER_BASE_URL="http://your-local-core:8000/v1"
export OPENROUTER_API_KEY="sk-bugtrace-local-core"
```
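With those variables set, any OpenAI-compatible client can talk to the gateway. A minimal sketch of the request body a `/v1/chat/completions` endpoint expects; the model id here is a placeholder, not a value confirmed by this card:

```python
import json

def chat_request(prompt: str, system: str) -> dict:
    """Build an OpenAI-compatible chat completion payload for the local gateway."""
    return {
        "model": "bugtrace-core",  # placeholder model id served by the gateway
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.1,
    }

body = chat_request(
    prompt="Rewrite this scanner output into a concise engineering ticket.",
    system="You are a security engineering assistant for authorized testing.",
)
# This dict would be serialized and POSTed to $OPENROUTER_BASE_URL/chat/completions.
print(len(json.dumps(body)) > 0)  # → True
```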

### System Architecture

- **Port 8000: Gateway (FastAPI)** - Intelligent router that directs traffic.
- **Port 8001: GPU Node (vLLM)** - High-speed primary inference.
- **Port 8002: CPU Node (Llama.cpp)** - Reliable fallback for the Dual Xeon.
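The gateway's failover policy can be sketched as a small routing function. The health check and node URLs here are illustrative stand-ins for whatever the real FastAPI router does, not its actual implementation:

```python
from typing import Callable

GPU_NODE = "http://localhost:8001"  # vLLM, primary
CPU_NODE = "http://localhost:8002"  # llama.cpp, fallback

def pick_backend(is_healthy: Callable[[str], bool]) -> str:
    """Route to the GPU node when it responds; otherwise fall back to CPU.

    The health check is injected so the policy can be tested without live nodes.
    """
    for node in (GPU_NODE, CPU_NODE):
        if is_healthy(node):
            return node
    raise RuntimeError("no inference backend available")

# During a GPU re-training cycle the primary health check fails,
# so traffic moves to the CPU node:
print(pick_backend(lambda node: node == CPU_NODE))  # → http://localhost:8002
```

This try-primary-then-fallback ordering is what gives the "zero-downtime" property described above: the CPU node absorbs traffic whenever the GPU node is retraining or restarting.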

## ⚠️ Disclaimer

BugTraceAI-CORE is intended for **legal, ethical hacking and educational purposes only**. The creators are not responsible for any misuse of this tool. Always ensure you have explicit permission before testing any system.

---

_Created as part of the BugTraceAI Ecosystem. Building a more secure web, one report at a time._
adapters/adapter_config.json ADDED
@@ -0,0 +1,50 @@
{
  "alora_invocation_tokens": null,
  "alpha_pattern": {},
  "arrow_config": null,
  "auto_mapping": {
    "base_model_class": "MistralForCausalLM",
    "parent_library": "transformers.models.mistral.modeling_mistral",
    "unsloth_fixed": true
  },
  "base_model_name_or_path": "unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit",
  "bias": "none",
  "corda_config": null,
  "ensure_weight_tying": false,
  "eva_config": null,
  "exclude_modules": null,
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layer_replication": null,
  "layers_pattern": null,
  "layers_to_transform": null,
  "loftq_config": {},
  "lora_alpha": 32,
  "lora_bias": false,
  "lora_dropout": 0,
  "megatron_config": null,
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
  "peft_version": "0.18.1",
  "qalora_group_size": 16,
  "r": 16,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "q_proj",
    "o_proj",
    "down_proj",
    "up_proj",
    "v_proj",
    "gate_proj",
    "k_proj"
  ],
  "target_parameters": null,
  "task_type": "CAUSAL_LM",
  "trainable_token_indices": null,
  "use_dora": false,
  "use_qalora": false,
  "use_rslora": false
}
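Two fields in this config determine how strongly the adapter perturbs the base model: with `use_rslora` false, PEFT scales the LoRA update by `lora_alpha / r` (rank-stabilized LoRA would use `lora_alpha / sqrt(r)` instead). A minimal sketch of that arithmetic using the values above:

```python
def lora_scaling(config: dict) -> float:
    """Effective LoRA scaling factor: alpha / r for standard LoRA,
    alpha / sqrt(r) when rsLoRA is enabled."""
    if config.get("use_rslora"):
        return config["lora_alpha"] / config["r"] ** 0.5
    return config["lora_alpha"] / config["r"]

# Values from adapters/adapter_config.json above: alpha=32, r=16, rsLoRA off.
print(lora_scaling({"lora_alpha": 32, "r": 16, "use_rslora": False}))  # → 2.0
```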
adapters/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:15c6eb93bd4415d5f977b54390ef2116bc1aab47da118e11f7f10c421efc2e18
size 228140600
adapters/chat_template.jinja ADDED
@@ -0,0 +1,87 @@
{%- if messages[0]["role"] == "system" %}
{%- set system_message = messages[0]["content"] %}
{%- set loop_messages = messages[1:] %}
{%- else %}
{%- set loop_messages = messages %}
{%- endif %}
{%- if not tools is defined %}
{%- set tools = none %}
{%- endif %}
{%- set user_messages = loop_messages | selectattr("role", "equalto", "user") | list %}

{#- This block checks for alternating user/assistant messages, skipping tool calling messages #}
{%- set ns = namespace() %}
{%- set ns.index = 0 %}
{%- for message in loop_messages %}
{%- if not (message.role == "tool" or message.role == "tool_results" or (message.tool_calls is defined and message.tool_calls is not none)) %}
{%- if (message["role"] == "user") != (ns.index % 2 == 0) %}
{{- raise_exception("After the optional system message, conversation roles must alternate user/assistant/user/assistant/...") }}
{%- endif %}
{%- set ns.index = ns.index + 1 %}
{%- endif %}
{%- endfor %}

{{- bos_token }}
{%- for message in loop_messages %}
{%- if message["role"] == "user" %}
{%- if tools is not none and (message == user_messages[-1]) %}
{{- "[AVAILABLE_TOOLS][" }}
{%- for tool in tools %}
{%- set tool = tool.function %}
{{- '{"type": "function", "function": {' }}
{%- for key, val in tool.items() if key != "return" %}
{%- if val is string %}
{{- '"' + key + '": "' + val + '"' }}
{%- else %}
{{- '"' + key + '": ' + val|tojson }}
{%- endif %}
{%- if not loop.last %}
{{- ", " }}
{%- endif %}
{%- endfor %}
{{- "}}" }}
{%- if not loop.last %}
{{- ", " }}
{%- else %}
{{- "]" }}
{%- endif %}
{%- endfor %}
{{- "[/AVAILABLE_TOOLS]" }}
{%- endif %}
{%- if loop.last and system_message is defined %}
{{- "[INST]" + system_message + "\n\n" + message["content"] + "[/INST]" }}
{%- else %}
{{- "[INST]" + message["content"] + "[/INST]" }}
{%- endif %}
{%- elif (message.tool_calls is defined and message.tool_calls is not none) %}
{{- "[TOOL_CALLS][" }}
{%- for tool_call in message.tool_calls %}
{%- set out = tool_call.function|tojson %}
{{- out[:-1] }}
{%- if not tool_call.id is defined or tool_call.id|length != 9 %}
{{- raise_exception("Tool call IDs should be alphanumeric strings with length 9!") }}
{%- endif %}
{{- ', "id": "' + tool_call.id + '"}' }}
{%- if not loop.last %}
{{- ", " }}
{%- else %}
{{- "]" + eos_token }}
{%- endif %}
{%- endfor %}
{%- elif message["role"] == "assistant" %}
{{- message["content"] + eos_token}}
{%- elif message["role"] == "tool_results" or message["role"] == "tool" %}
{%- if message.content is defined and message.content.content is defined %}
{%- set content = message.content.content %}
{%- else %}
{%- set content = message.content %}
{%- endif %}
{{- '[TOOL_RESULTS]{"content": ' + content|string + ", " }}
{%- if not message.tool_call_id is defined or message.tool_call_id|length != 9 %}
{{- raise_exception("Tool call IDs should be alphanumeric strings with length 9!") }}
{%- endif %}
{{- '"call_id": "' + message.tool_call_id + '"}[/TOOL_RESULTS]' }}
{%- else %}
{{- raise_exception("Only user and assistant roles are supported, with the exception of an initial optional system message!") }}
{%- endif %}
{%- endfor %}
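The template's role-alternation rule (after an optional leading system message, non-tool messages must alternate user/assistant starting with user) can be mirrored in plain Python, which is handy for validating a message list before rendering. This is a re-implementation for illustration, not code shipped with the adapter:

```python
def check_alternation(messages: list[dict]) -> None:
    """Mirror the chat template's check: skip tool messages, then require
    user/assistant alternation starting with a user message."""
    if messages and messages[0]["role"] == "system":
        messages = messages[1:]  # the optional system message is exempt
    index = 0
    for m in messages:
        is_tool = (m["role"] in ("tool", "tool_results")
                   or m.get("tool_calls") is not None)
        if is_tool:
            continue  # tool traffic does not break the alternation count
        if (m["role"] == "user") != (index % 2 == 0):
            raise ValueError(
                "After the optional system message, conversation roles must "
                "alternate user/assistant/user/assistant/...")
        index += 1

# A well-formed conversation passes silently:
check_alternation([
    {"role": "system", "content": "s"},
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
])
```

Catching a malformed history before calling `apply_chat_template` gives a clearer error than the exception raised mid-render.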
adapters/special_tokens_map.json ADDED
@@ -0,0 +1,30 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
adapters/tokenizer.json ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b0240ce510f08e6c2041724e9043e33be9d251d1e4a4d94eb68cd47b954b61d2
size 17078292
adapters/tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff
 
bugtraceai-core-pro.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1856d5a261e0a4f61f888341537f34deeb0bfdc576b8f1fc7eb0552c07e29d28
size 7477207520