Gemma-4-31B_Opus-Reasoning_BF16

This is a fine-tuned and merged version of the Gemma-4 31B model, trained on the high-quality reasoning dataset Crownelius/Opus-4.6-Reasoning-3300x.

The primary goal of this project was to leverage Gemma-4's native <|channel> architecture to enforce strict, step-by-step logical reasoning before a final answer is produced. By fusing the Gemma-4 foundation with the Opus reasoning dataset, the model acts as a deeply analytical agent capable of planning complex cloud deployments and performing logical deductions.

Reasoning Format: Gemma 4 Architecture

This model adheres strictly to the Gemma 4 multimodal and reasoning formats. It emits internal reasoning within <|channel>thought delimiters before delivering the final response.
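For downstream use, the thought channel can be separated from the final answer with plain string handling. A minimal sketch, assuming the delimiter strings render exactly as they appear in the sample transcript below (adjust them if your tokenizer decodes the channel tokens differently):

```python
# Split a raw completion into (reasoning, final answer).
# Delimiter strings are copied from the sample transcript on this card;
# they are an assumption about how the channel tokens decode to text.
THOUGHT_OPEN = "<|channel>thought"
THOUGHT_CLOSE = "<channel|>"

def split_reasoning(output: str) -> tuple[str, str]:
    """Return (thought, final_answer) from a raw model completion."""
    if THOUGHT_OPEN not in output:
        # No reasoning channel present: everything is the answer.
        return "", output.strip()
    _, _, rest = output.partition(THOUGHT_OPEN)
    thought, sep, final = rest.partition(THOUGHT_CLOSE)
    if not sep:
        # No closing marker: treat the remainder as unfinished reasoning.
        return thought.strip(), ""
    return thought.strip(), final.strip()
```

This keeps the reasoning available for logging or inspection while only the final answer is shown to the end user.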

Below is a raw, unedited sample from this merge:

User: I am building an automated AI training platform on AWS. I need to generate scalable Infrastructure-as-Code (IaC) templates to dynamically spin up GPU instances. Walk me through how I should decide between using standard Terraform versus the AWS Cloud Development Kit (CDK). Give me the pros and cons, and make a final recommendation assuming my platform logic is written in Python.

Model: <|channel>thought Let me work through this step by step.

Step 1: Understand the Requirements

  • The goal is to create Infrastructure-as-Code for an automated AI training platform... (reasoning truncated for brevity)
  • The platform logic is written in Python, so the IaC solution should integrate smoothly with that codebase. (Evaluates Terraform vs CDK)

Final Recommendation:

Since the platform is built on Python, and there is no immediate need for multi-cloud support, AWS CDK is the best choice...<channel|> Final Answer: For your automated AI training platform on AWS, I recommend using AWS CDK instead of Terraform. Here's why...


Benchmarks: ARC Challenge

Standard knowledge benchmarks often show minor regression during strict reasoning SFT; here the drop is only 0.34 points of acc_norm, while the structural output improvements are substantial.

  • Base (Gemma-4-31B): acc_norm: 69.88%
  • Merged (Opus Reasoning): acc_norm: 69.54%

Training Details

  • Base Model: google/gemma-4-31B
  • Dataset: Crownelius/Opus-4.6-Reasoning-3300x
  • Training Framework: Eschaton Engine (Cloudbjorn)
  • Format: Merged (Base + LoRA)

Training Precision:

  • Compute Dtype: bfloat16

LoRA Parameters (Auto-Scaled for 31B via Eschaton):

  • r: 16
  • lora_alpha: 32
  • target_modules: all-linear
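The adapter settings above map directly onto a PEFT `LoraConfig`. A minimal sketch, assuming the Hugging Face `peft` library rather than the Eschaton Engine's internal API; the `task_type` is an assumption for a causal-LM fine-tune:

```python
from peft import LoraConfig

# Mirrors the auto-scaled values listed above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",  # PEFT shorthand for every linear layer
    task_type="CAUSAL_LM",        # assumption: causal language modeling
)
```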

Hyperparameters:

  • Optimizer: 8-bit Paged AdamW
  • Effective Batch Size: 32 (via gradient accumulation)
  • Learning Rate: 2e-5
  • LR Scheduler: Linear
  • Epochs: 1
  • Training Sequence Length: 2048
  • Warmup Steps: 50
  • Weight Decay: 0.01
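The hyperparameters above can be expressed as Hugging Face `TrainingArguments`; a minimal sketch, noting that the per-device batch size / accumulation split and the output path are assumptions — only the effective batch size of 32 is stated on this card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-4-31B-opus-reasoning",  # hypothetical path
    per_device_train_batch_size=4,            # assumption
    gradient_accumulation_steps=8,            # 4 * 8 = effective batch 32
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    warmup_steps=50,
    weight_decay=0.01,
    optim="paged_adamw_8bit",  # 8-bit Paged AdamW
    bf16=True,                 # bfloat16 compute dtype
)
```

The sequence length of 2048 is handled at the data-collation/tokenization stage rather than in `TrainingArguments`.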
