Model Card for telecomadm1145/Qwen3-8B-Novel-Merged

This is an instruction fine-tune of the Qwen/Qwen3-8B model, specialized for creative writing tasks (novels, stories, and similar) primarily in Chinese.

Model Details

Model Description

  • Developed by: telecomadm1145
  • Funded by: None.
  • Model type: Causal Language Model
  • Language(s) (NLP): English, Chinese
  • License: The base model Qwen/Qwen3-8B uses the Tongyi Qianwen LICENSE AGREEMENT. This adapter is likely subject to the same license terms. Please refer to the base model's license for details.
  • Finetuned from model: Qwen/Qwen3-8B

Uses

This model can be used directly for text generation, especially in creative contexts. Potential uses include:

  • Generating story openings, plot developments, or endings.
  • Composing poems, lyrics, or short stories.
  • Serving as a source of inspiration to overcome writer's block.
  • Role-playing and dialogue generation.
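A minimal sketch of prompting the model for one of the uses above. It assumes the standard Qwen ChatML prompt format; in practice you would load the model with the transformers library and use `tokenizer.apply_chat_template` rather than building the string by hand.

```python
# Sketch only: builds a Qwen-style ChatML prompt for a story-writing request.
# The special tokens below follow the ChatML convention used by Qwen models.

def build_chat_prompt(user_message: str) -> str:
    """Wrap a single user turn in ChatML markers and open the assistant turn."""
    return (
        "<|im_start|>user\n"
        f"{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_chat_prompt("请写一个武侠小说的开头。")

# With transformers, the equivalent (untested here) would be roughly:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("telecomadm1145/Qwen3-8B-Novel-Merged")
# model = AutoModelForCausalLM.from_pretrained("telecomadm1145/Qwen3-8B-Novel-Merged")
# out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=512)
```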

Downstream Use

This model is suitable as a foundation for downstream applications, such as:

  • Creative Writing Assistant Apps: Integrate into writing software to provide real-time suggestions and content generation.
  • Game Development: Generate in-game character dialogue, quest descriptions, and lore.
  • Interactive Narratives: Build text-based adventure games or interactive fiction.
  • Content Marketing: Automatically generate creative ad copy or social media posts.

Out-of-Scope Use

This model should not be used for:

  • Generating factually accurate content (e.g., news articles, scientific papers, medical or legal advice).
  • Any malicious purposes, including generating hate speech, discriminatory content, or disinformation.
  • Critical decision-making systems without rigorous evaluation and safety measures.
  • Tasks requiring high-fidelity logical reasoning, such as complex programming or mathematical problem-solving.

Bias, Risks, and Limitations

  • Inherited Bias: The model was trained on the telecomadm1145/creative_writing dataset and may learn and amplify biases present in the data, such as stereotypes in character portrayal or cultural depictions.
  • Factual Unreliability: The generated content is fictional and should never be treated as factual information.
  • Risk of Harmful Content: Although the base model has undergone safety alignment, it may still be possible to prompt the model to generate inappropriate or offensive content.
  • Limited Stylistic Range: The model's writing style is heavily influenced by its training data and may not cover all genres or styles of creative writing.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. It is recommended to implement content filtering and safety mechanisms when deploying this model in user-facing applications and to clearly disclose that the content is AI-generated.

Training Details

Training Data

This model was fine-tuned on the telecomadm1145/creative_writing dataset. The dataset consists of diverse creative texts, such as novels and stories, designed to enhance the model's narrative and imaginative capabilities. The raw data was processed using a GPT-based model to convert it into high-quality instruction-response pairs for supervised fine-tuning.
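The GPT-based conversion pipeline and its prompt are not published, so the template below is purely illustrative of the kind of step described above: asking a stronger model to invent an instruction that a given novel excerpt would answer.

```python
# Hypothetical sketch of building a conversion request; the prompt wording
# and function name are assumptions, not the actual pipeline.

CONVERSION_PROMPT = (
    "Read the following excerpt from a novel and write a writing instruction "
    "to which this excerpt would be a good response.\n\nExcerpt:\n{excerpt}"
)

def build_conversion_request(excerpt: str) -> str:
    """Fill the conversion template with one raw text excerpt."""
    return CONVERSION_PROMPT.format(excerpt=excerpt)

# The returned string would then be sent to a GPT-based model, and the
# (instruction, excerpt) pair stored for supervised fine-tuning.
```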

Training Procedure

The model was trained using the AdaLoRA method from the Hugging Face peft library. AdaLoRA is an adaptive version of LoRA that dynamically allocates ranks to weight matrices based on their importance, enabling efficient fine-tuning with fewer parameters.
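A hedged sketch of what an AdaLoRA setup with peft looks like. The exact ranks, target modules, and rank-allocation schedule used for this model are not published, so every value below is an assumption for illustration only.

```python
# Illustrative AdaLoRA configuration with the peft library; all numbers
# are assumed, not the values used to train this model.
from peft import AdaLoraConfig, TaskType

config = AdaLoraConfig(
    task_type=TaskType.CAUSAL_LM,
    init_r=12,           # initial rank of each LoRA matrix before pruning
    target_r=8,          # average target rank after adaptive reallocation
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    tinit=200,           # warm-up steps before rank pruning starts
    tfinal=500,          # final steps trained with the ranks frozen
    deltaT=10,           # interval (in steps) between rank-budget updates
    total_step=3000,     # total optimizer steps, required by recent peft
)
# model = get_peft_model(base_model, config)
```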

Preprocessing

The training data was formatted into instruction-response pairs. During training, the loss was computed only on the response (completion) tokens; the instruction tokens were excluded from the loss (typically by setting their labels to the ignore index), so the model learns to generate the desired output rather than to reproduce the prompt.
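The masking step can be sketched as follows, assuming tokenized inputs and the `-100` ignore index that PyTorch's cross-entropy loss (and the Hugging Face Trainer) use by default; the function name is illustrative.

```python
# Sketch of completion-only loss masking: labels for the instruction part
# are replaced with the ignore index so they contribute no gradient.

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss

def mask_prompt_tokens(input_ids, prompt_len):
    """Copy input_ids into labels, hiding the first prompt_len tokens."""
    labels = list(input_ids)
    for i in range(min(prompt_len, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

labels = mask_prompt_tokens([101, 102, 103, 104, 105], prompt_len=2)
# labels -> [-100, -100, 103, 104, 105]: loss is taken on the response only
```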

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • PEFT Method: AdaLoRA
  • Epochs: 3
  • Learning rate: 1e-4
  • LR Scheduler: linear

Speeds, Sizes, Times

  • The model was trained for approximately 6 hours on a single NVIDIA A100 40GB GPU.

Evaluation

No formal evaluation or testing has been conducted on this model.

Testing Data, Factors & Metrics

Testing Data

No formal testing data was used. The model has not been evaluated on any standard benchmarks.

Results

No quantitative results are available as the model has not been formally evaluated. Its performance is best assessed qualitatively by generating text for creative writing tasks.

Summary

No formal evaluation summary is available.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: 1 x NVIDIA A100 40GB
  • Hours used: ~6 hours
  • Cloud Provider: Google Colab
  • Compute Region: N/A
  • Carbon Emitted: 0.84 kg CO2 eq.
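As a sanity check, the reported figure can be reproduced with the usual back-of-the-envelope formula (power × hours × grid carbon intensity). The power draw and carbon intensity below are assumed values chosen to match the report, not measurements from this training run.

```python
# Rough carbon estimate; POWER_KW and CARBON_INTENSITY are assumptions,
# HOURS is the reported training time.

POWER_KW = 0.25          # assumed average A100 draw in kW
HOURS = 6.0              # reported training time
CARBON_INTENSITY = 0.56  # assumed kg CO2 eq per kWh (grid-dependent)

emissions_kg = POWER_KW * HOURS * CARBON_INTENSITY  # ≈ 0.84 kg CO2 eq
```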

Technical Specifications

Hardware

  • NVIDIA A100 40GB

Software

  • PyTorch
  • Transformers
  • PEFT

Framework versions

  • PEFT 0.17.1