FinBloom 7B is a large language model specialized for financial NLP. Built on the BLOOM-7B architecture, it was fine-tuned on a large corpus of historical financial text, including news archives from Reuters and DPA and more than a decade of official SEC filings. This targeted training equips the model for downstream financial analysis and domain-specific reasoning.
## Datasets used for fine-tuning
| Dataset | Documents | Mean Words/Doc | Mean Tokens/Doc | Time Period |
|---|---|---|---|---|
| Reuters news | 14,574,641 | 369.23 | 459.09 | 1 Jan 2003 – 31 Dec 2012 |
| DPA news | 387,187 | 286.20 | 390.37 | 1 Jun 2001 – 31 May 2011 |
| SEC filings | 12,238,570 | 379.96 | 536.56 | 31 Mar 2009 – 31 Oct 2023 |
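The mean word counts in the table are per-document averages over each corpus. A minimal sketch of how such a statistic is computed (whitespace tokenization only; the mean token counts would additionally require the BLOOM tokenizer, which is not used here):

```python
def mean_words(documents):
    """Mean whitespace-delimited word count over a list of document strings."""
    counts = [len(doc.split()) for doc in documents]
    return sum(counts) / len(counts)

# Two toy documents standing in for news articles (illustrative only)
docs = [
    "Shares of the company rose three percent on Tuesday.",
    "The filing disclosed several new risk factors.",
]
print(mean_words(docs))  # 8.0
```

At the scale of the corpora above, the same average would be accumulated in a streaming pass rather than by materializing all counts in memory.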
## How to Get Started with the Model
Use the code below to get started with the model. FinBloom 7B is distributed as a PEFT adapter on top of the bigscience/bloom-7b1 base model:

```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the adapter configuration, which records the base model to use
peft_model_id = "Chaitanya14/FinBloom_7B"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base BLOOM-7B tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)

# Attach the FinBloom 7B adapter weights to the base model
model = PeftModel.from_pretrained(model, peft_model_id)

# Example generation (the prompt and settings here are illustrative, not prescribed)
inputs = tokenizer("Q1 revenue rose 12% year over year, driven by", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Citation

If you use FinBloom 7B, please cite it with the following BibTeX entry:
@article{SINHA2026115559,
title = {FinBloom: Knowledge-Grounding Large Language Model with Real-Time Financial Data},
journal = {Knowledge-Based Systems},
volume = {339},
pages = {115559},
year = {2026},
issn = {0950-7051},
doi = {10.1016/j.knosys.2026.115559},
url = {https://www.sciencedirect.com/science/article/pii/S0950705126003011},
author = {Ankur Sinha and Chaitanya Agarwal and Pekka Malo},
keywords = {Financial large language model, Generative pre-trained transformer, Knowledge grounding, Natural language processing},
abstract = {Large language models (LLMs) excel at generating human-like responses but often struggle with interactive tasks that require access to real-time information. This limitation poses challenges in finance, where models must access up-to-date information, such as recent news or price movements, to support decision-making. To address this, we introduce Financial Agent, a knowledge-grounding approach for LLMs to handle financial queries using real-time text and tabular data. Our contributions are threefold: First, we develop a Financial Context Dataset of over 50,000 financial queries paired with the required context. Second, we develop FinBloom 7B, a custom 7 billion parameter LLM, by fine-tuning Bloom 7B on 14 million financial news articles from Reuters and Deutsche Presse-Agentur (DPA), alongside a random sample of 25% from 12 million Securities and Exchange Commission (SEC) filings. Third, we fine-tune FinBloom 7B using the Financial Context Dataset to serve as a Financial Agent. This agent generates relevant financial context, enabling efficient real-time data retrieval to answer user queries. By reducing latency and eliminating the need for users to manually provide accurate data, our approach significantly enhances the capability of LLMs to handle dynamic financial tasks. Our proposed approach makes real-time financial decisions, algorithmic trading and other related tasks streamlined, and is valuable in contexts with high-velocity data flows.}
}
## Model tree

Base model: bigscience/bloom-7b1