---
license: cc-by-nc-4.0
language:
- en
base_model:
- deepseek-ai/deepseek-coder-1.3b-instruct
- google/siglip-so400m-patch14-384
pipeline_tag: image-text-to-text
---
# Aligned Multi-View Scripts for Universal Chart-to-Code Generation
CharLuMA-1.3B generates a plotting script in Python, R, or LaTeX from a chart image. This 1.3B variant is from the paper *"Aligned Multi-View Scripts for Universal Chart-to-Code Generation"* (ACL 2026 Main Conference).
- **Paper:** https://arxiv.org/abs/2604.24559
- **Code (required for inference):** https://github.com/zhihan72/CharLuMA
- **6.7B Variant:** [CharLuMA-6.7B](https://huggingface.co/Zhihan/CharLuMA-6.7B)
- **Training data:** [Chart2NCode](https://huggingface.co/datasets/Zhihan/Chart2NCode)
| Backbone | Vision encoder | Output languages | dtype |
|---|---|---|---|
| DeepSeek-Coder-1.3B-Instruct | SigLIP-SO400M-patch14-384 | Python, R, LaTeX | bfloat16 |
## Usage
This model uses a custom architecture, so `AutoModel.from_pretrained` will not load it.
Clone the codebase and use its loader:
```bash
git clone https://github.com/Zhihan72/CharLuMA
```
Before loading, replace the placeholder paths in `config.json` (`/your_local_path/...`) with `deepseek-ai/deepseek-coder-1.3b-instruct` (or a local copy) and `google/siglip-so400m-patch14-384` (or a local copy).
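If you prefer not to edit `config.json` by hand, the placeholder replacement can be scripted. The sketch below is only illustrative: the key names inside `config.json` are not assumed — it recursively rewrites any string value starting with `/your_local_path/`, and the mapping from placeholder to checkpoint ID is a guess you should adapt to your setup.

```python
import json
from pathlib import Path

# Hypothetical placeholder -> checkpoint mapping; adjust to your environment
# (e.g. point at local copies of the checkpoints instead of Hub IDs).
REPLACEMENTS = {
    "deepseek": "deepseek-ai/deepseek-coder-1.3b-instruct",
    "siglip": "google/siglip-so400m-patch14-384",
}

def patch_placeholders(obj):
    """Recursively replace '/your_local_path/...' strings in a parsed config."""
    if isinstance(obj, dict):
        return {k: patch_placeholders(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [patch_placeholders(v) for v in obj]
    if isinstance(obj, str) and obj.startswith("/your_local_path/"):
        tail = obj.lower()
        for hint, target in REPLACEMENTS.items():
            if hint in tail:
                return target
        return obj  # leave unrecognized placeholders untouched
    return obj

def patch_config(path="config.json"):
    """Rewrite placeholder paths in config.json in place."""
    p = Path(path)
    cfg = json.loads(p.read_text())
    p.write_text(json.dumps(patch_placeholders(cfg), indent=2))
```

Run `patch_config()` once from the cloned repo's model directory before loading.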
See `scripts/inference_charluma.py` in the repo for an end-to-end inference example.
## Citation
If you find our work useful, consider citing our paper as follows:
```bibtex
@misc{zhang2026aligned,
title = {Aligned Multi-View Scripts for Universal Chart-to-Code Generation},
author = {Zhihan Zhang and Lizi Liao},
year = {2026},
eprint = {2604.24559},
archivePrefix = {arXiv},
primaryClass = {cs.CL},
url = {https://arxiv.org/abs/2604.24559}
}
```