LineFormer Fine-tuned on Battery Discharge Curves
This model is a fine-tuned version of LineFormer specialized for detecting and segmenting line charts in battery charge/discharge curve figures from scientific papers.
Model Description
LineFormer is an instance segmentation model designed to detect individual lines in scientific plots. This fine-tuned version has been optimized for battery electrochemistry figures, particularly charge/discharge curves with multiple C-rates.
Base Model: LineFormer (pre-trained on scientific figure datasets) Fine-tuning Dataset: 62 battery charge/discharge curve images from scientific papers Validation Dataset: 19 battery curve images Framework: MMDetection (Mask R-CNN based)
Performance
Evaluated using the original LineFormer evaluation methodology (ICDAR 2023 Task 6a/6b), which matches predicted lines to ground truth lines using linear interpolation and the Hungarian algorithm.
Task 6a / 6b Scores (Original Paper Metrics)
| Model | Task 6a | Task 6b | GT Lines | Detected | Over-detection |
|---|---|---|---|---|---|
| Pre-trained | 0.9471 | 0.6835 | 146 | 237 | +62.3% |
| Fine-tuned Best (iter_1300) | 0.9180 | 0.7394 | 146 | 179 | +22.6% |
| Fine-tuned Final (iter_5000) | 0.9097 | 0.7836 | 146 | 160 | +9.6% |
- Task 6a: Measures how well each GT line is matched (no penalty for extra detections)
- Task 6b: Penalizes over-detection โ the more practical metric
Key improvements (Pre-trained โ Fine-tuned iter_5000):
- Task 6b (practical metric): 0.6835 โ 0.7836 (+14.6%)
- Over-detection reduced from +62.3% to +9.6%
- Reduced false positives from text annotations, legends, and axis labels
- Better handling of densely packed multi-cycle curves
Additional Metrics
| Model | segm_mAP | Confidence Score |
|---|---|---|
| Pre-trained | 0.0395 | 0.791 |
| Fine-tuned Best (iter_1300) | 0.1587 | 0.872 |
| Fine-tuned Final (iter_5000) | 0.1451 | 0.896 |
Usage
Requirements
pip install mmdet mmcv torch torchvision
Inference
from mmdet.apis import init_detector, inference_detector
import mmcv
# Load model
config_file = 'configs/battery_finetune.py'
checkpoint_file = 'lineformer_battery_iter_5000.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
# Inference
img = mmcv.imread('battery_curve.png')
result = inference_detector(model, img)
# Visualize
model.show_result(img, result, out_file='result.jpg', score_thr=0.5)
Model Files
We provide two checkpoints with different strengths:
| Model | Task 6b | segm_mAP | Over-detection | Best For |
|---|---|---|---|---|
| lineformer_battery_iter_5000.pth | 0.7836 | 0.1451 | +9.6% | Data extraction accuracy (recommended) |
| lineformer_battery_best_iter_1300.pth | 0.7394 | 0.1587 | +22.6% | Mask shape precision |
Which model to use?
- For most use cases: Use
lineformer_battery_iter_5000.pthโ highest Task 6b score (0.7836) with fewest false positives - For mask segmentation tasks: Use
lineformer_battery_best_iter_1300.pthif you need precise mask boundaries (higher mAP)
Training Details
Training Hyperparameters
- Optimizer: SGD (lr=0.0001, momentum=0.9, weight_decay=0.0001)
- Learning Rate Schedule: Step decay at iterations [3500, 4750]
- Total Iterations: 5000
- Batch Size: 2
- Image Size: Resized to shorter edge 800px
- Data Augmentation: RandomFlip (flip_ratio=0.5)
- Backbone: ResNet-50 (frozen, pre-trained on ImageNet)
Training Data
- Train: 62 images (battery charge/discharge curves from scientific papers)
- Validation: 19 images
- Line Width: 10px (for mask generation)
- Format: COCO instance segmentation format
Training Metrics
| Iteration | mAP | mAP_50 | mAP_75 |
|---|---|---|---|
| 500 | 0.1156 | 0.2852 | 0.0005 |
| 1300 | 0.1587 | 0.3705 | 0.0006 |
| 5000 | 0.1451 | 0.3503 | 0.0004 |
Note: mAP_75 โ 0 indicates that mask boundary precision needs improvement, though line detection count is accurate.
Limitations and Challenges
The model still struggles with:
- Dense multi-cycle curves (e.g., 30-cycle plots): GT=4 lines โ Detected=14-17 lines
- Dashed lines and annotations: Non-line elements (arrows, dimension lines) may be detected
- Mask IoU precision: Low mAP_75 suggests mask shape accuracy needs improvement
- GT definition mismatch: Some images count charge/discharge pairs as single lines vs. separate lines
Intended Use
This model is designed for:
- Automated extraction of battery performance data from scientific literature
- Digitization of charge/discharge curves for meta-analysis
- Pre-processing battery electrochemistry figures for data mining
Not recommended for:
- General line chart detection (use original LineFormer)
- Real-time applications requiring <100ms inference
- Figures with extreme aspect ratios or unconventional layouts
Citation
If you use this model, please cite the original LineFormer paper:
@inproceedings{Xia_2022_WACV,
title={LineFormer: Rethinking Line Chart Data Extraction as Instance Segmentation},
author={Xia, Weixin and Lo, Kelvin and Chao, Qing and Li, Tong and Zhao, Jian},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
year={2022}
}
License
Apache 2.0 (same as base LineFormer model)
Model Card Authors
Fine-tuning and evaluation by t29mato
Model Card Contact
For questions about this fine-tuned model, please open an issue in the repository.