| --- |
| language: en |
| license: apache-2.0 |
| base_model: black-forest-labs/FLUX.2-klein-base-4B |
| library_name: diffusers |
| tags: |
| - rppg |
| - remote-photoplethysmography |
| - cardiac-signal |
| - lora |
| - flux2 |
| - vision-banana |
| - arxiv:2604.20329 |
| pipeline_tag: image-to-image |
| --- |
| |
| # pulse-plantain |
|
|
| A LoRA adapter on FLUX.2 Klein 4B for remote photoplethysmography. Given a short clip of a face, the model emits a per-pixel pulse-color field decodable to instantaneous heart rate (BPM). Companion to *Image Generators are Generalist Vision Learners* (Gabeur et al., 2026; [arXiv:2604.20329](https://arxiv.org/abs/2604.20329)). |
|
|
| ## Hypothesis |
|
|
| Remote photoplethysmography (rPPG) recovers cardiac signals from sub-perceptual color variation in skin pixels. The literature spans more than fifteen years and is dominated by hand-engineered pipelines that combine skin-region detection, RGB-to-chrominance projection, bandpass filtering, and blind source separation (Verkruysse et al. 2008; Poh et al. 2010), joined more recently by end-to-end CNNs (DeepPhys, MTTS-CAN). All of these approaches impose explicit signal-processing inductive biases.
|
|
| The test here is whether a general image generator carries enough latent representation of skin micro-color dynamics that a parameter-efficient adapter can produce the per-pixel pulse field directly, without an explicit bandpass-and-ICA pipeline. The output is a 2D color visualization that a fixed bijection inverts back to a blood volume pulse (BVP) trace, and from it a metric BPM estimate (analogous to deep-plantain's metric-depth encoding).
|
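As an illustration of what a fixed, invertible color encoding could look like, the sketch below maps a (phase, amplitude) pair to RGB through HSV: phase sets the hue and amplitude sets the value. This is an assumed stand-in, not the encoding that pulse-plantain or deep-plantain actually uses. The names `encode_pulse` and `decode_pulse` are hypothetical, and the mapping is only invertible for phase in [0, 2π) and amplitude in (0, 1].

```python
import colorsys
import math

TWO_PI = 2.0 * math.pi


def encode_pulse(phase, amplitude):
    """Map (phase in radians, amplitude in (0, 1]) to an RGB triple in [0, 1]."""
    if not 0.0 < amplitude <= 1.0:
        raise ValueError("amplitude must lie in (0, 1] for the mapping to be invertible")
    hue = (phase % TWO_PI) / TWO_PI  # wrap the phase onto the hue circle
    return colorsys.hsv_to_rgb(hue, 1.0, amplitude)  # full saturation, value = amplitude


def decode_pulse(rgb):
    """Invert encode_pulse: recover (phase, amplitude) from an RGB triple."""
    hue, _saturation, value = colorsys.rgb_to_hsv(*rgb)
    return hue * TWO_PI, value


# Round trip for a single pixel (assumed example values).
rgb = encode_pulse(math.pi / 2.0, 0.8)
phase, amplitude = decode_pulse(rgb)
```

Quantizing the RGB values to 8 bits would bound the recoverable precision, which any practical encoding of this kind would have to account for.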
|
| ## Method |
|
|
| Input: a short RGB face clip (5–10 seconds at standard frame rate). Output: a per-pixel pulse-color visualization frame in which color encodes the instantaneous phase and amplitude of the BVP signal over the local skin region. Decoding inverts the bijection to recover a per-region BVP trace, from which BPM is estimated by peak counting or an FFT.
|
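The last decoding step, going from a recovered BVP trace to a BPM figure, can be sketched as follows. The example evaluates the Fourier sum directly on a grid of candidate heart rates instead of calling an FFT routine, which keeps it dependency-free; with NumPy available, `numpy.fft.rfft` would be the more usual choice. The function name `estimate_bpm`, the 42 to 240 BPM band, the 0.5 BPM grid, and the 30 fps synthetic trace are all illustrative assumptions, not part of a released pipeline.

```python
import cmath
import math


def estimate_bpm(trace, fps, lo_bpm=42.0, hi_bpm=240.0, step_bpm=0.5):
    """Return the candidate BPM whose frequency carries the most spectral power."""
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]  # remove the DC offset

    best_bpm, best_power = lo_bpm, -1.0
    steps = int(round((hi_bpm - lo_bpm) / step_bpm))
    for k in range(steps + 1):
        bpm = lo_bpm + k * step_bpm
        omega = 2.0 * math.pi * (bpm / 60.0) / fps  # radians per sample
        coeff = sum(x * cmath.exp(-1j * omega * t) for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic check: a 10 s trace at 30 fps containing a 1.2 Hz (72 BPM) pulse.
fps = 30.0
trace = [math.sin(2.0 * math.pi * 1.2 * t / fps) for t in range(int(10 * fps))]
bpm = estimate_bpm(trace, fps)  # should land within one grid step of 72 BPM
```

Restricting the search to a physiological band plays the role of the bandpass filter in a classical pipeline; on real traces, detrending and windowing before the spectral scan would likely matter more than they do for this clean sinusoid.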
|
| Training data is drawn from public rPPG datasets with synchronized ground-truth BVP (UBFC-rPPG, PURE, COHFACE, MMPD). Each training example pairs (a) a windowed face clip with (b) the corresponding ground-truth BVP trace rendered as the bijection-encoded RGB target. Evaluation reports BPM mean absolute error on held-out subjects.
|
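A minimal sketch of the evaluation protocol described above: clips are split by subject so that no held-out person appears in training, and BPM mean absolute error is computed on the held-out clips. The record layout (dicts with `subject` and `reference_bpm` keys) and the helper names are hypothetical, and every number below is a made-up placeholder, not a result.

```python
def split_by_subject(clips, held_out_subjects):
    """Split clips so that no held-out subject appears in the training set."""
    held_out = set(held_out_subjects)
    train = [clip for clip in clips if clip["subject"] not in held_out]
    test = [clip for clip in clips if clip["subject"] in held_out]
    return train, test


def bpm_mae(predicted, reference):
    """Mean absolute error between predicted and reference BPM values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Placeholder records (made-up values, not measurements from any dataset).
clips = [
    {"subject": "s01", "reference_bpm": 72.0},
    {"subject": "s02", "reference_bpm": 66.0},
    {"subject": "s03", "reference_bpm": 85.0},
]
train, test = split_by_subject(clips, held_out_subjects=["s02", "s03"])

predicted = [64.5, 88.0]  # hypothetical model outputs for the two test clips
reference = [clip["reference_bpm"] for clip in test]
mae = bpm_mae(predicted, reference)  # (1.5 + 3.0) / 2 = 2.25
```

Splitting at the subject level, instead of at the clip level, keeps windows from the same face out of both sets, which is what "held-out subjects" implies.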
|
| ## Status |
|
|
| Placeholder. Weights and training data forthcoming. |
|
|
| ## License |
|
|
| Apache 2.0 — matches base FLUX.2 Klein 4B. |
|
|
| ## References |
|
|
| - Gabeur, V., Long, S., Peng, S., et al. *Image Generators are Generalist Vision Learners.* [arXiv:2604.20329](https://arxiv.org/abs/2604.20329) (2026). |
| - Verkruysse, W., Svaasand, L. O., Nelson, J. S. *Remote plethysmographic imaging using ambient light.* Optics Express (2008). |
| - Poh, M.-Z., McDuff, D. J., Picard, R. W. *Non-contact, automated cardiac pulse measurements using video imaging and blind source separation.* Optics Express (2010). |
| - Liu, X., Fromm, J., Patel, S., McDuff, D. *Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement.* NeurIPS (2020).
| - Black Forest Labs. *FLUX.2 Klein.* https://bfl.ai/models/flux-2-klein (2025). |
|
|