update README.md
--- a/README.md
+++ b/README.md
@@ -47,7 +47,7 @@ The framework leverages advanced diffusion models and transformer architectures
 - [ ] Release our unified creation model.
 - [ ] Release training code.
 
-## 🏞️ Show
+## 🏞️ Show Cases
 **Results of generation tasks.** We show two generation tasks under our unified model. The top section presents text-to-image results, illustrating high-fidelity synthesis across diverse styles. The bottom rows show text-to-video results, demonstrating temporally coherent generation with natural motion for both realistic and stylized content.
 <p align="center">
 <img src="./assets/misc/gen_teaser.png" style="width: 100%; height: auto;"/>
@@ -63,7 +63,7 @@ The framework leverages advanced diffusion models and transformer architectures
 <img src="./assets/misc/videoedit_teaser5.png" style="width: 100%; height: auto;"/>
 </p>
 
-**Results of in-context visual creation.** We show in-context generation and in-context editing results
+**Results of in-context visual creation.** We show in-context generation and in-context editing results, including subject-conditioned generation (S2V/S2I), conditional generation (C2V), image-to-video (I2V), reference-driven editing (II2I/IV2V).
 <p align="center">
 <img src="./assets/misc/incontext_teaser2.png" style="width: 100%; height: auto;"/>
 </p>
@@ -212,8 +212,8 @@ For editing tasks (TI2I / TV2V), prepare a CSV with `img_path`/`video_path` and
 
 ```csv
 img_path,instruction
-img1.jpeg,
-img2.jpeg,
+img1.jpeg,instruction1.
+img2.jpeg,instruction2.
 ```
 
 > The path column holds relative paths to media files (images or videos) under the data root directory.
@@ -268,7 +268,7 @@ ln -s /path/to/Capybara /path/to/ComfyUI/custom_nodes/Capybara
 | **Capybara Load Rewrite Model** | Load Qwen3-VL for prompt rewriting |
 | **Capybara Rewrite Instruction** | Expand short prompts into detailed instructions |
 
-A sample workflow is provided in [`comfyui/examples/`](comfyui/examples
+A sample workflow is provided in [`comfyui/examples/`](https://github.com/xgen-universe/Capybara/tree/main/comfyui/examples). For setup details and node documentation, see the [ComfyUI README](https://github.com/xgen-universe/Capybara/tree/main/comfyui).
 
 ## ⚙️ Configuration Details
 
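The CSV layout for editing tasks shown in the diff (an `img_path` column followed by an `instruction` column) could be generated with a short script rather than written by hand. The sketch below is illustrative only: the helper name `build_instruction_csv` and the empty-instruction check are assumptions made for this example, not part of the Capybara repository. It relies only on Python's standard `csv` module, which quotes any instruction that contains a comma.

```python
import csv
import io


def build_instruction_csv(rows, path_column="img_path"):
    """Return CSV text with a path column and an `instruction` column.

    `rows` is an iterable of (relative_media_path, instruction) pairs.
    Hypothetical helper for illustration; not a Capybara API.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([path_column, "instruction"])
    for media_path, instruction in rows:
        # Assumption: every row should carry a non-empty instruction.
        if not instruction.strip():
            raise ValueError(f"empty instruction for {media_path}")
        writer.writerow([media_path, instruction])
    return buf.getvalue()


csv_text = build_instruction_csv([
    ("img1.jpeg", "instruction1."),
    ("img2.jpeg", "instruction2."),
])
print(csv_text)
```

For video editing inputs, the same helper could be called with `path_column="video_path"`, matching the `img_path`/`video_path` naming mentioned in the hunk header.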