Image Augmentation
==================
|
|
Image augmentation is a data augmentation method that generates more training data
from the existing training samples. Image augmentation is especially useful in domains
where training data is limited or expensive to obtain, such as biomedical applications.
|
|
.. image:: https://github.com/kornia/data/raw/main/girona_aug.png
   :align: center
|
|
Learn more: `https://paperswithcode.com/task/image-augmentation <https://paperswithcode.com/task/image-augmentation>`_
|
|
Kornia Augmentations
--------------------
|
|
Kornia provides differentiable, GPU-accelerated image data augmentation through the module `kornia.augmentation <https://kornia.readthedocs.io/en/latest/augmentation.html>`_,
implemented so that it can be easily used with `torch.nn.Sequential <https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html?highlight=sequential#torch.nn.Sequential>`_
and other advanced containers such as
:py:class:`~kornia.augmentation.container.AugmentationSequential`,
:py:class:`~kornia.augmentation.container.ImageSequential`,
:py:class:`~kornia.augmentation.container.PatchSequential` and
:py:class:`~kornia.augmentation.container.VideoSequential`.
|
|
Our augmentation package is strongly inspired by the torchvision augmentation API, although our intention is not to replace it.
Kornia aligns more closely with OpenCV functionality: it enforces floating-point operations to guarantee better precision,
avoiding any float -> uint8 conversions, and adds on-device acceleration.
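To see why staying in floating point matters, here is a minimal pure-Python sketch (independent of Kornia; the helper names are our own) of the quantization error that a single float -> uint8 -> float round trip introduces:

```python
# Quantization error from a float -> uint8 -> float round trip.
# Pure-Python illustration of why Kornia keeps images in floating point.

def to_uint8(x: float) -> int:
    """Quantize a [0, 1] intensity to an 8-bit value."""
    return round(x * 255)

def to_float(v: int) -> float:
    """Map an 8-bit value back to [0, 1]."""
    return v / 255.0

x = 0.123                       # an arbitrary float intensity
x_rt = to_float(to_uint8(x))    # value after a uint8 round trip

err = abs(x - x_rt)
# Each round trip loses up to 1/510 (about 0.002) per pixel; chaining one
# conversion per augmentation compounds this error across the pipeline.
print(f"round-trip error: {err:.6f}")
```

Keeping the whole pipeline in float tensors avoids paying this error at every stage.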
|
|
However, we provide the following guide to migrate between kornia and torchvision. Please check out the `Colab: Kornia Playground <https://colab.research.google.com/drive/1T20UNAG4SdlE2n2wstuhiewve5Q81VpS#revisionId=0B4unZG1uMc-WR3NVeTBDcmRwN0NxcGNNVlUwUldPMVprb1dJPQ>`_.
|
|
.. code-block:: python

    import kornia.augmentation as K
    import torch.nn as nn

    transform = nn.Sequential(
        K.RandomAffine(360),
        K.ColorJitter(0.2, 0.3, 0.2, 0.3)
    )
|
|
|
|
Best Practices 1: Image Augmentation
++++++++++++++++++++++++++++++++++++
|
|
Kornia augmentations provide a simple on-device augmentation framework with support for various syntactic sugar
(e.g. returning the transformation matrix, inverting a geometric transform). We therefore provide the advanced augmentation
container :py:class:`~kornia.augmentation.container.AugmentationSequential` to ease the pain of building augmentation pipelines. This API also provides predefined routines
for automating the processing of masks, bounding boxes, and keypoints.
|
|
.. code-block:: python

    import kornia.augmentation as K

    aug = K.AugmentationSequential(
        K.ColorJitter(0.1, 0.1, 0.1, 0.1, p=1.0),
        K.RandomAffine(360, [0.1, 0.1], [0.7, 1.2], [30., 50.], p=1.0),
        K.RandomPerspective(0.5, p=1.0),
        data_keys=["input", "bbox", "keypoints", "mask"],  # declare the expected inputs here
        return_transform=False,
        same_on_batch=False,
    )
    # apply the augmentations
    out_tensors = aug(img_tensor, bbox, keypoints, mask)
    # invert the augmentations
    out_tensors_inv = aug.inverse(*out_tensors)
|
|
.. image:: https://discuss.pytorch.org/uploads/default/optimized/3X/2/4/24bb0f4520f547d3a321440293c1d44921ecadf8_2_690x119.jpeg
|
|
From left to right: the original image, the transformed image, and the inverted image.
|
|
|
|
Best Practices 2: Video Augmentation
++++++++++++++++++++++++++++++++++++
|
|
Video data is a special case of 3D volumetric data that contains both spatial and temporal information, and it is often referred to as 2.5D rather than 3D.
In most applications, augmenting video data requires keeping the temporal dimension static, so that the same augmentations are performed on each frame.
Thus, :py:class:`~kornia.augmentation.container.VideoSequential` can perform this trick in the same manner as ``nn.Sequential``.
Currently, :py:class:`~kornia.augmentation.container.VideoSequential` supports the data formats :math:`(B, C, T, H, W)` and :math:`(B, T, C, H, W)`.
|
|
.. code-block:: python

    import kornia.augmentation as K

    transform = K.VideoSequential(
        K.RandomAffine(360),
        K.RandomGrayscale(p=0.5),
        K.RandomAffine(360, p=0.5),
        data_format="BCTHW",
        same_on_frame=True
    )
|
|
.. image:: https://user-images.githubusercontent.com/17788259/101993516-4625ca80-3c89-11eb-843e-0b87dca6e2b8.png
|
|
|
|
Customization
+++++++++++++
|
|
Kornia augmentation implementations have two additional parameters compared to TorchVision:
``return_transform`` and ``same_on_batch``. The former provides the ability to undo a geometric
transformation, while the latter can be used to control the randomness across a batched transformation.
To enable these behaviours, simply set the corresponding flag to True.
|
|
.. code-block:: python

    import kornia.augmentation as K
    import torch.nn as nn

    class MyAugmentationPipeline(nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.aff = K.RandomAffine(
                360, return_transform=True, same_on_batch=True
            )
            self.jit = K.ColorJitter(0.2, 0.3, 0.2, 0.3, same_on_batch=True)

        def forward(self, input):
            input, transform = self.aff(input)
            input, transform = self.jit((input, transform))
            return input, transform
|
|
Example for semantic segmentation using low-level randomness control:
|
|
.. code-block:: python

    import kornia.augmentation as K
    import torch.nn as nn

    class MyAugmentationPipeline(nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.aff = K.RandomAffine(360)
            self.jit = K.ColorJitter(0.2, 0.3, 0.2, 0.3)

        def forward(self, input, mask):
            assert input.shape == mask.shape, (
                f"Input shape should be consistent with mask shape, "
                f"while got {input.shape}, {mask.shape}"
            )

            # sample the affine parameters once and replay them on the mask
            aff_params = self.aff.forward_parameters(input.shape)
            input = self.aff(input, aff_params)
            mask = self.aff(mask, aff_params)

            jit_params = self.jit.forward_parameters(input.shape)
            input = self.jit(input, jit_params)
            mask = self.jit(mask, jit_params)
            return input, mask
|
|