
BAMBI - UAV Wildlife Detection Model

The BAMBI project uses camera drones together with artificial intelligence to automatically monitor wildlife.

Dataset

Between April 2022 and March 2025, we assembled a comprehensive airborne wildlife dataset comprising RGB and, most notably, thermal video data. The dataset encompasses over 400 recorded flights, each capturing a diverse array of mammalian species. The videos have been undistorted using the estimated camera intrinsic parameters, and each frame is meticulously labeled, providing rich ground-truth annotations for further analysis.

A substantial portion of the dataset is dedicated to red deer (Cervus elaphus) and wild boar (Sus scrofa), which together account for the majority of annotated instances. Fallow deer (Dama dama) represent the third most frequently labeled species, while roe deer (Capreolus capreolus) comprise the fourth largest group. In addition to these dominant classes, the dataset includes a range of other mammals, such as chamois (Rupicapra rupicapra), Alpine ibex (Capra ibex), and wolves (Canis lupus), recorded both in diverse game enclosures and in animal parks with near-natural enclosures.

Building upon this foundation, we curated a specialized object detection subset from 225 of our videos, covering three ecologically significant species in Austria -- red deer, wild boar, and roe deer. Since the use case does not require species classification, all three are summarized under a single "animal" label. This subset serves as the basis for training an object detection model and consists of 19,252 thermal video frames, separated into a train set with 15,730 images (80%), a validation set with 1,696 (~10%), and a test set with 1,826 (~10%). The split is made at the level of individual videos and flight locations to avoid biases related to environmental or flight characteristics.
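A video-level split like the one described above can be sketched as follows. The function and variable names here (e.g. `split_by_group`) are illustrative only and not part of the BAMBI tooling; the point is that whole videos, not individual frames, are assigned to a split.

```python
import random

def split_by_group(frame_to_video, fractions=(0.8, 0.1, 0.1), seed=42):
    """Assign whole videos (groups) to train/val/test so that frames from
    one recording never appear in more than one split."""
    videos = sorted(set(frame_to_video.values()))
    rng = random.Random(seed)
    rng.shuffle(videos)
    n_train = int(len(videos) * fractions[0])
    n_val = int(len(videos) * fractions[1])
    assignment = {}
    for i, video in enumerate(videos):
        if i < n_train:
            assignment[video] = "train"
        elif i < n_train + n_val:
            assignment[video] = "val"
        else:
            assignment[video] = "test"
    # Map every frame to the split of its parent video
    return {frame: assignment[video] for frame, video in frame_to_video.items()}

# Toy example: 6 frames from 3 videos
frames = {"f1": "vidA", "f2": "vidA", "f3": "vidB",
          "f4": "vidB", "f5": "vidC", "f6": "vidC"}
splits = split_by_group(frames)
# Frames of the same video always land in the same split
assert splits["f1"] == splits["f2"]
```

In practice one would additionally group by flight location, as the dataset does, but the mechanism is the same.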

In addition, we derived a tailored version for the task of individual animal re-identification (Re-ID), enabling more advanced, longitudinal monitoring of wildlife. It consists of 2,175 individual animals (1,725 in the train set; 225 in the test set, which is split into query and gallery examples), each captured from multiple views.

This dataset is hosted on Zenodo.

Model

We trained a wildlife-specific thermal Re-ID model based on the Omni-Scale Network architecture. The model was trained on the aforementioned thermal data subset tailored for re-identification tasks, enabling it to capture domain-specific appearance characteristics of wildlife in thermal imagery. The Re-ID model achieves a Rank-1 accuracy of 94.2%, Rank-5 of 97.0%, Rank-10 of 98.3%, and Rank-20 of 98.7%, demonstrating strong discriminative capability in challenging field conditions, provided that major parts of the specific animal are visible.
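The Rank-k figures above follow the standard cumulative matching characteristic (CMC) protocol: a query counts as a hit at rank k if any of its k most similar gallery embeddings shares its identity. A minimal sketch with synthetic identities and embeddings (not the BAMBI evaluation code):

```python
import numpy as np

def cmc_rank_k(query_feats, query_ids, gallery_feats, gallery_ids, k):
    """Fraction of queries whose correct identity appears among the
    k most similar gallery embeddings (cosine similarity)."""
    # L2-normalize so the dot product equals cosine similarity
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = q @ g.T                                # [num_query, num_gallery]
    top_k = np.argsort(-sims, axis=1)[:, :k]      # indices of the k best matches
    hits = [qid in gallery_ids[idx] for qid, idx in zip(query_ids, top_k)]
    return float(np.mean(hits))

# Synthetic example: two identities with well-separated embeddings
gallery_feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
gallery_ids = np.array([0, 0, 1])
query_feats = np.array([[0.95, 0.05], [0.1, 0.9]])
query_ids = np.array([0, 1])
print(cmc_rank_k(query_feats, query_ids, gallery_feats, gallery_ids, k=1))  # 1.0
```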


The following script loads the model, extracts an embedding for a query image, and optionally compares it against a second image:

```python
import argparse
import os

import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms

# --------------------------------------------------
# Configuration
# --------------------------------------------------
IMAGE_SIZE = (128, 128)
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

# --------------------------------------------------
# Image preprocessing (must match training!)
# --------------------------------------------------
transform = transforms.Compose([
    transforms.Resize(IMAGE_SIZE),
    transforms.ToTensor(),
    transforms.Normalize(mean=MEAN, std=STD),
])

# --------------------------------------------------
# Load ReID model
# --------------------------------------------------
def load_model(model_path, device):
    """Loads a trained Omni-Scale ReID model (.pt)."""
    # The checkpoint stores the full model object, so weights_only must be
    # disabled (it defaults to True as of PyTorch 2.6).
    model = torch.load(model_path, map_location=device, weights_only=False)
    model.to(device)
    model.eval()
    return model

# --------------------------------------------------
# Extract embedding
# --------------------------------------------------
def extract_embedding(model, image_path, device):
    """Extracts an L2-normalized ReID embedding from an image."""
    if not os.path.exists(image_path):
        raise FileNotFoundError(image_path)

    img = Image.open(image_path).convert("RGB")
    img = transform(img)
    img = img.unsqueeze(0).to(device)  # [1, 3, 128, 128]

    with torch.no_grad():
        feat = model(img)

    # Flatten and L2-normalize (standard ReID practice)
    feat = feat.squeeze(0)
    feat = F.normalize(feat, dim=0)
    return feat.cpu()

# --------------------------------------------------
# Cosine similarity
# --------------------------------------------------
def cosine_sim(feat1, feat2):
    return F.cosine_similarity(feat1.unsqueeze(0), feat2.unsqueeze(0)).item()

# --------------------------------------------------
# Main
# --------------------------------------------------
def main():
    parser = argparse.ArgumentParser("Omni-Scale ReID Inference")
    parser.add_argument("--model", type=str, required=True, help="Path to .pt model")
    parser.add_argument("--img1", type=str, required=True, help="Query image")
    parser.add_argument("--img2", type=str, default=None, help="Gallery image (optional)")
    args = parser.parse_args()

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Using device: {device}")

    # Load model
    model = load_model(args.model, device)

    # Extract embedding(s)
    feat1 = extract_embedding(model, args.img1, device)
    print(f"Embedding shape: {feat1.shape}")

    if args.img2:
        feat2 = extract_embedding(model, args.img2, device)
        sim = cosine_sim(feat1, feat2)
        print(f"Cosine similarity: {sim:.4f}")

if __name__ == "__main__":
    main()
```
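For a quick sanity check of the cosine-similarity convention used in the script, no model weights are needed; identical embeddings score 1.0 and orthogonal ones 0.0:

```python
import torch
import torch.nn.functional as F

def cosine_sim(feat1, feat2):
    # Same helper as in the script above
    return F.cosine_similarity(feat1.unsqueeze(0), feat2.unsqueeze(0)).item()

a = torch.tensor([1.0, 0.0, 0.0])
b = torch.tensor([0.0, 1.0, 0.0])
print(round(cosine_sim(a, a), 4))  # 1.0 (identical embeddings)
print(round(cosine_sim(a, b), 4))  # 0.0 (orthogonal embeddings)
```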

License

This model is released under the AGPL-3.0 license.
