---
library_name: transformers
license: apache-2.0
tags:
- image-classification
- dinov2
- vision
- tube-classification
- manufacturing
datasets:
- Siddanna/transparent-tube-dataset
base_model:
- facebook/dinov2-base
pipeline_tag: image-classification
---
# Transparent Tube Classifier

A binary image classifier that distinguishes between:
- `transparent_alone` 🧪 – a transparent tube by itself
- `transparent_with_blue` 🧪🔵 – a transparent tube paired with a blue tube
## Model Details

| Property | Value |
|---|---|
| Base Model | `facebook/dinov2-base` (ViT-B/14, 86.6M params) |
| Training Method | Linear probe (frozen backbone + trained classifier head) |
| Training Dataset | `Siddanna/transparent-tube-dataset` |
| Accuracy | 100% on test set |
| Loss | 0.0014 |
| Image Size | 256×256 (DINOv2 default) |
| License | Apache 2.0 |
## Quick Start

### Using a Pipeline (Easiest)

```python
from transformers import pipeline

classifier = pipeline("image-classification", model="Siddanna/transparent-tube-classifier")
result = classifier("your_tube_image.jpg")
print(result)
# [{'label': 'transparent_with_blue', 'score': 0.99}, {'label': 'transparent_alone', 'score': 0.01}]
```
### Manual Inference

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Load model and processor
model = AutoModelForImageClassification.from_pretrained("Siddanna/transparent-tube-classifier")
processor = AutoImageProcessor.from_pretrained("Siddanna/transparent-tube-classifier")

# Load and classify an image
image = Image.open("your_tube_image.jpg")
inputs = processor(image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = logits.argmax(-1).item()
label = model.config.id2label[predicted_class]
confidence = torch.softmax(logits, dim=-1)[0][predicted_class].item()
print(f"Prediction: {label} (confidence: {confidence:.2%})")
```
## Training Details

### Architecture

- Base: DINOv2-base (Vision Transformer B/14), pretrained on LVD-142M (142M curated images)
- Head: Linear classifier (768 → 2)
- Method: Linear probe – the backbone is frozen; only the classification head is trained
- Why DINOv2? DINOv2's global self-attention captures the full image context, which is critical for detecting whether a blue tube is present anywhere in the scene alongside the transparent tube
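The frozen-backbone arrangement can be sketched in plain PyTorch. This is an illustrative reconstruction, not the repo's training script: the small `nn.Linear` below stands in for the real DINOv2 backbone (which would come from `AutoModel.from_pretrained("facebook/dinov2-base")`), and only the 768 → 2 head receives gradients.

```python
import torch
import torch.nn as nn

# Illustrative linear-probe setup: a frozen feature extractor plus a
# trainable 768 -> 2 head, matching DINOv2-base's hidden size and the
# two tube classes.
backbone_dim, num_classes = 768, 2

# Stand-in for the frozen DINOv2 backbone.
backbone = nn.Linear(1024, backbone_dim)
for p in backbone.parameters():
    p.requires_grad = False          # linear probe: backbone stays frozen

head = nn.Linear(backbone_dim, num_classes)  # the only trained parameters

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3, weight_decay=0.01)

x = torch.randn(4, 1024)             # dummy batch of backbone inputs
with torch.no_grad():
    features = backbone(x)           # frozen feature extraction
logits = head(features)              # trainable classification head
print(logits.shape)                  # torch.Size([4, 2])
```

Because the backbone contributes no gradients, each training step only updates the 1,538 parameters of the head, which is why the probe converges in a single epoch.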
### Hyperparameters

- Learning rate: `1e-3` (with cosine schedule)
- Warmup steps: 50
- Batch size: 16
- Weight decay: 0.01
- Training epochs: 4 (converged at epoch 1)
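The optimizer and schedule rows above can be wired up as follows. This is a sketch: `num_training_steps=500` is an assumed placeholder, since the real step count follows from dataset size, batch size 16, and 4 epochs.

```python
import torch
from transformers import get_cosine_schedule_with_warmup

# AdamW with lr 1e-3 and weight decay 0.01, plus a cosine schedule with
# 50 warmup steps, mirroring the hyperparameter list above.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=1e-3, weight_decay=0.01)
scheduler = get_cosine_schedule_with_warmup(
    optimizer, num_warmup_steps=50, num_training_steps=500)

# The learning rate climbs linearly from 0 to 1e-3 over the warmup
# steps, then decays along a cosine curve toward 0.
lrs = []
for _ in range(500):
    optimizer.step()
    lrs.append(scheduler.get_last_lr()[0])
    scheduler.step()
print(max(lrs))
```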
### Data Augmentations

- RandomResizedCrop (scale 0.7–1.0)
- RandomHorizontalFlip
- RandomRotation (±15°)
- ColorJitter (brightness=0.3, contrast=0.3, saturation=0.2, hue=0.05)
### Training Curves
| Epoch | Train Loss | Eval Loss | Eval Accuracy |
|---|---|---|---|
| 1 | 0.032 | 0.019 | 100% |
| 2 | 0.011 | 0.002 | 100% |
| 3 | 0.002 | 0.001 | 100% |
| 4 | 0.004 | 0.010 | 99.5% |
## For Production Use with Real Images

The model is currently trained on synthetic data. For best results with your actual tubes:

### Step 1: Collect Real Photos

Take 50–100+ photos per class of your actual tubes:

```
data/
├── train/
│   ├── transparent_alone/       # photos of the transparent tube alone
│   └── transparent_with_blue/   # photos of transparent + blue tube
└── test/
    ├── transparent_alone/
    └── transparent_with_blue/
```
### Step 2: Re-train

```shell
# Clone the training script

# Option A: linear probe (fast, good with 50+ images/class)
python train.py --data_dir ./data --freeze_backbone --hub_model_id your-username/tube-classifier

# Option B: full fine-tune (better with 200+ images/class)
python train.py --data_dir ./data --learning_rate 5e-5 --hub_model_id your-username/tube-classifier
```
### Tips for Collecting Good Training Data
- Vary backgrounds: different surfaces, lighting conditions
- Vary angles: slightly different camera positions
- Vary distances: close-up and farther away shots
- Include edge cases: partially occluded tubes, different orientations
- Match deployment conditions: use the same camera/environment you'll deploy in
## Demo
Try the model: Transparent Tube Classifier Demo
## Citation

```bibtex
@misc{transparent-tube-classifier,
  title={Transparent Tube Classifier},
  author={Siddanna},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/Siddanna/transparent-tube-classifier}
}
```