---
library_name: transformers
license: apache-2.0
tags:
- image-classification
- dinov2
- vision
- tube-classification
- manufacturing
datasets:
- Siddanna/transparent-tube-dataset
base_model:
- facebook/dinov2-base
pipeline_tag: image-classification
---
# Transparent Tube Classifier

A binary image classifier that distinguishes between:

- **transparent_alone** 🧪 – a transparent tube by itself
- **transparent_with_blue** 🧪🔵 – a transparent tube paired with a blue tube
## Model Details

| Property | Value |
|---|---|
| **Base Model** | [facebook/dinov2-base](https://huggingface.co/facebook/dinov2-base) (ViT-B/14, 86.6M params) |
| **Training Method** | Linear probe (frozen backbone + trained classifier head) |
| **Training Dataset** | [Siddanna/transparent-tube-dataset](https://huggingface.co/datasets/Siddanna/transparent-tube-dataset) |
| **Accuracy** | **100%** on test set |
| **Loss** | 0.0014 |
| **Image Size** | 256×256 |
| **License** | Apache 2.0 |
## Quick Start

### Using the Pipeline (Easiest)

```python
from transformers import pipeline

classifier = pipeline("image-classification", model="Siddanna/transparent-tube-classifier")
result = classifier("your_tube_image.jpg")
print(result)
# [{'label': 'transparent_with_blue', 'score': 0.99}, {'label': 'transparent_alone', 'score': 0.01}]
```
### Manual Inference

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Load model and processor
model = AutoModelForImageClassification.from_pretrained("Siddanna/transparent-tube-classifier")
processor = AutoImageProcessor.from_pretrained("Siddanna/transparent-tube-classifier")

# Load and classify an image (convert to RGB in case of RGBA/grayscale input)
image = Image.open("your_tube_image.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = logits.argmax(-1).item()
label = model.config.id2label[predicted_class]
confidence = torch.softmax(logits, dim=-1)[0][predicted_class].item()
print(f"Prediction: {label} (confidence: {confidence:.2%})")
```
## Training Details

### Architecture

- **Base**: DINOv2-base (Vision Transformer B/14), pretrained on LVD-142M (142M curated images)
- **Head**: Linear classifier (768 → 2)
- **Method**: Linear probe – the backbone is frozen; only the classification head is trained
- **Why DINOv2?** DINOv2's global self-attention captures full-image context, which is critical for detecting whether a blue tube is present anywhere in the scene alongside the transparent tube
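The linear-probe recipe above can be sketched in a few lines of PyTorch. The snippet uses a tiny stand-in backbone in place of DINOv2 so it runs anywhere; a real training script would load `facebook/dinov2-base` and freeze it the same way:

```python
import torch.nn as nn

# Stand-in for the pretrained DINOv2-base encoder (pooled output dim = 768).
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 16 * 16, 768))

# Linear classification head: 768 -> 2 classes, as in this model.
head = nn.Linear(768, 2)

# Linear probe: freeze every backbone parameter so only the head trains.
for p in backbone.parameters():
    p.requires_grad = False

model = nn.Sequential(backbone, head)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # 768 * 2 weights + 2 biases = 1538 trainable parameters
```

The same pattern applies to the real backbone: iterate over its parameters, set `requires_grad = False`, and pass only the head's parameters to the optimizer.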
### Hyperparameters

- Learning rate: `1e-3` (with cosine schedule)
- Warmup steps: 50
- Batch size: 16
- Weight decay: 0.01
- Training epochs: 4 (converged at epoch 1)
### Data Augmentations

- RandomResizedCrop (scale 0.7–1.0)
- RandomHorizontalFlip
- RandomRotation (±15°)
- ColorJitter (brightness=0.3, contrast=0.3, saturation=0.2, hue=0.05)
### Training Curves

| Epoch | Train Loss | Eval Loss | Eval Accuracy |
|---|---|---|---|
| 1 | 0.032 | 0.019 | **100%** |
| 2 | 0.011 | 0.002 | **100%** |
| 3 | 0.002 | 0.001 | **100%** |
| 4 | 0.004 | 0.010 | 99.5% |
## For Production Use with Real Images

The model is currently trained on **synthetic data**. For best results with your actual tubes:

### Step 1: Collect Real Photos

Take 50–100+ photos per class of your actual tubes:

```
data/
├── train/
│   ├── transparent_alone/       # Photos of the transparent tube alone
│   └── transparent_with_blue/   # Photos of transparent + blue tube
└── test/
    ├── transparent_alone/
    └── transparent_with_blue/
```
### Step 2: Re-train

```bash
# Clone the training script, then:

# Option A: Linear probe (fast, works well with 50+ images per class)
python train.py --data_dir ./data --freeze_backbone --hub_model_id your-username/tube-classifier

# Option B: Full fine-tune (better with 200+ images per class)
python train.py --data_dir ./data --learning_rate 5e-5 --hub_model_id your-username/tube-classifier
```
### Tips for Collecting Good Training Data

- **Vary backgrounds**: different surfaces and lighting conditions
- **Vary angles**: slightly different camera positions
- **Vary distances**: close-up and farther-away shots
- **Include edge cases**: partially occluded tubes, different orientations
- **Match deployment conditions**: use the same camera/environment you'll deploy in
## Demo

Try the model: [**Transparent Tube Classifier Demo**](https://huggingface.co/spaces/Siddanna/transparent-tube-classifier-demo)
## Citation

```bibtex
@misc{transparent-tube-classifier,
  title={Transparent Tube Classifier},
  author={Siddanna},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/Siddanna/transparent-tube-classifier}
}
```