---
tags:
- ml-intern
---

# Grain Seed Classification Model (Rice Variety Classification)

This repository contains the training script and (eventually) the fine-tuned model for classifying **rice grain varieties** from seed images.

## Dataset

- **Source:** [`nateraw/rice-image-dataset`](https://huggingface.co/datasets/nateraw/rice-image-dataset)
- **Size:** 75,000 RGB images (250×250)
- **Classes (5):**
  1. Arborio
  2. Basmati
  3. Ipsala
  4. Jasmine
  5. Karacadag
- **License:** CC0-1.0

## Model

- **Architecture:** ResNet-18 (`microsoft/resnet-18`), ~11M parameters, lightweight and fast
- **Task:** Multi-class image classification

## How to train

Run the provided script on a GPU (e.g. a10g-large or t4-small via Hugging Face Jobs, or Google Colab):

```bash
pip install transformers datasets torch accelerate evaluate pillow trackio
export HF_MODEL_REPO=chaosbee997/rice-seed-classifier
export HF_TOKEN=your_huggingface_token
python train.py
```

Or submit via Hugging Face Jobs (requires GPU credits):

```bash
huggingface-cli job run \
  --script train.py \
  --hardware a10g-large \
  --timeout 4h \
  --dependencies "transformers,datasets,torch,accelerate,evaluate,pillow,trackio"
```

## Expected results

- Fine-tuning ResNet-18 on this dataset typically yields **> 95% accuracy** within 3-5 epochs.

## Extending to other crops

The same script works for any `datasets.ImageFolder`-style dataset. To add peanut, corn, wheat, etc.:

1. Collect or find an image dataset with a folder-per-class structure.
2. Upload it to the Hugging Face Hub or point `load_dataset` to a local path.
3. Update `MODEL_NAME` if you want a different backbone (e.g. `microsoft/resnet-34`, `google/mobilenet_v2_1.0_224`).
4. Run `train.py`.

## License

Apache-2.0

## Generated by ML Intern

This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.

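- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern

## Usage

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification

model_id = "chaosbee997/rice-seed-classifier"
# ResNet-18 is an image classifier, so it needs an image processor and a
# classification head rather than a tokenizer and a causal LM.
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)
```

This works once the fine-tuned weights have been pushed to the repo. If you trained with a different backbone, `AutoModelForImageClassification` selects the matching class from the checkpoint's config.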
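
After loading, the model's raw logits still have to be mapped back to a variety name. The sketch below shows one way to do that in plain Python. It assumes the label order matches the class list in the Dataset section; the actual order is determined by `train.py` and stored in the trained model's `config.id2label`. `CLASS_NAMES` and `decode_logits` are hypothetical names used only for illustration.

```python
import math

# Assumed label order, copied from the Dataset section. The real mapping
# is whatever the trained model stores in config.id2label.
CLASS_NAMES = ["Arborio", "Basmati", "Ipsala", "Jasmine", "Karacadag"]
id2label = dict(enumerate(CLASS_NAMES))
label2id = {name: idx for idx, name in id2label.items()}


def decode_logits(logits):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(CLASS_NAMES):
        raise ValueError("expected one logit per class")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits for a single image, for illustration only.
label, prob = decode_logits([0.1, 2.3, -1.0, 0.4, 0.0])
print(label, round(prob, 3))
```

With a trained checkpoint, the input would come from the image classification model's output for one image, for example `outputs.logits[0].tolist()`.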