AIJonas/nutrition5k-pretrained-efficientnetv2b0-rgb-plus-depth-carb

computer-vision

carbohydrate-estimation

EfficientNetV2B0 RGB + CNN Depth Carbohydrate Regression

This model predicts the dish-level carbohydrate content of a meal from a pair of overhead images, one RGB and one depth, taken from the Nutrition5K dataset.

Architecture

Dual-input multimodal regression model
One pretrained EfficientNetV2B0 branch for overhead RGB images
One from-scratch CNN branch for overhead depth images
Global average pooling on both branches
Feature fusion through concatenation
Fully connected regression head
Final dense layer with linear activation for carbohydrate prediction
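
The architecture above can be sketched in Keras roughly as follows. This is a minimal illustration, not the author's training code: the image size, the depth-branch filter counts, the single-channel depth input, the hidden layer width, and the names `build_model`, `rgb_weights`, and `fusion` are all assumptions made for the example.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_model(image_size=224, rgb_weights="imagenet"):
    """Hypothetical dual-input regressor following the layer list above."""
    rgb_input = layers.Input(shape=(image_size, image_size, 3), name="rgb_input")
    # Assumption: depth is fed as a single-channel image.
    depth_input = layers.Input(shape=(image_size, image_size, 1), name="depth_input")

    # RGB branch: EfficientNetV2B0 without its classification head.
    backbone = tf.keras.applications.EfficientNetV2B0(
        include_top=False,
        weights=rgb_weights,
        input_shape=(image_size, image_size, 3),
    )
    backbone.trainable = False  # frozen for the initial training stage
    rgb_features = layers.GlobalAveragePooling2D()(backbone(rgb_input, training=False))

    # Depth branch: small CNN trained from scratch (filter counts are assumed).
    x = depth_input
    for filters in (32, 64, 128):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D()(x)
    depth_features = layers.GlobalAveragePooling2D()(x)

    # Fusion by concatenation, then a fully connected regression head.
    fused = layers.Concatenate(name="fusion")([rgb_features, depth_features])
    hidden = layers.Dense(128, activation="relu")(fused)  # width is assumed
    output = layers.Dense(1, activation="linear", name="total_carb")(hidden)

    return tf.keras.Model(inputs=[rgb_input, depth_input], outputs=output)
```

Calling `build_model()` downloads the ImageNet weights on first use; passing `rgb_weights=None` builds the same graph with random initialization, which is convenient for checking shapes offline.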

Backbone setup

The EfficientNetV2B0 RGB branch is initialized with ImageNet-pretrained weights
The RGB backbone is frozen during the initial training stage
The depth branch is trained from scratch
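
The freezing step can be illustrated with a toy stand-in model. The card only states that the backbone is frozen during the initial stage; the later unfreezing stage, the MAE loss, the learning rates, and the helper names `build_toy_model` and `compile_stage` are assumptions for this sketch. A tiny convolutional "backbone" replaces EfficientNetV2B0 so the example runs without downloading weights.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_toy_model():
    # Stand-in for the real network: a tiny "backbone" plus a regression head.
    backbone = tf.keras.Sequential(
        [layers.Conv2D(8, 3, activation="relu"), layers.GlobalAveragePooling2D()],
        name="rgb_backbone",
    )
    rgb_input = layers.Input(shape=(32, 32, 3), name="rgb_input")
    output = layers.Dense(1, activation="linear")(backbone(rgb_input))
    return tf.keras.Model(rgb_input, output), backbone


def compile_stage(model, backbone, train_backbone, learning_rate):
    # Toggle the backbone, then recompile so the change applies to fit().
    backbone.trainable = train_backbone
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate),
        loss="mae",  # assumed loss; the card does not name one
    )
    return model


model, backbone = build_toy_model()

# Initial stage: backbone frozen, only the head is updated.
compile_stage(model, backbone, train_backbone=False, learning_rate=1e-3)
frozen_count = len(model.trainable_weights)

# Hypothetical later stage: backbone unfrozen with a smaller learning rate.
compile_stage(model, backbone, train_backbone=True, learning_rate=1e-5)
unfrozen_count = len(model.trainable_weights)
```

In the frozen stage only the head's kernel and bias are trainable; after unfreezing, the backbone's convolution kernel and bias are added to the trainable set.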

Input modalities

rgb_input: overhead RGB image
depth_input: overhead depth image

Target

total_carb: dish-level carbohydrate content (the regression target)
