Open Access

Exploring latent feature representations of image pixels via convolutional neural network to enhance food segmentation

Author(s) - Ying Dai

Publication year - 2025

Publication title - IEEE Access

Language(s) - English

Resource type - Magazines

SCImago Journal Rank - 0.587

H-Index - 127

eISSN - 2169-3536

DOI - 10.1109/access.2025.3612465

Subject(s) - aerospace; bioengineering; communication, networking and broadcast technologies; components, circuits, devices and systems; computing and processing; engineered materials, dielectrics and plasmas; engineering profession; fields, waves and electromagnetics; general topics for engineers; geoscience; nuclear engineering; photonics and electrooptics; power, energy and industry applications; robotics and control systems; signal processing and analysis; transportation

For open-vocabulary recognition of ingredients in food images, segmenting the ingredients is a crucial step. This paper proposes a novel approach that explores latent feature representations of image pixels via a convolutional neural network (CNN) to enhance ingredient segmentation. An internal clustering metric based on the silhouette score is defined to evaluate the clustering quality of pixel-level feature representations generated from different feature maps of various CNN backbones. Using this metric, the paper explores the most suitable feature representations and clustering methods for ingredient segmentation. Additionally, principal component (PC)-driven latent feature representations of pixels, derived from concatenations of backbone feature maps, are found to improve clustering quality, resulting in stable segmentation outcomes. Notably, the number of selected eigenvalues can be used as the number of clusters to achieve good segmentation results. The proposed method is validated on widely used public datasets, achieving state-of-the-art performance. Importantly, the proposed segmentation method is unsupervised, and the pixel-level feature representations from the backbones are not fine-tuned on specific datasets. This demonstrates the flexibility, generalizability, and interpretability of the proposed method, while reducing the need for extensive labeled datasets.
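
The pipeline the abstract outlines (pixel-level features, principal-component projection, clustering with the cluster count taken from the number of selected eigenvalues, and a silhouette-based quality check) can be sketched roughly as follows. This is only an illustrative sketch, not the paper's implementation. The synthetic `features` array stands in for concatenated backbone feature maps. The cumulative-variance rule `var_threshold`, the plain k-means clusterer, the guard `max(n_keep, 2)`, and all function names are assumptions introduced here, since the abstract does not specify them.

```python
import numpy as np

def pc_latent_features(features, var_threshold=0.95):
    """Project per-pixel features onto the leading principal components.

    features: (H, W, C) array, a stand-in for concatenated feature maps.
    var_threshold: hypothetical eigenvalue-selection rule (cumulative variance).
    """
    h, w, c = features.shape
    x = features.reshape(-1, c).astype(np.float64)
    x -= x.mean(axis=0)                              # center each channel
    eigvals, eigvecs = np.linalg.eigh(np.cov(x, rowvar=False))
    order = np.argsort(eigvals)[::-1]                # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    n_keep = min(int(np.searchsorted(ratio, var_threshold)) + 1, c)
    return x @ eigvecs[:, :n_keep], n_keep

def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; the clustering method is an assumption."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def silhouette(x, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    ids = np.unique(labels)
    if len(ids) < 2:
        return 0.0
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2))
    scores = []
    for i in range(len(x)):
        same = labels == labels[i]
        if same.sum() == 1:                          # singleton cluster
            scores.append(0.0)
            continue
        a = d[i, same].sum() / (same.sum() - 1)      # mean intra-cluster distance
        b = min(d[i, labels == j].mean() for j in ids if j != labels[i])
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return float(np.mean(scores))

# Synthetic 12x12 "image" with three vertical regions and 6 feature channels.
rng = np.random.default_rng(0)
h, w, c = 12, 12, 6
region = np.zeros((h, w), dtype=int)
region[:, 4:8] = 1
region[:, 8:] = 2
means = rng.normal(size=(3, c)) * 5.0
features = means[region] + 0.1 * rng.normal(size=(h, w, c))

latent, n_keep = pc_latent_features(features)
k = max(n_keep, 2)                                   # guard added here, not from the paper
labels = kmeans(latent, k).reshape(h, w)             # segmentation map
score = silhouette(latent, labels.ravel())
print(n_keep, k, round(score, 3))
```

In practice the `features` array would come from a pretrained CNN backbone rather than synthetic data, and the silhouette value would be compared across candidate feature maps and clustering methods, which is the role the abstract assigns to its internal clustering metric.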
