Hybrid convolutional vision transformer for extrusion-based 3D food-printing defect classification

Mawardi, Cholid and Buono, Agus and Priandana, Karlisa and Herianto, Herianto (2025) Hybrid convolutional vision transformer for extrusion-based 3D food-printing defect classification. IAES International Journal of Artificial Intelligence (IJ-AI), 14 (4). pp. 3311-3323. ISSN 2252-8938

Available under License Creative Commons Attribution Non-commercial Share Alike.

Official URL: http://ijai.iaescore.com

Abstract

Deep learning is commonly used for remote monitoring of three-dimensional (3D) printing results, including extrusion-based 3D food printing. One of the most widely used deep learning algorithms for defect detection in 3D printing is the convolutional neural network (CNN). However, CNN-based detection requires high computational cost and a large dataset. This research proposes Con4ViT, a hybrid model that combines the strengths of the vision transformer with the inherent feature extraction capabilities of the CNN. Features extracted locally by the CNN are merged with the transformer's global features using four transformer encoder blocks. The proposed model has fewer parameters than other lightweight pre-trained deep learning models such as VGG16, VGG19, EfficientNetB2, InceptionV3, and ResNet50, and is therefore simpler. Simulations were conducted to classify defect and non-defect images obtained from the printing results of a developed extrusion-based 3D food printing device. The model achieved an accuracy of 95.43%, higher than the state-of-the-art techniques VGG16, VGG19, MobileNetV2, EfficientNetB2, InceptionV3, and ResNet50, which achieved accuracies of 77.88%, 86.30%, 82.95%, 90.87%, 84.62%, and 93.83%, respectively. These results show that the proposed Con4ViT model can be used for 3D food printing defect detection with high accuracy.
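
The abstract describes the architecture only at a high level: a CNN extracts local features, and four transformer encoder blocks then relate those features globally before a defect/non-defect decision. The following is a minimal NumPy sketch of that general pattern, not the authors' Con4ViT implementation. The input size, the two-layer convolutional stem, the embedding width, the head count, the pre-norm layout, the ReLU MLP, and the random untrained weights are all assumptions made for illustration; only the count of four encoder blocks and the two output classes come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w, stride):
    """Valid 2D convolution + ReLU. x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    c_out, _, k, _ = w.shape
    h_out = (x.shape[1] - k) // stride + 1
    w_out = (x.shape[2] - k) // stride + 1
    out = np.empty((c_out, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i * stride:i * stride + k, j * stride:j * stride + k]
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return np.maximum(out, 0.0)

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def init_block(d, d_ff, scale=0.02):
    """Random (untrained) weights for one encoder block."""
    shapes = {"wq": (d, d), "wk": (d, d), "wv": (d, d), "wo": (d, d),
              "w1": (d, d_ff), "w2": (d_ff, d)}
    return {name: rng.normal(0.0, scale, shape) for name, shape in shapes.items()}

def encoder_block(tokens, p, heads):
    """Pre-norm encoder block: multi-head self-attention, then an MLP."""
    n, d = tokens.shape
    dh = d // heads
    x = layer_norm(tokens)
    # Project and split into heads: (heads, n, dh)
    q = (x @ p["wq"]).reshape(n, heads, dh).transpose(1, 0, 2)
    k = (x @ p["wk"]).reshape(n, heads, dh).transpose(1, 0, 2)
    v = (x @ p["wv"]).reshape(n, heads, dh).transpose(1, 0, 2)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))  # (heads, n, n)
    ctx = (attn @ v).transpose(1, 0, 2).reshape(n, d)
    tokens = tokens + ctx @ p["wo"]                          # residual 1
    hidden = np.maximum(layer_norm(tokens) @ p["w1"], 0.0)   # ReLU MLP (assumed)
    return tokens + hidden @ p["w2"]                         # residual 2

def hybrid_forward(image, params, heads=4):
    # 1) CNN stem: local feature extraction
    feat = conv2d(image, params["conv1"], stride=2)
    feat = conv2d(feat, params["conv2"], stride=2)
    # 2) Each spatial position of the feature map becomes one token
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T + params["pos"]        # (h*w, c)
    # 3) Transformer encoder blocks: global mixing across all positions
    for block in params["blocks"]:
        tokens = encoder_block(tokens, block, heads)
    # 4) Mean-pool the tokens and classify: defect vs. non-defect
    pooled = layer_norm(tokens).mean(axis=0)
    return softmax(pooled @ params["head"])

# Hypothetical sizes: a 3x32x32 input gives a 7x7 feature map, so 49 tokens of width 32
params = {
    "conv1": rng.normal(0.0, 0.1, (16, 3, 3, 3)),
    "conv2": rng.normal(0.0, 0.1, (32, 16, 3, 3)),
    "pos": rng.normal(0.0, 0.02, (49, 32)),
    "blocks": [init_block(32, 64) for _ in range(4)],        # four encoder blocks
    "head": rng.normal(0.0, 0.02, (32, 2)),
}
probs = hybrid_forward(rng.random((3, 32, 32)), params)
print(probs)  # two class probabilities; meaningless until the weights are trained
```

Using the convolutional feature map as the token sequence, rather than raw image patches, is one common way to build such a hybrid: the stem shortens the sequence the attention layers must process, which is consistent with the abstract's emphasis on a small parameter count.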

Item Type: Article
Uncontrolled Keywords: 3D food printing, Convolutional neural network, Hybrid convolutional, Image classification, Vision transformer
Subjects: 000 - Computers, Information and General Reference > 000 Computer science, knowledge and systems > 000 Computer science, information and general works
600 – Technology (Applied Sciences) > 600 Technology (applied sciences) > 600 Technology
600 – Technology (Applied Sciences) > 600 Technology (applied sciences) > 602 Miscellany of technology and applied sciences
600 – Technology (Applied Sciences) > 620 Engineering and allied sciences > 629 Other branches of engineering
Depositing User: Perpustakaan Polimedia
Date Deposited: 26 Jan 2026 02:35
Last Modified: 30 Mar 2026 02:57
URI: https://repository.polimedia.ac.id/id/eprint/3868
