Enhancing Image Captioning with a Multi-Encoder Ensemble Framework
Abstract
Image captioning has traditionally relied on encoder-decoder architectures such as CNN-LSTM and, more recently, Transformer-based models. Although these methods have shown promise, single architectures frequently fall short, producing captions that are either biased toward dominant patterns or fluent yet semantically shallow. To overcome these limitations, we propose an ensemble framework that pairs Transformer decoders with the complementary strengths of several CNN encoders: ResNet-101, InceptionV3, and EfficientNetB3. During inference, the system generates multiple candidate captions for each image and refines them through hard voting (n-gram frequency) and soft voting (token probability averaging). This design exploits diverse visual representations while retaining the strong contextual modeling of Transformers. Evaluations on the Flickr8k and Flickr30k datasets show that our ensemble consistently outperforms the individual models on the BLEU, METEOR, ROUGE, and CIDEr metrics, producing captions that are not only more accurate but also more coherent and descriptive.
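As a concrete illustration of the two inference-time fusion schemes named above, the minimal sketch below implements soft voting (averaging per-token probabilities across encoder branches at each decoding step) and hard voting (scoring candidate captions by how often their n-grams recur across all candidates). The `CaptionModel` stub, special-token IDs, and greedy decoding loop are hypothetical stand-ins for illustration, not the paper's released implementation.

```python
# Hedged sketch of the abstract's two fusion strategies. "CaptionModel" is a
# stand-in for one CNN-encoder + Transformer-decoder branch; real branches
# would use ResNet-101, InceptionV3, and EfficientNetB3 encoders.
from collections import Counter
from typing import List

import torch
import torch.nn.functional as F

BOS, EOS, VOCAB = 1, 2, 1000  # assumed special tokens and vocabulary size


class CaptionModel(torch.nn.Module):
    """Toy stand-in: maps image features plus a token prefix to
    next-token logits. A real branch would attend over CNN features
    with a Transformer decoder."""

    def __init__(self, vocab_size: int = VOCAB):
        super().__init__()
        self.proj = torch.nn.Linear(64, vocab_size)

    def forward(self, image: torch.Tensor, prefix: torch.Tensor) -> torch.Tensor:
        # Placeholder scoring; pool spatial dims if a feature map is given.
        feats = image.mean(dim=(-1, -2)) if image.dim() > 2 else image
        return self.proj(feats)


@torch.no_grad()
def soft_vote_decode(models, image: torch.Tensor, max_len: int = 20) -> List[int]:
    """Soft voting: average per-token probability distributions across
    branches at every decoding step, then take the argmax (greedy) token."""
    tokens = [BOS]
    for _ in range(max_len):
        prefix = torch.tensor([tokens])
        probs = torch.stack(
            [F.softmax(m(image, prefix), dim=-1) for m in models]
        ).mean(dim=0)                      # (1, vocab): averaged distribution
        nxt = int(probs.argmax(dim=-1).item())
        tokens.append(nxt)
        if nxt == EOS:
            break
    return tokens


def ngrams(seq: List[int], n: int) -> List[tuple]:
    return [tuple(seq[i : i + n]) for i in range(len(seq) - n + 1)]


def hard_vote(candidates: List[List[int]], n: int = 2) -> List[int]:
    """Hard voting: score each candidate caption by the corpus-wide
    frequency of its n-grams and keep the highest-scoring candidate."""
    counts = Counter(g for cand in candidates for g in ngrams(cand, n))
    return max(candidates, key=lambda c: sum(counts[g] for g in ngrams(c, n)))
```

In use, one would instantiate three branches (e.g., `models = [CaptionModel() for _ in range(3)]`), decode each image with `soft_vote_decode`, or sample several candidates per branch and keep the `hard_vote` winner. Averaging distributions before the argmax is what lets a weaker branch veto an implausible token that any single model might commit to.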
Keywords
Publication Details
- Type of Publication:
- Conference Name: 3rd International Conference on Big Data, IoT and Machine Learning (BIM 2025)
- Date of Conference: 25/09/2025
- Venue: Dhaka International University, Bangladesh
- Organizer: Dhaka International University, Bangladesh Computer Society