BRRD-Net: A Transformer-Based Framework for Region-Specific Romanized Bangla Dialect Detection Using Pretrained Embeddings and Prototype Learning

Students & Supervisors

Student Authors
Hasin Almas Sifat
Bachelor of Science in Computer Science & Engineering, FST
Koushik Biswas Arko
Bachelor of Science in Computer Science & Engineering, FST
Soumodip Madhu
Bachelor of Science in Computer Science & Engineering, FST
Shariar Hasan Sifat
Bachelor of Science in Computer Science & Engineering, FST
Supervisors
Sharfuddin Mahmood
Assistant Professor, Special Assistant [osa], FST

Abstract

This study introduces BRRD-Net (Bangla Region-based Romanized Dialect-Net), a novel Transformer-based framework for automatically identifying regional dialects in Romanized Bangla (Banglish) text. The framework combines multilingual contextual embeddings from XLM-RoBERTa, parameter-efficient fine-tuning via Low-Rank Adaptation (LoRA), and a prototypical classification layer to capture subtle phonological, lexical, and orthographic differences among five prominent Bengali dialects. To make the model robust to noisy user-generated text, we developed a systematic preprocessing pipeline with noise-aware data augmentation, including phonetic normalization, character-level perturbation, and code-mixing with English words. Evaluations on the BODD dataset show that BRRD-Net achieves 81.60% accuracy and a macro-F1 of 0.816, outperforming strong transformer baselines such as mBERT, RoBERTa, and DistilBERT. Analysis of the confusion matrix shows distinct separation between classes, while also highlighting lexically overlapping dialect pairs as a remaining challenge. BRRD-Net provides an interpretable and scalable framework for dialect identification in low-resource Romanized languages.
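To illustrate the idea behind a prototypical classification layer, the following is a minimal sketch and not the authors' implementation. It assumes squared Euclidean distance and mean-pooled class prototypes, as in standard Prototypical Networks; the abstract does not state which distance BRRD-Net uses. The toy 2-D vectors stand in for XLM-RoBERTa sentence embeddings, and the labels `dialect_a` and `dialect_b` are hypothetical placeholders.

```python
import math
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Average the embedding vectors of each class into one prototype per class."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance (an assumed choice of metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_proba(query, prototypes):
    """Softmax over negative distances from the query to each prototype."""
    logits = {
        label: -squared_distance(query, proto)
        for label, proto in prototypes.items()
    }
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(z - top) for label, z in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def predict(query, prototypes):
    """Return the label of the closest prototype."""
    probs = predict_proba(query, prototypes)
    return max(probs, key=probs.get)


# Toy data: hypothetical stand-ins for sentence embeddings and dialect labels.
train_vecs = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.0, 4.2]]
train_labels = ["dialect_a", "dialect_a", "dialect_b", "dialect_b"]

prototypes = class_prototypes(train_vecs, train_labels)
print(predict([0.1, 0.3], prototypes))
```

Because each class is summarized by a single point in embedding space, a prediction can be inspected by looking at its distance to every prototype, which is one plausible reading of the interpretability mentioned in the abstract.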

Keywords

Romanized Bangla, Banglish, Dialect Identification, Transformers, XLM-RoBERTa, Low-Rank Adaptation (LoRA), Prototypical Networks, Data Augmentation

Publication Details

  • Type of Publication:
  • Conference Name: International Conference on Electrical, Computer & Telecommunication Engineering (ICECTE 2026)
  • Date of Conference: 29/01/2026 - 29/01/2026
  • Venue: RUET, Rajshahi
  • Organizer: Faculty of ECE, RUET