My research interests lie at the intersection of Natural Language Processing and Computer Vision. In particular, I am interested in multi-modal learning, language grounding, and low-resource machine learning. Recently I’ve also been exploring large-scale machine learning systems more broadly, as part of my work at Chaldal!
Geospatial Language Model
Sept. 2023 ‑ Present 🟡
Working on a geospatial language model that improves the understanding of geo-entities in natural language. This in turn lets us build an address matching platform specially tuned for house addresses, a task at which platforms like Google Maps fail. It will help delivery agents deliver orders faster and more efficiently.
Collaborators:
- Tejas Viswanath (CTO, Chaldal Ltd.)
- Asif Imitial (SWE, Chaldal Ltd.)
🏷️ Natural Language Processing
LLMs
Retrieval Augmented Generation
3D Particle Picking
Oct. 2022 ‑ Present 🟡 Manuscript in preparation
Worked on developing a semi-supervised framework for 3D particle picking from macromolecular samples that detects particles directly from subtomograms.
Collaborators:
- Ajmain Yasar Ahmed Sahil (BUET)
Supervisors:
- Mostofa Rafid Uddin (Graduate Research Assistant, CMU)
- Dr. Min Xu (Associate Professor, CMU)
🏷️ Computer Vision
Object Detection
Image Processing
Low-Resource Computer Vision Aided PD Recognition
May 2022 ‑ Aug. 2023 🟢 Under review at MICCAI ’24
Developed PULSAR, an automated PD screening tool. It uses a spatio-temporal graph neural network to detect PD from videos. We were also the first to explore positive-unlabeled learning in this setting, addressing the lack of positive labels, which is generally the case in medical data. [Lead Investigator]
Collaborators:
- Md. Saiful Islam (Graduate Research Assistant, University of Rochester)
Supervisors:
- Dr. Mohammad Saifur Rahman (Professor, BUET)
- Dr. Ehsan Hoque (Associate Professor, University of Rochester)
🏷️ Computer Vision
AI for Healthcare
🔗 Project page
Quartet Fiduccia-Mattheyses Revisited for Larger Phylogenetic Studies
Nov. 2022 ‑ April 2023 🟢 Accepted at Bioinformatics (Oxford University Press) Journal!
Our proposed method, QFM-FI, is a faster and improved version of the QFM algorithm that can amalgamate millions of quartets over thousands of taxa into a species tree with high accuracy. It also achieves a speedup of 20,000x compared to its predecessor. I worked on the theoretical analysis of the running time and memory requirements of QFM-FI.
Collaborators:
- Sharmin Akter Mim (BUET)
Supervisors:
- Dr. Mohammad Saifur Rahman (Professor, BUET)
- Dr. Md Shamsuzzoha Bayzid (Associate Professor, BUET)
- Dr. Rezwana Reaz (Assistant Professor, BUET)
🏷️ Bioinformatics
Computational Biology
Other Research and Independent Exploration
Personalized Recommendation System
June 2023 ‑ Present 🟡
Developing an in-house personalized recommender system from the ground up for Chaldal.com, the leading online grocery and delivery platform in Bangladesh, and scaling it to serve more than one million users.
Collaborators:
- Anand Bhaskar (Research Scientist, Meta)
- Rohit Vaz (VP of Engineering, Chaldal Ltd.)
- Asif Imitial (SWE, Chaldal Ltd.)
🏷️ Recommender Systems
Reinforcement Learning
Bangla Plagiarism Detection
Jan. 2023 ‑ Mar. 2023 🟢
We created the first Bangla Plagiarism Dataset (available on Hugging Face 🤗) using a semi-supervised approach as part of our Machine Learning project. Alongside the dataset, we proposed two distinct approaches for detecting plagiarism. The first fine-tunes Bangla BERT, while the second uses sentence embeddings for multi-document plagiarism detection. The software will be available soon. A presentation on the project is available here.
Collaborators:
- Ramisa Alam (BUET)
Supervisors:
- Md. Tareq Mahmood (Assistant Professor, BUET)
🏷️ Natural Language Processing
Plagiarism Detection
🔗 Project page
Occam’s Razor Strikes Again: Revisiting Short-Text Stream Clustering with Latest Sentence Embeddings
May 2021 ‑ May 2022 🟢
During the height of COVID-19, online classes were the new normal. The rise of MOOCs led to a very interesting research question: given a large number of questions already posed by students, how can we detect repetitive questions in real time and unify them into a single question for the instructor? Under the supervision of Dr. Shubhra Kanti Karmaker, I worked on this problem of online short-text clustering. We developed a software tool called One Pass Sentence Embedding Clustering. It efficiently clusters short-text streams using a one-pass algorithm that computes similarity scores from sentence embeddings. It was used as an internal tool to cluster questions in online classes at Auburn.
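The general idea of one-pass clustering over sentence embeddings can be illustrated with a minimal sketch. This is not the tool's actual implementation: it assumes each question has already been converted to an embedding vector by some sentence-embedding model, and the class name `OnePassClusterer`, the cosine threshold of 0.8, and the running-mean centroid update are all illustrative assumptions.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class OnePassClusterer:
    """Hypothetical one-pass clusterer: each embedding is seen exactly once."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # assumed similarity cut-off
        self.centroids: List[List[float]] = []
        self.sizes: List[int] = []

    def add(self, embedding: Sequence[float]) -> int:
        """Assign the embedding to a cluster and return that cluster's id."""
        best_id, best_sim = -1, -1.0
        for cid, centroid in enumerate(self.centroids):
            sim = cosine(embedding, centroid)
            if sim > best_sim:
                best_id, best_sim = cid, sim

        if best_id >= 0 and best_sim >= self.threshold:
            # Similar enough: join the cluster and update its running mean.
            n = self.sizes[best_id]
            old = self.centroids[best_id]
            self.centroids[best_id] = [
                (c * n + x) / (n + 1) for c, x in zip(old, embedding)
            ]
            self.sizes[best_id] = n + 1
            return best_id

        # Otherwise start a new cluster seeded with this embedding.
        self.centroids.append(list(embedding))
        self.sizes.append(1)
        return len(self.centroids) - 1


# Toy 2-D vectors standing in for real sentence embeddings.
clusterer = OnePassClusterer(threshold=0.8)
stream = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
labels = [clusterer.add(v) for v in stream]
print(labels)
```

Because each incoming question is compared only against the current centroids, the cost per question grows with the number of clusters rather than with the number of questions seen so far, which is what makes this style of algorithm suitable for real-time streams.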
Collaborators:
- Minh Smith (Auburn University)
- Ramisa Alam (BUET)
Supervisors:
- Dr. Shubhra Kanti Karmaker (Assistant Professor, Auburn University)
- Dr. Anindya Iqbal (Professor, BUET)
🏷️ Natural Language Processing
Clustering
🔗 Project page
Bangla Sign Language Recognition
Oct. 2022 ‑ Feb. 2023 🟢
Sign language is an essential form of communication for large groups of people in society. Each sign is unique due to variations in hand form, motion profile, and the positioning of the hand, face, and other body components. As a result, visual sign language recognition is a complex research area in computer vision. In this work, we present a new word-level Bangla Sign Language (BdSL) dataset consisting of 611 videos over 40 BdSL words, along with two approaches for classifying the BdSL40 dataset: one based on a 3D Convolutional Neural Network and the other on a novel Graph Neural Network. To the best of our knowledge, this is the first study on word-level BdSL recognition, and the dataset was transcribed from Indian Sign Language (ISL) using the Bangla Sign Language Dictionary (1997). The proposed GNN model achieved an F1 score of 89%. The study highlights the significant lexical and semantic similarity between BdSL, West Bengal Sign Language, and ISL, and the lack of word-level datasets for BdSL in the literature.
Supervised four junior-year undergraduates on this project!
Collaborators:
- H.A.Z. Sameen Shahgir (BUET)
- Khondker Salman Sayeed (BUET)
- Md Toki Tahmid (BUET)
- Tanjeem Azwad Zaman (BUET)
🏷️ Computer Vision
Sign Language Recognition
🔗 Dataset