Researching state-of-the-art AI methods in NLP and Multimodal Large Language Models at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Starting a PhD at MBZUAI under the guidance of Professor Timothy Baldwin and Dr. Yova Kementchedjhieva, focusing on uncertainty in multimodal learning and vision-language models (VLMs).
I graduated in Computer Science from Ho Chi Minh City University of Technology – Vietnam National University (HCMUT-VNU), completing my degree in an accelerated 3.5 years, and am now a PhD student in Natural Language Processing (NLP) at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI).
Throughout my undergraduate journey under the guidance of Assoc. Prof. Dr. Tho T. Quan, I developed strong programming proficiency in Python, Java, and C++, alongside a solid foundation in Machine Learning, Deep Learning, and Multimodal Analysis. My academic coursework and independent research cultivated my passion for state-of-the-art AI methods, particularly in NLP and Multimodal Large Language Models (MLLMs).
MBZUAI
PhD in NLP at MBZUAI (2025-Present)
BSc in Computer Science at HCMUT-VNU (2021-2024)
Python, C++, Java, Machine Learning, Deep Learning
NLP, Multimodal AI, Large Language Models
Vietnamese (Native), English (IELTS 7.5)
Vision-language models for text and image understanding, interpretation, and alignment.
Methods for quantifying model confidence and reliability, including calibration and uncertainty-aware inference in language and multimodal systems.
Foundation models and methods to make multimodal learning more efficient.
AAAI Press
Introduces a structured hackathon-based framework to teach high school students AI agent design through lectures, guided projects, and starter code. The approach bridges AI literacy and practical agent deployment while fostering collaboration, creativity, and responsible innovation.
Springer
Presents an LLM-powered voice and text irrigation assistant for Vietnamese agriculture. The system reached 82% slang understanding, 90% intent-to-command translation accuracy, and low-latency operation across function calls, database queries, and sensor reads.
Springer
Proposes LumbarCLIP, a multimodal contrastive framework aligning lumbar MRI scans with radiology reports. The model achieves up to 95.00% accuracy and 94.75% F1-score, with ablations showing linear projection heads improve cross-modal alignment.
Springer
Introduces a unified pipeline that converts unstructured data into queryable databases using prompt optimization, LLM extraction, and cloud integration. Across multiple domains, the framework achieved a mean F1-score of 93.6% with GPT-4o mini.
Springer
Presents a VITS-based Vietnamese TTS and voice conversion system for low-resource speaker adaptation. It delivers strong subjective quality, including MOS 4.31 ± 0.158 for comprehensibility and 3.68 ± 0.135 for naturalness using only 5 minutes of extra training data.
ICICCT
Develops a four-module framework for customer feedback analytics and validates it on Vietnamese real-world business data. Claude 3.5 Sonnet achieved 87.35% sentiment classification accuracy and 82.21% department identification accuracy.
SOICT
Introduces TI-JEPA, an energy-based joint embedding pretraining strategy for text-image alignment using cross-attention. The method delivered competitive accuracy and F1 on multimodal sentiment analysis benchmarks. GitHub: nhatkhangcs/tijepa.
AAAI Press
Addresses Vietnamese-to-Bahnar translation in an extreme low-resource setting using transfer learning and targeted augmentation. The approach improves translation quality over baselines and supports preservation and accessibility of the Bahnaric language. GitHub: nhatkhangcs/BARTViBa.
Text-Image Joint-Embedding Predictive Architecture for multimodal fusion. Achieved 9.9/10 in Capstone Project Defense.
AI-powered healthcare application with agentic AI capabilities, integrating Google APIs and Redis for intelligent health management.
Advanced AI system showcasing cutting-edge machine learning and artificial intelligence implementations.
Innovative data generation system for creating synthetic datasets to enhance machine learning model training.
Multimodal framework based on CLIP, adapted for healthcare and in particular for Low Back Pain (LBP) diagnosis.
Specialized Vietnamese tokenizer that breaks each word into 5 components to support NLP tasks.
Multimodal CLIP model for Vietnamese text and image processing, enhancing performance for Vietnamese-specific tasks.
Advanced chatbot leveraging large language models and retrieval-augmented generation for university admissions.
Open-source learning materials for Machine Learning and Natural Language Processing
Comprehensive Natural Language Processing laboratory materials and exercises, revised and enhanced for HCMUT students.
Practical machine learning laboratory exercises and implementations designed for HCMUT Computer Science students.
Featured in news articles, videos, and academic databases
Thanh Niên Newspaper
Featured for graduating early with a near-perfect thesis score from Bach Khoa International Program.
Znews.vn
Featured for developing v7, an AI-powered Vietnamese input method that revolutionizes the typing experience.
OISP HCMUT
Recognized for winning the prestigious AmCham Scholarship 2024.
OISP HCMUT International
Featured for the remarkable journey from barely qualifying to becoming one of the program's top students.
OISP HCMUT
Featured for choosing to study at Bach Khoa over opportunities to study abroad in New Zealand.
BK Quoc Te Facebook
Featured in a video highlighting the achievements of Bach Khoa International Program students.
Thanh Nien Facebook
Featured in social media posts highlighting academic achievements and early graduation.
BK Quoc Te Facebook
Recognized for academic achievements and contributions to the university community.
Ho Chi Minh City, Vietnam
Abu Dhabi, UAE (PhD studies)