To stand where others cannot, you must endure what others will not.
Hello! I'm Truong-Phuc Nguyen, a CS student specializing in ML, DL, and NLP. My expertise spans NLP tasks such as Information Retrieval, Question Answering, Text Generation, and Summarization for Vietnamese text processing, using both PLMs and LLMs.
Through my research, I have built several NLP demo systems that demonstrate feasibility, including legal question-answering systems, clinical report summarization, and question generation tools for education. Currently, I am seeking advanced learning opportunities in CS/NLP where I can apply my knowledge of language modeling, text processing algorithms, and research experience.
Bachelor of Engineering in Computer Science (Gifted and Talented Programs)
September 2021 - June 2025
NLU Laboratory, Hung Yen University of Technology and Education, Vietnam
(to be updated ...)
Advisor: Assoc. Prof. Minh-Tien Nguyen
Bioinformatics Laboratory, Feng Chia University, Taiwan
Building a multi-agent system that extracts patient information through conversations, generates targeted follow-up questions to gather comprehensive patient data, and creates detailed pre-visit clinical reports. This streamlines the examination process, saving physician time and improving the patient experience during medical consultations. The system is evaluated on three core tasks: Named Entity Recognition (NER), Question Generation (QG), and Summarization. It achieves state-of-the-art results on both the MTS-Dialog and CliniKnote benchmark datasets, demonstrating the advantage of multi-agent architectures over conventional approaches such as in-context learning and instruction tuning in the medical domain.
Advisor: Prof. Fang-Rong Hsu
NLU Laboratory, Hung Yen University of Technology and Education
ViLegalLM comprises one representation model (135M) and two generation models (1.54B, 1.72B) built specifically for Vietnamese legal text through continual pretraining on a new 16GB corpus of high-quality legal documents. ViLegalLM achieves state-of-the-art performance across 10 benchmarks spanning four main tasks: Information Retrieval (IR), Question Answering (QA), Natural Language Inference (NLI), and Syllogism Reasoning. It outperforms 7 state-of-the-art Vietnamese models and establishes strong new baselines for Vietnamese legal text processing. The project also contributes three large-scale synthetic training datasets to address the shortage of high-quality legal training data in Vietnam.
Advisor: Assoc. Prof. Minh-Tien Nguyen
NLU Laboratory, Hung Yen University of Technology and Education, Vietnam
Designed a framework combining multiple bi-encoders through query-specific confidence calculation, dynamic weighting, and ensemble score fusion with a cross-encoder reranker. Achieved 3rd place in the Legal Information Retrieval task (F2-score: 0.8482, 7.51% improvement) and 2nd place in the Legal Question Answering task (97.56% accuracy) in the ALQAC 2025 competition. Paper accepted at the 17th International Conference on Knowledge and Systems Engineering (KSE 2025).
Advisor: Assoc. Prof. Minh-Tien Nguyen
NLU Laboratory, Hung Yen University of Technology and Education, Vietnam
Built a demo legal question-answering system for Vietnamese, integrating information retrieval with answer extraction/generation optimized for the legal domain. IntelliChat outperforms GPT-3.5 and state-of-the-art open-source LLMs (~7B parameters) in both automatic and human evaluations, and is deployed online to enable Vietnamese citizens to independently access and understand legal documents.
Advisor: Assoc. Prof. Minh-Tien Nguyen
NLU Laboratory, Hung Yen University of Technology and Education, Vietnam
Developed a novel fine-tuning framework that leverages Question-Context-Answer relationships to enhance legal information retrieval in low-resource settings. Achieved average improvements of 3.9% and 4.8% in MAP@100. Published in Engineering Applications of Artificial Intelligence (WoS-SCIE, Q1, IF: 8.0).
Advisor: Assoc. Prof. Minh-Tien Nguyen
NLU Laboratory, Hung Yen University of Technology and Education, Vietnam
Pioneered Vietnamese Question-Answer Generation (QAG) research in the education domain by creating ViEduQA, the first comprehensive Vietnamese educational QAG dataset, with 12,618 QA pairs across 319 lessons from 4 high school subjects. Published in SOICT 2024 (Springer CCIS).
Advisor: Assoc. Prof. Minh-Tien Nguyen
SmartChat - Smart AI Assistant for Vietnamese
Led a comprehensive evaluation of large language models for a production chatbot system, focusing on optimizing performance across multiple NLP tasks, including document reranking, question rewriting, and content generation. Conducted systematic benchmarking of state-of-the-art models, including GPT-4o-mini, Amazon Nova Lite, Amazon Nova Micro, and Amazon Nova Pro.
AI Assistant for Vietnam Ministry of Agriculture
Developed a domain-specific chatbot system for question answering over legal document collections and regulatory text corpora. Built text-to-analytics functionality and designed a document chunking mechanism that combines Depth-First Search (DFS) with regular expression patterns.
Per-Title Encoding
Developed and optimized per-title encoding algorithms for video compression on the TV360 streaming platform. Implemented analysis techniques to assess video complexity and dynamically adjust encoding parameters, balancing visual quality against file size.
Video Frame Interpolation
Designed and implemented video frame interpolation systems to enhance motion smoothness for content delivered on the TV360 platform. Developed deep learning-based interpolation algorithms to generate high-quality intermediate frames.
Video Quality Assessment for User Generated Content
Built comprehensive video quality assessment frameworks for evaluating and enhancing user-generated content on the TV360 platform using computer vision and machine learning techniques.
CS19TN, Hung Yen University of Technology and Education
Assisted in teaching the Natural Language Processing course, guiding students through fundamental and advanced NLP concepts and helping with assignments and projects.
UTEHY-NLU Lab
Mentored students in NLP research projects, guiding them through literature review, experimental design, and paper writing.
UTEHY-NLU Lab
Guided students in deep learning fundamentals and applications, covering neural networks, CNNs, RNNs, and Transformers.
UTEHY-NLU Lab
Mentored students in machine learning concepts and practical applications, covering supervised and unsupervised learning algorithms.
July 2025
Top #2 in Legal Question Answering (0.9756 Accuracy) and Top #3 in Legal Information Retrieval (0.8482 F2-Score)
July 2024
Top #5 in the Legal Question Answering task
November 2024
2021-2025
4 Academic Excellence & 8 Talented Program Scholarships - Consistently ranked #1 in the CS program
June 2025
Research project: Research on building a question answering system for Vietnamese legal documents
June 2025
Thesis title: A Study of Vietnamese Legal Question Answering with Pre-trained and Large Language Models
May 2025
Research project: Research on building a question answering system for Vietnamese legal documents
March 2025
Research project: Research on building a question answering system for Vietnamese legal documents
June 2024
Research project: Research on developing a student attendance system using facial recognition and detecting unusual behavior in the classroom
April 2023
February 2022
nguyentruongphuc_12421tn@utehy.edu.vn
Feel free to reach out for collaborations, research opportunities, or just to connect!