Qi Zhu

Table of Contents

Quick Links

Applied Scientist
Bedrock Core Science
AWS AI/ML Services & Infrastructure
Email: qi.zhu.ckc@gmail.com

Bio

Currently, I am an Applied Scientist at AWS Bedrock, working on data-driven optimization of large language models (LLMs). Over the last two years, I contributed to pioneering AI systems for structured data (GraphStorm, GraphRAG) utilizing structured knowledge for applications in retrieval-augmented generation (RAG), graph machine learning, and beyond.

I obtained my Ph.D. in Computer Science from University of Illinois at Urbana-Champaign advised by Prof. Jiawei Han, where I was a member of Data and Information Systems Laboratory (DAIS) and Data Mining Group. Here is my CV (outdated).

Research

My current and past work focuses on the following themes:

  1. Faithful Multi-Modal Optimization for small LLMs – Addressing the reasoning gap and hallucination bottlenecks in small LLMs. We investigate how to leverage real-world usage patterns and data-driven optimization to build highly calibrated, faithful, and reliable multi-modal models without relying on massive compute.
  2. LLMs with Structured Knowledge – Harnessing explicit and implicit data structures to to enhance LLM reasoning capabilities and mitigate hallucinations
  3. Graph Representation Learning – Representing objects in heterogenous text-attributed graph with heterogenous learning, and robust to distribution shift.

I. Faithful Multi-Modal Optimization for small LLMs

Depite being edge computable, small language models (SLMs) often suffer from reasoning gaps and hallucinations, which can significantly impact their performance and reliability in real-world applications. We applied various methods to improve the grounding, faithfulness and reliability of these models including test-time scaling, post-training etc.

  • Tabular Referencing Error: Addressing the pervasive "data referencing errors" (DREs) where compact models miscite, confuse, or omit table values despite parsing the structure correctly. Supported by our recent ACL work, we introduce specialized critic modules optimized via Reinforcement Learning with Verified Reward (RLVR) to significantly enhance the test-time scaling and faithfulness of 7B to 20B parameter models on data-intensive tasks.
    • When LLMs Read Tables Carelessly: Measuring and Reducing Data Referencing Errors, ACL 26 (Oral)
  • Mechanistic Interpretability and Retrieval Dynamics in Long-Context VLMs: Investigating the internal processing bottlenecks of Vision-Language Models handling dense, text-rich image documents. Supported by our recent work introducing VERA, we discover Visual Evidence Retrieval (VER) heads—a sparse, dynamic set of attention heads critical for locating visual cues during high-uncertainty reasoning steps. We leverage these mechanistic insights to design training-free, inference-time interventions that mitigate the visual-textual attention imbalance and significantly boost long-document comprehension.

II. LLMs with Structured Knowledge

We aim to make LLMs more efficient and resilient against hallucinations by harnessing structured knowledge. A key challenge lies in making the language model structure-aware while mitigating performance bottlenecks, such as the lost-in-the-middle phenomenon, To address this, we explore fine-tuning, and pre-training techniques on graph structured data. [SURGeLLM, ACL 2026] [Structured Knowledge for LLMs, KDD 2025]

  • Graph Retrieval Augmented Generation: We propose HYBGRAG, an agentic system for hybrid question answering over semi-structured knowledge bases. Unlike prior RAG systems that handle only textual or relational information, HYBGRAG synergizes both through a retriever bank with adaptive module selection and a critic module that provides corrective feedback for iterative refinement. This structure-aware approach addresses the lost-in-the-middle phenomenon by precisely routing questions to appropriate retrieval modules and self-correcting extraction errors.

III. Graph Representation Learning

My research aims to make graph representation learning adapt to distribution shift and data heterogeneity.

Awards

  • 2020 Amazon AWS Machine Learning Research Award
  • 2018 ACM WWW Best Poster Honorable Mention