About Me

I am currently a second-year Master’s student in Computer Science at Northeastern University, focusing on Retrieval-Augmented Generation. Google Scholar:

I am doing research internships at NEUIR Lab under the guidance of Associate Professor Zhenghao Liu and at THUNLP under the supervision of Yukun Yan.

🔎 Research Interests

My research focuses on agentic systems that integrate external knowledge, efficient reasoning, and scalable infrastructure. These span:

👀 I am looking for a Ph.D. position starting in Fall 2027 and would love to explore potential collaborations. Let’s connect!

🔥 News

  • 2026.04: 🎉 Our paper EigentSearch-Q+ was accepted to ACM CAIS 2026 Demos!
  • 2025.08: 🎉 We released UltraRAG 2.0, a low-code framework for building complex RAG systems!
  • 2025.05: 🎉 Our paper RankCoT was accepted to ACL 2025!

📝 Publications

* indicates equal contribution, and † indicates corresponding author.

arXiv Preprint

When Knowledge Is Not Free: Cost-Aware Evidence Selection in Retrieval-Augmented Generation

Mingyan Wu*, Han Yang*, Omer Ben-Porat, Yftah Ziser†

📃Paper | 📄PDF | GitHub stars

  • This work introduces cost-aware RAG, a setting where retrieved evidence is assigned access-cost tiers and systems must answer under an explicit evidence-access budget. We instantiate this setting by augmenting MS MARCO v2.1 with access-friction tiers and evaluate budgeted evidence selection across general-domain and domain-specific QA benchmarks.
arXiv Preprint

Reasoning Compression with Mixed-Policy Distillation

Han Yang*, Mingyan Wu*, Bailan He, Zeyu Cao, Sikuan Yan, Kevin Qinghong Lin, Zifeng Ding*†

📃Paper | 📄PDF

  • This work proposes Mixed-Policy Distillation (MPD), a reasoning compression framework that transfers concise reasoning behavior from a larger teacher to a smaller student by distilling teacher-compressed student trajectories. This preserves student-policy exploration while injecting teacher-guided compression.
ACM CAIS 2026 Demos

EigentSearch-Q+: Enhancing Deep Research Agents with Structured Reasoning Tools

Boer Zhang, Mingyan Wu, Dongzhuoran Zhou, Yuqicheng Zhu, Wendong Fan, Puzhen Zhang, Zifeng Ding, Guohao Li, Yuan He†

📃Paper | 📄PDF | GitHub stars

  • This work introduces Q+, a set of query and evidence processing tools that make web search more deliberate by guiding query planning, monitoring search progress, and extracting evidence from long web snapshots.
ACL 2025

RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts

Mingyan Wu, Zhenghao Liu†, Yukun Yan†, Xinze Li, Shi Yu, Zheni Zeng, Yu Gu, Ge Yu

📃Paper | 📄PDF | GitHub stars

  • This work leverages the strengths of both ranking and summarization to effectively refine the knowledge from retrieval results, thereby aiding LLMs in generating more accurate responses.
arXiv Preprint

Finding What Matters: Anchoring Context Knowledge with Evolving Indices for Iterative Retrieval

Mingyan Wu*, Zhenghao Liu*†, Xinze Li, Yuqing Lan, Yukun Yan†, Shuo Wang, Cheng Yang, Minghe Yu, Zheni Zeng, Maosong Sun

📃Paper | 📄PDF | GitHub stars

  • This work proposes a Knowledge Anchoring framework for iterative retrieval that anchors key knowledge within the retrieved content, guiding LLMs to locate the key information during iterative retrieval and answer generation.

📖 Education

  • 2024.09 - now, M.S. School of Computer Science and Engineering, Northeastern University
  • 2020.09 - 2024.07, B.S. School of Computer Science, Yangtze University

💻 Internships

🎼 Hobbies

  • Dancing 💃: I’ve been dancing for about nine years. My favorite styles are jazz and hip-hop, and I used to be a part-time dance teacher.
  • Guitar 🎸: I am a beginner at guitar and usually play folk guitar.
  • Swimming 🏊: I’ve recently gotten into swimming and would like to learn more about it.

    If you share these interests, I would be glad to connect and grow together 📞.