Ming Li
minglii [AT] umd.edu
I am a second-year Ph.D. student in Computer Science at the University of Maryland, advised by Prof. Tianyi Zhou. I began my academic journey in computer science with a Bachelor of Science from Xi'an Jiaotong University in 2020, followed by a Master of Science from Texas A&M University in 2023, advised by Prof. Ruihong Huang. I also spent two years, starting in 2019, at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, advised by Prof. Yu Qiao.
My research interests broadly lie in the areas of Machine Learning (ML), Natural Language Processing (NLP), and Large Language Models (LLM). More specifically, my recent research centers on Instruction Tuning for LLMs, including: (i) Data Selection (Cherry LLM (IFD), Superfiltering); (ii) Data Synthesis (Reflection-Tuning, Selective Reflection-Tuning); (iii) Data Augmentation (Mosaic-IT, RuleR); (iv) Controllability (DEBATunE, RuleR); (v) Interpretability (Layer_Gradient). I am also interested in and currently exploring Vision-LLM finetuning, RAG, and LLM agents.
If you are looking for a highly motivated intern with a background in computer science and a passion for advancing AI technologies, I would be thrilled to have an opportunity to chat with you!
I am always open to collaboration; feel free to drop me an email about any opportunity!
news
| Date | News |
|---|---|
| Nov 10, 2024 | I will be attending EMNLP 2024 in Miami from Nov. 11 to Nov. 14; feel free to have a chat with me! |
| Oct 31, 2024 | One paper was posted on arXiv: What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective, in which we study layer-wise gradient behaviors when LLMs are finetuned for fast vs. slow thinking. Repo: Layer_Gradient. |
| Jun 22, 2024 | One paper was posted on arXiv: RuleR: Improving LLM Controllability by Rule-based Data Recycling, in which we propose an augmentation method that incorporates multiple rule-based constraints into the original instruction data. Repo: RuleR. |
| May 23, 2024 | One paper was posted on arXiv: Mosaic IT: Enhancing Instruction Tuning with Data Mosaics, in which we propose an augmentation method for instruction tuning that concurrently improves LLM performance and lowers training cost. Repo: Mosaic-IT. |
| May 16, 2024 | Three papers were accepted by ACL 2024! (1) Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning; (2) Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning; (3) Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements (DEBATunE). |
| Mar 13, 2024 | One paper was accepted by NAACL 2024! From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning (Cherry LLM (IFD)). |
| Feb 21, 2024 | I will join Adobe (based in San Jose) as a Research Scientist/Engineer Intern this summer~ |
| Feb 20, 2024 | One survey was posted on arXiv: A Survey on Knowledge Distillation of Large Language Models. Repo: Awesome-Knowledge-Distillation-of-LLMs. |
| Oct 28, 2023 | One paper was accepted by the Instruction Workshop @ NeurIPS 2023! Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning. |
| Oct 07, 2023 | One paper was accepted by EMNLP 2023! PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter. |
| Sep 01, 2023 | I arrived at the University of Maryland, officially beginning my Ph.D. journey ✌️ |
| Jun 01, 2023 | I obtained my Master's in Computer Science from Texas A&M University. |
selected publications
- Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning. ACL, 2024.
- Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning. ACL, 2024.
- Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements. ACL, 2024.
- From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning. NAACL, 2024.
- Reflection-Tuning: Recycling Data for Better Instruction-Tuning. In NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023.