Ming Li
minglii [AT] umd.edu

I am a second-year Ph.D. student in Computer Science at the University of Maryland, advised by Prof. Tianyi Zhou. I began my academic journey in computer science with a Bachelor of Science from Xi'an Jiaotong University in 2020, followed by a Master of Science from Texas A&M University in 2023, where I was advised by Prof. Ruihong Huang. Before that, starting in 2019, I spent two years at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, advised by Prof. Yu Qiao.
My research interests broadly lie in Machine Learning (ML), Natural Language Processing (NLP), and Large Language Models (LLMs). More specifically, my recent research focuses on post-training for LLMs, including: (i) Data Selection (Cherry LLM (IFD), Superfiltering); (ii) Data Synthesis (Mosaic-IT, Reflection-Tuning, Selective Reflection-Tuning, RuleR); (iii) Controllability (DEBATunE, RuleR); (iv) Interpretability (Layer_Gradient). I am also interested in, and currently exploring, Vision-LLM finetuning, agents, efficiency, and reasoning.
If you are looking for a highly motivated intern with a background in computer science and a passion for advancing AI technologies, I would be thrilled to have an opportunity to chat with you!
I am always open to collaboration; feel free to drop me an email about any opportunity!
news
Jan 28, 2025 | I will join Microsoft Research (MSR) as a Research Intern this spring semester~ |
Jan 22, 2025 | One paper was accepted by NAACL 2025! RuleR: Improving LLM Controllability by Rule-based Data Recycling |
Jan 22, 2025 | One paper was accepted by ICLR 2025! BenTo: Benchmark Task Reduction with In-Context Transferability |
Oct 31, 2024 | One paper was put on arXiv: What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective, where we study the layer-wise gradient behaviors of LLMs finetuned on fast vs. slow thinking data. Repo: Layer_Gradient. |
May 23, 2024 | One paper was put on arXiv: Mosaic IT: Enhancing Instruction Tuning with Data Mosaics, where we propose an augmentation method for instruction tuning that concurrently improves LLM performance and lowers training costs. Repo: Mosaic-IT. |
May 16, 2024 | Three papers were accepted by ACL 2024! (1) Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning; (2) Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning; (3) Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements (DEBATunE). |
Mar 13, 2024 | One paper was accepted by NAACL 2024! From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning (Cherry LLM (IFD)) |
Feb 21, 2024 | I will join Adobe (based in San Jose) as a Research Scientist/Engineer Intern this Summer~ |
Oct 28, 2023 | One paper was accepted by Instruction Workshop @ NeurIPS 2023! Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning. |
Oct 07, 2023 | One paper was accepted by EMNLP 2023! PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter. |
Sep 01, 2023 | I arrived at the University of Maryland, officially beginning my journey for a Ph.D. ✌️ |
Jun 01, 2023 | I obtained my Master’s in Computer Science at Texas A&M University. |
selected publications
- Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning. In ACL, 2024.
- Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning. In ACL, 2024.
- Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements. In ACL, 2024.
- From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning. In NAACL, 2024.
- Reflection-Tuning: Recycling Data for Better Instruction-Tuning. In NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023.