publications

publication list

Preprints and an automatically indexed view are also available on my Google Scholar page.

Conference

2026

  1. Ming Li*, Xirui Li*, Tianyi Zhou, “Does Socialization Emerge in AI Agent Society? A Case Study of Moltbook”, ACM Conference on AI and Agentic Systems (CAIS), 2026. PDF, CODE

  2. Ruoling Qi, Yirui Liu, Xuaner Wu, Xiangyu Wang, Ming Li, Chen Chen, Jian Chen, Yin Chen, Qizhen Weng, “Swift-SVD: Theoretical Optimality Meets Practical Efficiency in Low-Rank LLM Compression”, Forty-third International Conference on Machine Learning (ICML), 2026. PDF

  3. Ming Li*, Chenrui Fan*, Yize Cheng*, Soheil Feizi, Tianyi Zhou, “Schoenfeld’s Anatomy of Mathematical Reasoning by Language Models”, Annual Meeting of the Association for Computational Linguistics (ACL) Oral, Award Nomination, 2026. PDF, CODE

  4. Ming Li, Yanhong Li, Ziyue Li, Tianyi Zhou, “How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients”, Annual Meeting of the Association for Computational Linguistics (ACL), 2026. PDF, CODE

  5. Ming Li, Pei Chen, Zhenhao Zhang, Tao Yang, Xinyang Zhang, Han Li, Tianyu Cao, Ming Zeng, Zhuofeng Wu, Meng Jiang, Huasheng Li, Lihong Li, Bing Yin, “Mitigating Lost in Multi-turn Conversation via Curriculum RL with Verifiable Accuracy and Abstention Rewards”, Annual Meeting of the Association for Computational Linguistics (ACL), 2026. PDF

  6. Ming Li*, Han Chen*, Yunze Xiao, Jian Chen, Hong Jiao, Tianyi Zhou, “Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction”, Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2026. PDF, CODE

  7. Shijie Zhou, Jihyung Kil, Ming Li, Jiuxiang Gu, Curtis Wigington, Rajiv Jain, Changyou Chen, Ruiyi Zhang, “Unveiling Inherent Visual Grounding in Multimodal LLMs for Text-Rich Images”, Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2026.

  8. Yanhong Li, Ming Li, Karen Livescu, Jiawei Zhou, “On the Predictive Power of Representation Dispersion in Language Models”, The Fourteenth International Conference on Learning Representations (ICLR), 2026. PDF

  9. Zhuochun Li, Yong Zhang, Ming Li, Yuelyu Ji, Yiming Zeng, Ning Cheng, Yun Zhu, Yanmeng Wang, Shaojun Wang, Jing Xiao, Daqing He, “Rethinking LLM-as-a-Judge: Representation-as-a-Judge with Small Language Models via Semantic Capacity Asymmetry”, The Fourteenth International Conference on Learning Representations (ICLR), 2026. PDF

2025

  1. Yijun Liang*, Ming Li*, Chenrui Fan, Ziyue Li, Dang Nguyen, Kwesi Adu Cobbina, Shweta Bhardwaj, Jiuhai Chen, Fuxiao Liu, Tianyi Zhou, “ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness”, The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2025. PDF, CODE

  2. Xiyao Wang, Zhengyuan Yang, Chao Feng, Yongyuan Liang, Yuhang Zhou, Xiaoyu Liu, Ziyi Zang, Ming Li, Chung-Ching Lin, Kevin Lin, Linjie Li, Furong Huang, Lijuan Wang, “ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs”, The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025. PDF, CODE

  3. Dawei Li, Yue Huang, Ming Li, Tianyi Zhou, Xiangliang Zhang, Huan Liu, “Generative Models for Synthetic Data: Transforming Data Mining in the GenAI Era”, The 34th ACM International Conference on Information and Knowledge Management (CIKM), 2025. PDF

  4. Ming Li*, Nan Zhang*, Chenrui Fan*, Hong Jiao, Yanbin Fu, Sydney Peters, Qingshu Xu, Robert Lissitz, Tianyi Zhou, “Understanding the Thinking Process of Reasoning Models: A Perspective from Schoenfeld’s Episode Theory”, The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025. PDF, CODE

  5. Yuhang Zhou, Jing Zhu, Shengyi Qian, Zhuokai Zhao, Xiyao Wang, Xiaoyu Liu, Ming Li, Paiheng Xu, Wei Ai, Furong Huang, “DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data”, The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP) Findings, 2025. PDF, CODE

  6. Chenrui Fan*, Ming Li*, Lichao Sun, Tianyi Zhou, “Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?”, Second Conference on Language Modeling (COLM), 2025. PDF, CODE

  7. Ming Li, Yanhong Li, Tianyi Zhou, “What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective”, The 63rd Annual Meeting of the Association for Computational Linguistics (ACL) Oral, 2025. PDF, CODE

  8. Ming Li, Pei Chen, Chenguang Wang, Hongyu Zhao, Yijun Liang, Yupeng Hou, Fuxiao Liu, Tianyi Zhou, “Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning”, The 63rd Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2025. PDF, CODE

  9. Zhixun Chen*, Ming Li*, Yuxuan Huang, Yali Du, Meng Fang, Tianyi Zhou, “ATLAS: Agent Tuning via Learning Critical Steps”, The 63rd Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2025. PDF

  10. Ming Li*, Han Chen*, Chenguang Wang*, Dang Nguyen, Dianqi Li, Tianyi Zhou, “RuleR: Improving LLM Controllability by Rule-based Data Recycling”, Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025. PDF, CODE

  11. Hongyu Zhao, Ming Li, Lichao Sun, Tianyi Zhou, “BenTo: Benchmark Reduction with In-Context Transferability”, The Thirteenth International Conference on Learning Representations (ICLR), 2025. PDF, CODE

2021 - 2024

  1. Ming Li, Yong Zhang, Zhitao Li, Jiuhai Chen, Lichang Chen, Ning Cheng, Jianzong Wang, Tianyi Zhou, Jing Xiao, “From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning”, Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024. PDF, CODE

  2. Ming Li, Yong Zhang, Shwai He, Zhitao Li, Hongyu Zhao, Jianzong Wang, Ning Cheng, Tianyi Zhou, “Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning”, The 62nd Annual Meeting of the Association for Computational Linguistics (ACL), 2024. PDF, CODE

  3. Ming Li, Lichang Chen, Jiuhai Chen, Shwai He, Jiuxiang Gu, Tianyi Zhou, “Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning”, The 62nd Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2024. PDF, CODE

  4. Ming Li, Jiuhai Chen, Lichang Chen, Tianyi Zhou, “Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements”, The 62nd Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2024. PDF, CODE

  5. Haoyan Yang, Zhitao Li, Yong Zhang, Jianzong Wang, Ning Cheng, Ming Li, Jing Xiao, “PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter”, The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023. PDF

Journal

  1. Ming Li, Hong Jiao, Tianyi Zhou, Nan Zhang, Sydney Peters, Robert W. Lissitz, “Item Difficulty Modeling Using Fine-tuned Small and Large Language Models”, Educational and Psychological Measurement, 2025. PDF

  2. Qitong Wang, Bin Fu, Ming Li, Junjun He, Xi Peng, Yu Qiao, “Region-aware arbitrary-shaped text detection with progressive fusion”, IEEE Transactions on Multimedia, 2022. PDF

  3. Ming Li, Bin Fu, Han Chen, Junjun He, Yu Qiao, “Dual relation network for scene text recognition”, IEEE Transactions on Multimedia, 2022. PDF

  4. Ming Li, Bin Fu, Zhengfu Zhang, Yu Qiao, “Character-aware sampling and rectification for scene text recognition”, IEEE Transactions on Multimedia, 2021. PDF