publications | Ming Li

Preprints and an automatically indexed view are also available on my Google Scholar page.

Conference

2026

Chenrui Fan, Yijun Liang, Shweta Bhardwaj, Kwesi Cobbina, Ming Li, Tianyi Zhou, “V-REX: Benchmarking Exploratory Visual Reasoning via Chain-of-Questions”, European Conference on Computer Vision (ECCV), 2026. PDF, CODE
Qitong Wang, Yijun Liang, Ming Li, Tianyi Zhou, Christopher Rasmussen, “History-Conditioned Spatio-Temporal Visual Token Pruning for Efficient Vision-Language Navigation”, 2026 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2026. PDF
Ming Li*, Xirui Li*, Tianyi Zhou, “Does Socialization Emerge in AI Agent Society? A Case Study of Moltbook”, ACM Conference on AI and Agentic Systems (CAIS), 2026. PDF, CODE
Ruoling Qi, Yirui Liu, Xuaner Wu, Xiangyu Wang, Ming Li, Chen Chen, Jian Chen, Yin Chen, Qizhen Weng, “Swift-SVD: Theoretical Optimality Meets Practical Efficiency in Low-Rank LLM Compression”, Forty-third International Conference on Machine Learning (ICML), 2026. PDF
Ming Li*, Chenrui Fan*, Yize Cheng*, Soheil Feizi, Tianyi Zhou, “Schoenfeld’s Anatomy of Mathematical Reasoning by Language Models”, Annual Meeting of the Association for Computational Linguistics (ACL) Oral, 2026. PDF, CODE
Ming Li, Yanhong Li, Ziyue Li, Tianyi Zhou, “How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients”, Annual Meeting of the Association for Computational Linguistics (ACL), 2026. PDF, CODE
Ming Li, Pei Chen, Zhenhao Zhang, Tao Yang, Xinyang Zhang, Han Li, Tianyu Cao, Ming Zeng, Zhuofeng Wu, Meng Jiang, Huasheng Li, Lihong Li, Bing Yin, “Mitigating Lost in Multi-turn Conversation via Curriculum RL with Verifiable Accuracy and Abstention Rewards”, Annual Meeting of the Association for Computational Linguistics (ACL), 2026. PDF
Ming Li*, Han Chen*, Yunze Xiao, Jian Chen, Hong Jiao, Tianyi Zhou, “Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction”, Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2026. PDF, CODE
Shijie Zhou, Jihyung Kil, Ming Li, Jiuxiang Gu, Curtis Wigington, Rajiv Jain, Changyou Chen, Ruiyi Zhang, “Unveiling Inherent Visual Grounding in Multimodal LLMs for Text-Rich Images”, Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2026.
Yanhong Li, Ming Li, Karen Livescu, Jiawei Zhou, “On the Predictive Power of Representation Dispersion in Language Models”, The Fourteenth International Conference on Learning Representations (ICLR), 2026. PDF
Zhuochun Li, Yong Zhang, Ming Li, Yuelyu Ji, Yiming Zeng, Ning Cheng, Yun Zhu, Yanmeng Wang, Shaojun Wang, Jing Xiao, Daqing He, “Rethinking LLM-as-a-Judge: Representation-as-a-Judge with Small Language Models via Semantic Capacity Asymmetry”, The Fourteenth International Conference on Learning Representations (ICLR), 2026. PDF

2025

Yijun Liang*, Ming Li*, Chenrui Fan, Ziyue Li, Dang Nguyen, Kwesi Adu Cobbina, Shweta Bhardwaj, Jiuhai Chen, Fuxiao Liu, Tianyi Zhou, “ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness”, The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2025. PDF, CODE
Xiyao Wang, Zhengyuan Yang, Chao Feng, Yongyuan Liang, Yuhang Zhou, Xiaoyu Liu, Ziyi Zang, Ming Li, Chung-Ching Lin, Kevin Lin, Linjie Li, Furong Huang, Lijuan Wang, “ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs”, The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025. PDF, CODE
Dawei Li, Yue Huang, Ming Li, Tianyi Zhou, Xiangliang Zhang, Huan Liu, “Generative Models for Synthetic Data: Transforming Data Mining in the GenAI Era”, The 34th ACM International Conference on Information and Knowledge Management (CIKM), 2025. PDF
Ming Li*, Nan Zhang*, Chenrui Fan*, Hong Jiao, Yanbin Fu, Sydney Peters, Qingshu Xu, Robert Lissitz, Tianyi Zhou, “Understanding the Thinking Process of Reasoning Models: A Perspective from Schoenfeld’s Episode Theory”, The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025. PDF, CODE
Yuhang Zhou, Jing Zhu, Shengyi Qian, Zhuokai Zhao, Xiyao Wang, Xiaoyu Liu, Ming Li, Paiheng Xu, Wei Ai, Furong Huang, “DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data”, The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP) Findings, 2025. PDF, CODE
Chenrui Fan*, Ming Li*, Lichao Sun, Tianyi Zhou, “Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?”, Second Conference on Language Modeling (COLM), 2025. PDF, CODE
Ming Li, Yanhong Li, Tianyi Zhou, “What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective”, The 63rd Annual Meeting of the Association for Computational Linguistics (ACL) Oral, 2025. PDF, CODE
Ming Li, Pei Chen, Chenguang Wang, Hongyu Zhao, Yijun Liang, Yupeng Hou, Fuxiao Liu, Tianyi Zhou, “Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning”, The 63rd Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2025. PDF, CODE
Zhixun Chen*, Ming Li*, Yuxuan Huang, Yali Du, Meng Fang, Tianyi Zhou, “ATLAS: Agent Tuning via Learning Critical Steps”, The 63rd Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2025. PDF
Ming Li*, Han Chen*, Chenguang Wang*, Dang Nguyen, Dianqi Li, Tianyi Zhou, “RuleR: Improving LLM Controllability by Rule-based Data Recycling”, Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025. PDF, CODE
Hongyu Zhao, Ming Li, Lichao Sun, Tianyi Zhou, “BenTo: Benchmark Reduction with In-Context Transferability”, The Thirteenth International Conference on Learning Representations (ICLR), 2025. PDF, CODE

2021 - 2024

Ming Li, Yong Zhang, Zhitao Li, Jiuhai Chen, Lichang Chen, Ning Cheng, Jianzong Wang, Tianyi Zhou, Jing Xiao, “From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning”, Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024. PDF, CODE
Ming Li, Yong Zhang, Shwai He, Zhitao Li, Hongyu Zhao, Jianzong Wang, Ning Cheng, Tianyi Zhou, “Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning”, The 62nd Annual Meeting of the Association for Computational Linguistics (ACL), 2024. PDF, CODE
Ming Li, Lichang Chen, Jiuhai Chen, Shwai He, Jiuxiang Gu, Tianyi Zhou, “Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning”, The 62nd Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2024. PDF, CODE
Ming Li, Jiuhai Chen, Lichang Chen, Tianyi Zhou, “Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements”, The 62nd Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2024. PDF, CODE
Haoyan Yang, Zhitao Li, Yong Zhang, Jianzong Wang, Ning Cheng, Ming Li, Jing Xiao, “PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter”, The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023. PDF

Journal

Ming Li, Hong Jiao, Tianyi Zhou, Nan Zhang, Sydney Peters, Robert W. Lissitz, “Item Difficulty Modeling Using Fine-tuned Small and Large Language Models”, Educational and Psychological Measurement (EPM), 2025. PDF
Qitong Wang, Bin Fu, Ming Li, Junjun He, Xi Peng, Yu Qiao, “Region-aware arbitrary-shaped text detection with progressive fusion”, IEEE Transactions on Multimedia (TMM), 2022. PDF
Ming Li, Bin Fu, Han Chen, Junjun He, Yu Qiao, “Dual relation network for scene text recognition”, IEEE Transactions on Multimedia (TMM), 2022. PDF
Ming Li, Bin Fu, Zhengfu Zhang, Yu Qiao, “Character-aware sampling and rectification for scene text recognition”, IEEE Transactions on Multimedia (TMM), 2021. PDF