        3DS-VLA: A 3D Spatial-Aware Vision Language Action Model for Robust Multi-Task Manipulation
            Xiaoqi Li, Liang Heng, Jiaming Liu, Yan Shen, Chenyang Gu, Zhuoyang Liu, Hao Chen, Nuowei Han, Renrui Zhang, Hao Tang, Shanghang Zhang, Hao Dong
          
        SR3D: Unleashing Single-view 3D Reconstruction for Transparent and Specular Object Grasping
            Mingxu Zhang*, Xiaoqi Li*, Jiahui Xu, Kaichen Zhou, Hojin Bae, Yan Shen, Chuyan Xiong, Hao Dong
          
        RwoR: Generating Robot Demonstrations from Human Hand Collection for Policy Learning without Robot
            Liang Heng*, Xiaoqi Li*, Shangqing Mao, Jiaming Liu, Ruolin Liu, Jingli Wei, Yu-Kai Wang, Yueru Jia, Chenyang Gu, Rui Zhao, Shanghang Zhang, Hao Dong
          
        CrayonRobo: Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation
            Xiaoqi Li, Lingyun Xu, Mingxu Zhang, Jiaming Liu, Yan Shen, Iaroslav Ponomarenko, Jiahui Xu, Liang Heng, Siyuan Huang, Shanghang Zhang, Hao Dong
          
        3D Weakly Supervised Visual Grounding at Category and Instance Levels
            Xiaoqi Li, Jiaming Liu, Yandong Guo, Hao Dong, Yang Liu
          
        SpatialBot: Precise Spatial Understanding with Vision Language Models
            Wenxiao Cai, Iaroslav Ponomarenko, Jianhao Yuan, Xiaoqi Li, Wankou Yang, Hao Dong, Bo Zhao
          
        NaturalVLM: Leveraging Fine-Grained Natural Language for Affordance-Guided Visual Manipulation
            Ran Xu, Yan Shen, Xiaoqi Li, Ruihai Wu, Hao Dong
          
        Autonomous Interactive Correction MLLM for Robust Robotic Manipulation
            Chuyan Xiong*, Chengyu Shen*, Xiaoqi Li*, Kaichen Zhou, Jiaming Liu, Ruiping Wang, Hao Dong
          
        ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models
            Siyuan Huang, Iaroslav Ponomarenko, Zhengkai Jiang, Xiaoqi Li, Xiaobin Hu, Peng Gao, Hongsheng Li, Hao Dong
          
        ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation
            Xiaoqi Li, Mingxu Zhang, Yiran Geng, Haoran Geng, Yuxing Long, Yan Shen, Renrui Zhang, Jiaming Liu, Hao Dong
          
        RGBManip: Monocular Image-based Robotic Manipulation through Active Object Pose Estimation
            Boshi An, Yiran Geng, Kai Chen, Xiaoqi Li, Qi Dou, Hao Dong
          
        Discuss Before Moving: Visual Language Navigation via Multi-expert Discussions
            Yuxing Long, Xiaoqi Li, Wenzhe Cai, Hao Dong
          
        Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation
            Hongcheng Wang, Andy Guan Hong Chen, Xiaoqi Li, Mingdong Wu, Hao Dong
          
        Efficient Meta-Tuning for Content-aware Neural Video Delivery
            Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang
          
        Adaptive Patch Exiting for Scalable Single Image Super-Resolution
            Shizun Wang, Jiaming Liu, Kaixin Chen, Xiaoqi Li, Ming Lu, Yandong Guo
          
        Overfitting the Data: Compact Neural Video Delivery via Content-aware Feature Modulation
            Jiaming Liu, Ming Lu, Kaixin Chen, Xiaoqi Li, Shizun Wang, Zhaoqing Wang, Enhua Wu, Yurong Chen, Chuang Zhang, Ming Wu
          
Last Updated: Aug 2025