Overview
3D multimedia analytics has advanced rapidly across autonomous driving, robotic navigation, smart manufacturing, and logistics, where agents grasp, move, and inspect objects using 3D data. Logistics robots, for example, grasp objects by combining RGB recognition with point-cloud pose estimation, while embodied assistants fuse language commands with scene geometry to execute complex tasks.
Moving beyond passive analysis, embodied systems demand active perception and interaction, raising new challenges in sim-to-real transfer, affordance learning, and multimodal decision-making. This shift from passive understanding to active perception positions 3D multimedia analytics as a driver of autonomous systems, dexterous manipulation, and collaborative manufacturing.
This workshop aims to:
- Convene state-of-the-art research in 3D multimedia analysis
- Address emerging challenges in multimodal 3D perception
- Establish benchmarks for both classic and embodied 3D tasks
- Showcase innovations in representation learning and interactive systems
- Demonstrate real-world 3D multimedia applications
- Introduce new datasets spanning static scenes to dynamic interactions
Submit Paper (Link TBD)
Call for Papers
We solicit original research and survey papers on (but not limited to) the following topics:
- Generative Models for 3D Multimedia and 3D Multimedia Synthesis
- Generating 3D Multimedia from Real-world Data
- 3D Multimodal Analysis and Description
- Multimedia Virtual/Augmented Reality
- 3D Multimedia Search and Recommendation
- 3D Multimedia Art, Entertainment and Culture
- Mobile 3D Multimedia
- 3D Shape Estimation and Reconstruction
- 3D Scene Understanding
- 3D Semantic Segmentation
- 3D Object Detection and Tracking
- High-level Representation of 3D Multimedia Data
- 3D Multimedia Data Understanding for Robotics
- Embodied 3D Scene Interaction and Manipulation