A joint research team from Zhejiang University, ByteDance Seed, and Beijing Jiaotong University has introduced SpatialTree, a novel framework accepted at CVPR 2026 that systematically redefines how multimodal large language models (MLLMs) handle spatial intelligence. While today's MLLMs can describe images and understand video, true spatial understanding — judging distance, estimating size, understanding multi-view relationships, and planning nav…