Segment unknown scene with unsupervised learning using UNet diffusion features with DiffCut



Zero-Shot Image Segmentation via Recursive Normalized Cut on Diffusion Features
arXiv paper abstract
arXiv PDF paper
Project page

Foundation models have emerged as powerful tools across various domains including language, vision, and multimodal tasks.

While prior works have addressed unsupervised image segmentation, they significantly lag behind supervised models.

… use a diffusion UNet encoder as … vision encoder and introduce DiffCut, an unsupervised zero-shot segmentation method that solely harnesses the output features from the final self-attention block.

… demonstrate that using these diffusion features in a graph-based segmentation algorithm significantly outperforms … state-of-the-art methods on zero-shot segmentation.
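To make the "graph-based" part concrete, here is a minimal sketch of how patch features could be turned into a graph. It assumes each image patch has one feature vector and that edge weights are cosine similarities raised to a power. The function name `cosine_affinity`, the exponent `alpha`, and the toy features are illustrative assumptions, not the authors' code.

```python
import numpy as np

def cosine_affinity(features, alpha=10.0):
    """Build a symmetric affinity matrix from per-patch feature vectors.

    features: array of shape (n_patches, dim).
    alpha: hypothetical sharpening exponent; larger values suppress weak links.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # L2-normalize each row
    sim = np.clip(unit @ unit.T, 0.0, 1.0)         # cosine similarity, negatives dropped
    return sim ** alpha

# Toy stand-in for diffusion features: two patches per "object".
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
W = cosine_affinity(features)
```

Patches from the same object end up strongly connected and patches from different objects weakly connected, which is the structure a graph cut then exploits.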

… leverage a recursive Normalized Cut algorithm that softly regulates the granularity of detected objects and produces … segmentation maps that … capture intricate image details.
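The recursive idea can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the paper's implementation: each split thresholds the second eigenvector of the normalized Laplacian at zero, and recursion stops once the Normalized Cut cost of a split exceeds a threshold. The names `ncut_bipartition`, `recursive_ncut`, and `tau`, and the toy affinity matrix, are hypothetical.

```python
import numpy as np

def ncut_bipartition(W):
    """Split graph W in two; return (boolean mask, Normalized Cut cost)."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.clip(d, 1e-12, None))
    # Symmetric normalized Laplacian: I - D^(-1/2) W D^(-1/2)
    lap = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)            # eigenvalues in ascending order
    fiedler = d_inv_sqrt * vecs[:, 1]        # second-smallest eigenvector
    mask = fiedler > 0
    if mask.all() or not mask.any():
        return mask, np.inf                  # degenerate split: reject
    cut = W[np.ix_(mask, ~mask)].sum()
    cost = cut / d[mask].sum() + cut / d[~mask].sum()
    return mask, cost

def recursive_ncut(W, tau=0.5):
    """Keep splitting segments while the split cost stays at or below tau."""
    labels = np.zeros(len(W), dtype=int)
    stack = [np.arange(len(W))]
    next_label = 1
    while stack:
        idx = stack.pop()
        if len(idx) < 2:
            continue
        mask, cost = ncut_bipartition(W[np.ix_(idx, idx)])
        if cost > tau:
            continue                         # segment is cohesive enough
        labels[idx[mask]] = next_label       # one side gets a fresh label
        next_label += 1
        stack.extend([idx[mask], idx[~mask]])
    return labels

# Toy affinity: blocks A and B are related (0.2), block C is distinct (0.01).
block = np.repeat(np.arange(3), 4)
cross = np.array([[1.0, 0.2, 0.01], [0.2, 1.0, 0.01], [0.01, 0.01, 1.0]])
W = cross[block][:, block]

coarse = recursive_ncut(W, tau=0.1)  # A and B stay merged
fine = recursive_ncut(W, tau=0.5)    # A, B and C are separated
```

In this sketch a single threshold controls granularity: a low `tau` keeps related regions together, while a higher `tau` allows further splits into finer segments.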

… work highlights the remarkably accurate semantic knowledge embedded within diffusion UNet encoders that could then serve as foundation vision encoders for downstream tasks …



Photo by Charles Etoroma on Unsplash



AI News Clips by Morris Lee: News to help your R&D

A computer vision consultant with 37+ years in artificial intelligence and related high-tech fields. An innovator with 66+ patents, ready to help a firm's R&D.